AI Models

358 models | Free & Paid | Updated: 27 minutes ago

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...

May 2026 | 256K context | $1.00/M input | $2.00/M output

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

May 2026 | 1M context | $1.50/M input | $9.00/M output

Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

May 2026 | 1M context | $30.00/M input | $150.00/M output
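
The 6x premium can be checked against the prices listed on this page: the standard Opus 4.7 entry shows $5.00/M input and $25.00/M output, while this fast-mode entry shows $30.00/M and $150.00/M. Below is a minimal Python sketch of the per-request arithmetic. The `request_cost` helper and the 200K-input / 10K-output workload are hypothetical, chosen only for illustration; actual billing may include factors this sketch ignores.

```python
from decimal import Decimal


def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one request, given prices per million tokens."""
    million = Decimal(1_000_000)
    total = (Decimal(input_tokens) * Decimal(input_price_per_m)
             + Decimal(output_tokens) * Decimal(output_price_per_m))
    return total / million


# Prices copied from the listing (USD per million tokens).
OPUS_47 = {"input": "5.00", "output": "25.00"}
OPUS_47_FAST = {"input": "30.00", "output": "150.00"}

# Hypothetical workload (an assumption, not from the listing).
INPUT_TOKENS = 200_000
OUTPUT_TOKENS = 10_000

standard = request_cost(INPUT_TOKENS, OUTPUT_TOKENS, OPUS_47["input"], OPUS_47["output"])
fast = request_cost(INPUT_TOKENS, OUTPUT_TOKENS, OPUS_47_FAST["input"], OPUS_47_FAST["output"])

print(f"standard: ${standard}, fast: ${fast}, ratio: {fast / standard}x")
```

Because both the input and output prices scale by the same factor, the ratio stays at 6 regardless of the input/output token mix.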

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding...

May 2026 | 33K context | $0.1500/M input | $1.50/M output

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency. It is optimized for coding agents, tool...

May 2026 | 262K context | $0.0750/M input | $0.6250/M output

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...

May 2026 | 1M context | $0.2500/M input | $1.50/M output

CoBuddy is a code generation model from Baidu, optimized for coding tasks and AI Agent workflows. It features high inference throughput and low end-to-end latency, with native support for tool...

May 2026 | 131K context | Free input | Free output

GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always resolves to the latest Instant chat model used in ChatGPT. As OpenAI rolls out new Instant model updates...

May 2026 | 400K context | $5.00/M input | $30.00/M output

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual...

April 2026 | 1M context | $1.25/M input | $2.50/M output

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...

April 2026 | 131K context | $0.0500/M input | $0.1000/M output

Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with text output, and is designed for agentic workflows, coding, and complex...

April 2026 | 262K context | $1.50/M input | $7.50/M output

Owl Alpha is a high-performance foundation model designed for agentic workloads. It natively supports tool use and long-context tasks, with strong performance in code generation, automated workflows, and complex instruction execution....

April 2026 | 1M context | Free input | Free output

NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems. It accepts text, image, video, and...

April 2026 | 256K context | Free input | Free output

Laguna XS.2 is the second-generation model in the XS size class from [Poolside](https://poolside.ai), their efficient coding agent series. It combines tool calling and reasoning capabilities with a compact footprint, offering...

April 2026 | 131K context | Free input | Free output

Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai), optimized for complex software engineering tasks. Designed for agentic coding workflows, it supports tool calling and reasoning, with a 128K...

April 2026 | 131K context | Free input | Free output

This model always redirects to the latest model in the Anthropic Claude Haiku family.

April 2026 | 200K context | $1.00/M input | $5.00/M output

This model always redirects to the latest model in the OpenAI GPT Mini family.

April 2026 | 400K context | $0.7500/M input | $4.50/M output

This model always redirects to the latest model in the Google Gemini Pro family.

April 2026 | 1M context | $2.00/M input | $12.00/M output

This model always redirects to the latest model in the MoonshotAI Kimi family.

April 2026 | 262K context | $0.7300/M input | $3.49/M output

This model always redirects to the latest model in the Google Gemini Flash family.

April 2026 | 1M context | $1.50/M input | $9.00/M output

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

April 2026 | 1M context | $3.00/M input | $15.00/M output

This model always redirects to the latest model in the OpenAI GPT family.

April 2026 | 1.1M context | $5.00/M input | $30.00/M output

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces text output, with a 1M token context window. This...

April 2026 | 1M context | $0.3000/M input | $1.80/M output

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window. Tiered pricing kicks in...

April 2026 | 1M context | $0.1875/M input | $1.13/M output

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token. It uses a hybrid sparse mixture-of-experts architecture combining Gated...

April 2026 | 262K context | $0.1490/M input | $1.00/M output

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters. It is optimized for agentic coding, tool use, and...

April 2026 | 262K context | $1.04/M input | $6.24/M output

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities — accepting text, image, and video inputs...

April 2026 | 262K context | $0.3200/M input | $3.20/M output

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

April 2026 | 1.1M context | $30.00/M input | $180.00/M output

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...

April 2026 | 1.1M context | $5.00/M input | $30.00/M output

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

April 2026 | 1M context | $0.4350/M input | $0.8700/M output

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

April 2026 | 1M context | $0.1120/M input | $0.2240/M output

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

April 2026 | 1M context | Free input | Free output

Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast execution and high efficiency at scale. It uses a “fast...

April 2026 | 262K context | $0.0750/M input | $0.6250/M output

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...

April 2026 | 262K context | $0.0660/M input | $0.2600/M output

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....

April 2026 | 1M context | $1.00/M input | $3.00/M output

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

April 2026 | 1M context | $0.4000/M input | $2.00/M output

[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...

April 2026 | 272K context | $8.00/M input | $15.00/M output

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency....

April 2026 | 262K context | $0.0100/M input | $0.0300/M output

This model always redirects to the latest model in the Claude Opus family.

April 2026 | 1M context | $5.00/M input | $25.00/M output

The Pareto Router maintains a tiered shortlist of strong coding models, ranked by [Artificial Analysis](https://artificialanalysis.ai/) coding percentiles. Set `min_coding_score` between 0 and 1 on the [pareto-router plugin](https://openrouter.ai/docs/guides/routing/routers/pareto-router#the-min_coding_score-parameter) to control how...

April 2026 | 2M context | Free input | Free output

Qianfan-OCR-Fast is a domain-specific multimodal large model purpose-built for OCR. By leveraging specialized OCR training data while preserving versatile multimodal intelligence, it provides a powerful performance upgrade over Qianfan-OCR.

April 2026 | 66K context | $0.6800/M input | $2.81/M output

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and...

April 2026 | 262K context | $0.7300/M input | $3.49/M output

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on...

April 2026 | 1M context | $5.00/M input | $25.00/M output

Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

April 2026 | 1M context | $30.00/M input | $150.00/M output

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on...

April 2026 | 203K context | Free input | Free output

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...

April 2026 | 262K context | $0.0600/M input | $0.3300/M output

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...

April 2026 | 262K context | Free input | Free output

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...

April 2026 | 262K context | $0.1200/M input | $0.3700/M output

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...

April 2026 | 262K context | Free input | Free output

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...

April 2026 | 1M context | $0.3250/M input | $1.95/M output