AI Models

340 models | Free & Paid | Updated: 5 hours ago

OpenAI: GPT-5.3-Codex

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...

by openai | Feb 2026 | 400K context | $1.75/M input | $14.00/M output

AionLabs: Aion-2.0

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging....

by aion-labs | Feb 2026 | 131K context | $0.8000/M input | $1.60/M output

Google: Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

by google | Feb 2026 | 1M context | $2.00/M input | $12.00/M output

Anthropic: Claude Sonnet 4.6

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...

by anthropic | Feb 2026 | 1M context | $3.00/M input | $15.00/M output

Qwen: Qwen3.5 Plus 2026-02-15

The Plus models in the Qwen3.5 native vision-language series are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of...

by qwen | Feb 2026 | 1M context | $0.2600/M input | $1.56/M output

Qwen: Qwen3.5 397B A17B

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...

by qwen | Feb 2026 | 256K context | $0.3850/M input | $2.45/M output

MiniMax: MiniMax M2.5

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...

by minimax | Feb 2026 | 205K context | $0.1200/M input | $0.4800/M output

Z.ai: GLM 5

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading...

by z-ai | Feb 2026 | 203K context | $0.6000/M input | $1.92/M output

Qwen: Qwen3 Max Thinking

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...

by qwen | Feb 2026 | 262K context | $0.7800/M input | $3.90/M output

Anthropic: Claude Opus 4.6

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective...

by anthropic | Feb 2026 | 1M context | $5.00/M input | $25.00/M output

Qwen: Qwen3 Coder Next

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per...

by qwen | Feb 2026 | 262K context | $0.1100/M input | $0.8000/M output

Free Models Router

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that...

by openrouter | Feb 2026 | 200K context | Free input | Free output

StepFun: Step 3.5 Flash

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token....

by stepfun | Jan 2026 | 262K context | $0.1000/M input | $0.3000/M output

MoonshotAI: Kimi K2.5

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed...

by moonshotai | Jan 2026 | 262K context | $0.3750/M input | $2.03/M output

Upstage: Solar Pro 3

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized...

by upstage | Jan 2026 | 128K context | $0.1500/M input | $0.6000/M output

MiniMax: MiniMax M2-her

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...

by minimax | Jan 2026 | 66K context | $0.3000/M input | $1.20/M output

Writer: Palmyra X5

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million...

by writer | Jan 2026 | 1M context | $0.6000/M input | $6.00/M output

LiquidAI: LFM2.5-1.2B-Thinking (free)

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is...

by liquid | Jan 2026 | 33K context | Free input | Free output

LiquidAI: LFM2.5-1.2B-Instruct (free)

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.

by liquid | Jan 2026 | 33K context | Free input | Free output

OpenAI: GPT Audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced...

by openai | Jan 2026 | 128K context | $2.50/M input | $10.00/M output

OpenAI: GPT Audio Mini

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million...

by openai | Jan 2026 | 128K context | $0.6000/M input | $2.40/M output

Z.ai: GLM 4.7 Flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

by z-ai | Jan 2026 | 203K context | $0.0600/M input | $0.4000/M output

OpenAI: GPT-5.2-Codex

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....

by openai | Jan 2026 | 400K context | $1.75/M input | $14.00/M output

ByteDance Seed: Seed 1.6 Flash

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of...

by bytedance-seed | Dec 2025 | 262K context | $0.0750/M input | $0.3000/M output

ByteDance Seed: Seed 1.6

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.

by bytedance-seed | Dec 2025 | 262K context | $0.2500/M input | $2.00/M output

MiniMax: MiniMax M2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...

by minimax | Dec 2025 | 205K context | $0.3000/M input | $1.20/M output

Z.ai: GLM 4.7

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while...

by z-ai | Dec 2025 | 203K context | $0.4000/M input | $1.75/M output

Google: Gemini 3 Flash Preview

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level reasoning and tool...

by google | Dec 2025 | 1M context | $0.5000/M input | $3.00/M output

NVIDIA: Nemotron 3 Nano 30B A3B (free)

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model with the highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

by nvidia | Dec 2025 | 256K context | Free input | Free output

NVIDIA: Nemotron 3 Nano 30B A3B

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model with the highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

by nvidia | Dec 2025 | 262K context | $0.0500/M input | $0.2000/M output

OpenAI: GPT-5.2 Chat

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...

by openai | Dec 2025 | 128K context | $1.75/M input | $14.00/M output

OpenAI: GPT-5.2 Pro

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning,...

by openai | Dec 2025 | 400K context | $21.00/M input | $168.00/M output

OpenAI: GPT-5.2

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically, responding quickly...

by openai | Dec 2025 | 400K context | $1.75/M input | $14.00/M output

Mistral: Devstral 2 2512

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring...

by mistralai | Dec 2025 | 262K context | $0.4000/M input | $2.00/M output

Relace: Relace Search

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return files relevant to the user request. In contrast to RAG, relace-search performs agentic...

by relace | Dec 2025 | 256K context | $1.00/M input | $3.00/M output

Z.ai: GLM 4.6V

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts...

by z-ai | Dec 2025 | 131K context | $0.3000/M input | $0.9000/M output

Body Builder (beta)

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example:...

by openrouter | Dec 2025 | 128K context | Free input | Free output

OpenAI: GPT-5.1-Codex-Max

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained on agentic...

by openai | Dec 2025 | 400K context | $1.25/M input | $10.00/M output

Amazon: Nova 2 Lite

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing...

by amazon | Dec 2025 | 1M context | $0.3000/M input | $2.50/M output

Mistral: Ministral 3 14B 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language...

by mistralai | Dec 2025 | 262K context | $0.2000/M input | $0.2000/M output

Mistral: Ministral 3 8B 2512

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

by mistralai | Dec 2025 | 262K context | $0.1500/M input | $0.1500/M output

Mistral: Ministral 3 3B 2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

by mistralai | Dec 2025 | 131K context | $0.1000/M input | $0.1000/M output

Mistral: Mistral Large 3 2512

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and released under the Apache 2.0 license.

by mistralai | Dec 2025 | 262K context | $0.5000/M input | $1.50/M output

Arcee AI: Trinity Mini

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function...

by arcee-ai | Dec 2025 | 131K context | $0.0450/M input | $0.1500/M output

DeepSeek: DeepSeek V3.2

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

by deepseek | Dec 2025 | 131K context | $0.2288/M input | $0.3432/M output

Anthropic: Claude Opus 4.5

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and...

by anthropic | Nov 2025 | 200K context | $5.00/M input | $25.00/M output

AllenAI: Olmo 3 32B Think

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and...

by allenai | Nov 2025 | 66K context | $0.1500/M input | $0.5000/M output

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and...

by google | Nov 2025 | 66K context | $2.00/M input | $12.00/M output

Deep Cogito: Cogito v2.1 671B

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching the performance of frontier closed and open models. This model is trained using self-play with reinforcement learning...

by deepcogito | Nov 2025 | 128K context | $1.25/M input | $1.25/M output

OpenAI: GPT-5.1

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. It uses adaptive reasoning...

by openai | Nov 2025 | 400K context | $1.25/M input | $10.00/M output
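
Prices in this listing are quoted in US dollars per million tokens, with separate rates for input and output. As a rough illustration of how those rates translate into the cost of a single request, here is a minimal Python sketch. The function names, the `PRICES` table, and the token counts are hypothetical and chosen for illustration; only the two price pairs are copied from the entries above. It assumes flat per-token billing and ignores anything the listing does not state, such as caching discounts or tiered pricing.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Prices copied from the listing (USD per million tokens): (input, output).
# The table itself is a hypothetical helper, not part of any API.
PRICES = {
    "GPT-5.1": ("1.25", "10.00"),
    "DeepSeek V3.2": ("0.2288", "0.3432"),
}


def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one request under flat per-token billing."""
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m)
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m)
    return (input_cost + output_cost) / MILLION


def estimate(model, input_tokens, output_tokens):
    """Look up a model's listed prices and estimate the cost of one request."""
    input_price, output_price = PRICES[model]
    return request_cost(input_tokens, output_tokens, input_price, output_price)


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    for name in PRICES:
        print(name, estimate(name, 100_000, 2_000))
```

`Decimal` is used instead of floats so that four-decimal prices such as $0.2288 are not subject to binary rounding when multiplied by large token counts.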
