AI Models

356 models, free and paid. Updated: 14 hours ago

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4...

by |Oct 2025 |131K context |$0.0800/M input |$0.3500/M output
131K tokens

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation. This natively multimodal model features superior instruction following, text...

by |Oct 2025 |400K context |$2.50/M input |$2.00/M output
400K tokens

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance...

by |Oct 2025 |200K context |$1.00/M input |$5.00/M output
200K tokens

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and...

by |Oct 2025 |256K context |$0.1170/M input |$1.37/M output
256K tokens

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

by |Oct 2025 |256K context |$0.0800/M input |$0.5000/M output
256K tokens

[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state-of-the-art image generation capabilities. It offers major improvements in reasoning, code quality, and user experience while incorporating GPT Image 1's superior instruction following,...

by |Oct 2025 |400K context |$10.00/M input |$10.00/M output
400K tokens

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.

by |Oct 2025 |200K context |$10.00/M input |$40.00/M output
200K tokens

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.

by |Oct 2025 |200K context |$2.00/M input |$8.00/M output
200K tokens

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

by |Oct 2025 |131K context |$0.1000/M input |$0.4000/M output
131K tokens

ERNIE-4.5-21B-A3B-Thinking is Baidu's upgraded lightweight MoE model, refined to boost reasoning depth and quality for top-tier performance in logical puzzles, math, science, coding, text generation, and expert-level academic benchmarks.

by |Oct 2025 |131K context |$0.0700/M input |$0.2800/M output
131K tokens

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state-of-the-art image generation model with contextual understanding. It is capable of image generation,...

by |Oct 2025 |33K context |$0.3000/M input |$2.50/M output

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks. It excels...

by |Oct 2025 |131K context |$0.1300/M input |$1.56/M output
131K tokens

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...

by |Oct 2025 |262K context |$0.1300/M input |$0.5200/M output
262K tokens

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and...

by |Oct 2025 |400K context |$15.00/M input |$120.00/M output
400K tokens

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...

by |Sep 2025 |203K context |$0.4300/M input |$1.74/M output
203K tokens

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-bench Verified, with...

by |Sep 2025 |1M context |$3.00/M input |$15.00/M output
1M tokens

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

by |Sep 2025 |164K context |$0.2700/M input |$0.4100/M output
164K tokens

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

by |Sep 2025 |131K context |$0.3000/M input |$0.5000/M output
131K tokens

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and others into your files at...

by |Sep 2025 |256K context |$0.8500/M input |$1.25/M output
256K tokens

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

by |Sep 2025 |1M context |$0.1000/M input |$0.4000/M output
1M tokens

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STEM and math....

by |Sep 2025 |131K context |$0.2600/M input |$2.60/M output
131K tokens

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table...

by |Sep 2025 |262K context |$0.2000/M input |$0.8800/M output
262K tokens

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version. It...

by |Sep 2025 |262K context |$0.7800/M input |$3.90/M output
262K tokens

Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...

by |Sep 2025 |1M context |$0.6500/M input |$3.25/M output
1M tokens

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....

by |Sep 2025 |400K context |$1.25/M input |$10.00/M output
400K tokens

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, further optimizing the model's...

by |Sep 2025 |164K context |$0.2700/M input |$0.9500/M output
164K tokens

Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion total parameters activating only 3 billion per token. It's optimized for long-horizon, deep information-seeking tasks...

by |Sep 2025 |131K context |$0.0900/M input |$0.4500/M output
131K tokens

Qwen3 Coder Flash is Alibaba's fast and cost-efficient version of their proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autonomous programming via tool calling...

by |Sep 2025 |1M context |$0.1950/M input |$0.9750/M output
1M tokens

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems: math proofs, code synthesis/debugging, logic, and agentic...

by |Sep 2025 |262K context |$0.0975/M input |$0.7800/M output
262K tokens

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...

by |Sep 2025 |262K context |$0.0900/M input |$1.10/M output
262K tokens

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...

by |Sep 2025 |262K context |Free input |Free output
262K tokens

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1-million-token context hybrid reasoning model with a balanced combination of performance, speed, and cost.

by |Sep 2025 |1M context |$0.2600/M input |$0.7800/M output
1M tokens

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1-million-token context hybrid reasoning model with a balanced combination of performance, speed, and cost.

by |Sep 2025 |1M context |$0.2600/M input |$0.7800/M output
1M tokens

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and...

by |Sep 2025 |128K context |Free input |Free output
128K tokens

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and...

by |Sep 2025 |131K context |$0.0400/M input |$0.1600/M output
131K tokens

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32...

by |Sep 2025 |262K context |$0.6000/M input |$2.50/M output
262K tokens

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking mode,” where internal reasoning traces are separated...

by |Aug 2025 |131K context |$0.0800/M input |$0.4000/M output
131K tokens

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...

by |Aug 2025 |131K context |$0.1300/M input |$0.4000/M output
131K tokens

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...

by |Aug 2025 |131K context |$1.00/M input |$3.00/M output
131K tokens

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context...

by |Aug 2025 |164K context |$0.2100/M input |$0.7900/M output
164K tokens

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs...

by |Aug 2025 |128K context |$2.50/M input |$10.00/M output
128K tokens

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances...

by |Aug 2025 |131K context |$0.4000/M input |$2.00/M output
131K tokens

A sophisticated text-based Mixture-of-Experts (MoE) model featuring 21B total parameters with 3B activated per token, delivering exceptional multimodal understanding and generation through heterogeneous MoE structures and modality-isolated routing. Supporting an...

by |Aug 2025 |131K context |$0.0700/M input |$0.2800/M output
131K tokens

A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with 3B activated per token, delivering exceptional text and vision understanding through its innovative heterogeneous MoE structure with modality-isolated routing....

by |Aug 2025 |131K context |$0.1400/M input |$0.5600/M output
131K tokens

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding,...

by |Aug 2025 |66K context |$0.6000/M input |$1.80/M output
66K tokens

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context...

by |Aug 2025 |256K context |$2.00/M input |$8.00/M output
256K tokens

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

by |Aug 2025 |128K context |$1.25/M input |$10.00/M output
128K tokens

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy...

by |Aug 2025 |400K context |$1.25/M input |$10.00/M output
400K tokens

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost....

by |Aug 2025 |400K context |$0.2500/M input |$2.00/M output
400K tokens

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger...

by |Aug 2025 |400K context |$0.0500/M input |$0.4000/M output
400K tokens
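
The listings quote prices in USD per million tokens, with input and output billed separately. As a rough sketch of how those two figures might combine into the cost of a single request, the snippet below assumes billing is linear in token count and ignores any extra charges (for example, the additional 'web_search' tool cost noted for the deep research models). The function name `request_cost` and the token counts are hypothetical, chosen only for illustration; the prices are copied from the GPT-5-Nano and GPT-5 Pro entries.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: str, output_price_per_m: str) -> Decimal:
    """Estimate the USD cost of one request from per-million-token prices.

    Assumes linear per-token billing with no minimums or tool surcharges.
    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m) / MILLION
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m) / MILLION
    return input_cost + output_cost


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
nano_cost = request_cost(10_000, 2_000, "0.0500", "0.4000")  # GPT-5-Nano listing
pro_cost = request_cost(10_000, 2_000, "15.00", "120.00")    # GPT-5 Pro listing

print(f"GPT-5-Nano: ${nano_cost}")
print(f"GPT-5 Pro:  ${pro_cost}")
```

Because output tokens are priced several times higher than input tokens in most entries, the output side often dominates the total even when the prompt is much longer than the reply.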