AI Models

340 models 무료 & Paid Cập nhật: 7 hours trước

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context...

~에 의해 |8월 2025 |256K context |$2.00/M input |$8.00/M output
256K tokens

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

~에 의해 |8월 2025 |128K context |$1.25/M input |$10.00/M output
128K tokens

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy...

~에 의해 |8월 2025 |400K context |$1.25/M input |$10.00/M output
400K tokens

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost....

~에 의해 |8월 2025 |400K context |$0.2500/M input |$2.00/M output
400K tokens

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger...

~에 의해 |8월 2025 |400K context |$0.0500/M input |$0.4000/M output
400K tokens

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized...

~에 의해 |8월 2025 |131K context |Miễn phí input |Miễn phí output
131K tokens

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized...

~에 의해 |8월 2025 |131K context |$0.0300/M input |$0.1500/M output
131K tokens

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for...

~에 의해 |8월 2025 |131K context |$0.0290/M input |$0.1400/M output
131K tokens

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for...

~에 의해 |8월 2025 |131K context |Miễn phí input |Miễn phí output
131K tokens

직장 마감 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains...

~에 의해 |8월 2025 |200K context |$15.00/M input |$75.00/M output
200K tokens

Mistral's cutting-edge language model for coding released end of July 2025. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middle (FIM), code correction and test generation. [Blog Post](https://mistral.ai/news/codestral-25-08)

~에 의해 |8월 2025 |256K context |$0.3000/M input |$0.9000/M output
256K tokens

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the...

~에 의해 |Jul 2025 |160K context |$0.0700/M input |$0.2700/M output
160K tokens

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference. It operates in non-thinking mode and is designed for high-quality instruction following, multilingual understanding, and...

~에 의해 |Jul 2025 |131K context |$0.0482/M input |$0.1931/M output
131K tokens

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly...

~에 의해 |Jul 2025 |131K context |$0.6000/M input |$2.20/M output
131K tokens

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...

~에 의해 |Jul 2025 |131K context |$0.1300/M input |$0.8500/M output
131K tokens

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144...

~에 의해 |Jul 2025 |262K context |$0.1495/M input |$1.50/M output
262K tokens

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, 도구 사용, and long-context reasoning over...

~에 의해 |Jul 2025 |1M context |Miễn phí input |Miễn phí output
1M tokens

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, 도구 사용, and long-context reasoning over...

~에 의해 |Jul 2025 |1M context |$0.2200/M input |$1.80/M output
1M tokens

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Built by ByteDance, it builds upon the UI-TARS framework with reinforcement...

~에 의해 |Jul 2025 |128K context |$0.1000/M input |$0.2000/M output
128K tokens

쌍둥이자리 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 가족, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

~에 의해 |Jul 2025 |1M context |$0.1000/M input |$0.4000/M output
1M tokens

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...

~에 의해 |Jul 2025 |262K context |$0.0900/M input |$0.1000/M output
262K tokens

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you...

~에 의해 |Jul 2025 |131K context |$0.8500/M input |$3.40/M output
131K tokens

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for...

~에 의해 |Jul 2025 |131K context |$0.5700/M input |$2.30/M output
131K tokens

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving...

~에 의해 |Jul 2025 |33K context |Miễn phí input |Miễn phí output

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark...

~에 의해 |Jul 2025 |131K context |$0.1400/M input |$0.5700/M output
131K tokens

Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transformations. The model requires the prompt to be in the following format: {instruction} {initial_code}...

~에 의해 |Jul 2025 |262K context |$0.9000/M input |$1.90/M output
262K tokens

Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The model requires the prompt to be in the following format: {instruction} {initial_code} {edit_snippet}...

~에 의해 |Jul 2025 |82K context |$0.8000/M input |$1.20/M output
82K tokens

ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 시리즈, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data...

~에 의해 |6 월 2025 |131K context |$0.4200/M input |$1.25/M output
131K tokens

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, 버전 3.2 significantly improves accuracy on...

~에 의해 |6 월 2025 |128K context |$0.0750/M input |$0.2000/M output
128K tokens

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom "lightning attention" mechanism, allowing it...

~에 의해 |6 월 2025 |1M context |$0.4000/M input |$2.20/M output
1M tokens

쌍둥이자리 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, 코딩, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...

~에 의해 |6 월 2025 |1M context |$0.3000/M input |$2.50/M output
1M tokens

쌍둥이자리 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, 코딩, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

~에 의해 |6 월 2025 |1M context |$1.25/M input |$10.00/M output
1M tokens

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently...

~에 의해 |6 월 2025 |200K context |$20.00/M input |$80.00/M output
200K tokens

쌍둥이자리 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, 코딩, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

~에 의해 |6 월 2025 |1M context |$1.25/M input |$10.00/M output
1M tokens

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1) Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active...

~에 의해 |5월 2025 |164K context |$0.5000/M input |$2.15/M output
164K tokens

직장 마감 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...

~에 의해 |5월 2025 |200K context |$15.00/M input |$75.00/M output
200K tokens

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...

~에 의해 |5월 2025 |1M context |$3.00/M input |$15.00/M output
1M tokens

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...

~에 의해 |5월 2025 |33K context |$0.0600/M input |$0.1200/M output

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances state-of-the-art reasoning and multimodal performance with 8× lower cost...

~에 의해 |5월 2025 |131K context |$0.4000/M input |$2.00/M output
131K tokens

쌍둥이자리 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, 코딩, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

~에 의해 |5월 2025 |1M context |$1.25/M input |$10.00/M output
1M tokens

Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70 B peers, it retains the 128 k...

~에 의해 |5월 2025 |131K context |$0.7500/M input |$1.20/M output
131K tokens

Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora. It supports a 32k context window, enabling multi‑file...

~에 의해 |5월 2025 |33K context |$0.5000/M input |$0.8000/M output

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

~에 의해 |4월 2025 |164K context |$0.1800/M input |$0.1800/M output
164K tokens

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique...

~에 의해 |4월 2025 |131K context |$0.1200/M input |$0.5000/M output
131K tokens

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...

~에 의해 |4월 2025 |131K context |$0.1170/M input |$0.4550/M output
131K tokens

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...

~에 의해 |4월 2025 |132K context |$0.1000/M input |$0.2400/M output
132K tokens

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...

~에 의해 |4월 2025 |131K context |$0.0800/M input |$0.2800/M output
131K tokens

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and...

~에 의해 |4월 2025 |131K context |$0.4550/M input |$1.82/M output
131K tokens

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining...

~에 의해 |4월 2025 |200K context |$1.10/M input |$4.40/M output
200K tokens

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, 코딩, and visual reasoning tasks. It also excels at technical writing and instruction-following....

~에 의해 |4월 2025 |200K context |$2.00/M input |$8.00/M output
200K tokens