AI Models

356 Models Free & Paid Updated: 16 hours ago

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7...

By |Apr 2026 |262K context |Free input |Free output
262K tokens

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...

By |Mar 2026 |2M context |$2.00/M input |$6.00/M output
2M tokens

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering...

By |Mar 2026 |2M context |$1.25/M input |$2.50/M output
2M tokens

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz...

By |Mar 2026 |1M context |Free input |Free output
1M tokens

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate...

By |Mar 2026 |1M context |Free input |Free output
1M tokens

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions,...

By |Mar 2026 |256K context |$0.3000/M input |$1.20/M output
256K tokens

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding,...

By |Mar 2026 |16K context |$0.1000/M input |$0.1000/M output

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

By |Mar 2026 |262K context |$0.4000/M input |$2.00/M output
262K tokens

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like...

By |Mar 2026 |1M context |$1.00/M input |$3.00/M output
1M tokens

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...

By |Mar 2026 |205K context |$0.2790/M input |$1.20/M output
205K tokens

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-latency...

By |Mar 2026 |400K context |$0.2000/M input |$1.25/M output
400K tokens

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding,...

By |Mar 2026 |400K context |$0.7500/M input |$4.50/M output
400K tokens

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It combines strong reasoning from...

By |Mar 2026 |262K context |$0.1500/M input |$0.6000/M output
262K tokens

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows...

By |Mar 2026 |203K context |$1.20/M input |$4.00/M output
203K tokens

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

By |Mar 2026 |1M context |Free input |Free output
1M tokens

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

By |Mar 2026 |1M context |$0.0900/M input |$0.4500/M output
1M tokens

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latency, making it a practical default choice for most production workloads across...

By |Mar 2026 |262K context |$0.2500/M input |$2.00/M output
262K tokens

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

By |Mar 2026 |262K context |$0.0400/M input |$0.1500/M output
262K tokens

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922K input, 128K...

By |Mar 2026 |1.1M context |$30.00/M input |$180.00/M output
1.1M tokens

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for...

By |Mar 2026 |1.1M context |$2.50/M input |$15.00/M output
1.1M tokens

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...

By |Mar 2026 |128K context |$0.2500/M input |$0.7500/M output
128K tokens

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualization and significantly...

By |Mar 2026 |128K context |$1.75/M input |$14.00/M output
128K tokens

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across...

By |Mar 2026 |1M context |$0.2500/M input |$1.50/M output
1M tokens

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256k context, four reasoning effort modes (minimal/low/medium/high), multimodal understanding,...

By |Feb 2026 |262K context |$0.1000/M input |$0.4000/M output
262K tokens

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines...

By |Feb 2026 |131K context |$0.5000/M input |$3.00/M output
131K tokens

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall...

By |Feb 2026 |262K context |$0.1400/M input |$1.00/M output
262K tokens

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of...

By |Feb 2026 |262K context |$0.1950/M input |$1.56/M output
262K tokens

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of...

By |Feb 2026 |262K context |$0.2600/M input |$2.08/M output
262K tokens

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

By |Feb 2026 |1M context |$0.0650/M input |$0.2600/M output
1M tokens

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...

By |Feb 2026 |128K context |$0.0300/M input |$0.1200/M output
128K tokens

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...

By |Feb 2026 |1M context |$2.00/M input |$12.00/M output
1M tokens

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...

By |Feb 2026 |400K context |$1.75/M input |$14.00/M output
400K tokens

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging....

By |Feb 2026 |131K context |$0.8000/M input |$1.60/M output
131K tokens

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

By |Feb 2026 |1M context |$2.00/M input |$12.00/M output
1M tokens

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...

By |Feb 2026 |1M context |$3.00/M input |$15.00/M output
1M tokens

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of...

By |Feb 2026 |1M context |$0.2600/M input |$1.56/M output
1M tokens

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...

By |Feb 2026 |262K context |$0.3900/M input |$2.34/M output
262K tokens

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...

By |Feb 2026 |205K context |Free input |Free output
205K tokens

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...

By |Feb 2026 |205K context |$0.1500/M input |$1.15/M output
205K tokens

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading...

By |Feb 2026 |203K context |$0.6000/M input |$1.92/M output
203K tokens

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...

By |Feb 2026 |262K context |$0.7800/M input |$3.90/M output
262K tokens

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective...

By |Feb 2026 |1M context |$5.00/M input |$25.00/M output
1M tokens

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per...

By |Feb 2026 |262K context |$0.1100/M input |$0.8000/M output
262K tokens

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that...

By |Feb 2026 |200K context |Free input |Free output
200K tokens

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token....

By |Jan 2026 |262K context |$0.1000/M input |$0.3000/M output
262K tokens

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters per token using 4-of-256 expert routing. It excels in creative writing,...

By |Jan 2026 |131K context |$0.1500/M input |$0.4500/M output
131K tokens

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed...

By |Jan 2026 |262K context |$0.4000/M input |$1.90/M output
262K tokens

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized...

By |Jan 2026 |128K context |$0.1500/M input |$0.6000/M output
128K tokens

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...

By |Jan 2026 |66K context |$0.3000/M input |$1.20/M output
66K tokens

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million...

By |Jan 2026 |1M context |$0.6000/M input |$6.00/M output
1M tokens
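
The per-million-token rates listed for each model can be turned into a rough per-request estimate. The sketch below is illustrative only: the function name `estimate_cost` and the token counts are hypothetical, and it assumes a simple linear rate with no caching discounts, tiered pricing, or per-request fees, which a given provider may or may not apply. The rates used are the ones listed above for Palmyra X5.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: str, output_price_per_m: str) -> Decimal:
    """Estimate the USD cost of one request from per-million-token rates.

    Hypothetical helper: assumes cost scales linearly with token count.
    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(input_price_per_m)
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(output_price_per_m)
    return input_cost + output_cost


# Listed rates for the Palmyra X5 entry: $0.6000/M input, $6.00/M output.
# The token counts here are made up for illustration.
cost = estimate_cost(200_000, 10_000, "0.6000", "6.00")
print(f"Estimated cost: ${cost:.4f}")
```

Entries marked as free on both input and output would correspond to passing "0" for both rates.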