AI Models
gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized...
gpt-oss-20b is an open-weight 21B-parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for...
Claude Opus 4.1 is an updated version of Anthropic's flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains...
Mistral's cutting-edge language model for coding released end of July 2025. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middle (FIM), code correction and test generation. [Blog Post](https://mistral.ai/news/codestral-25-08)
Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the...
Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference. It operates in non-thinking mode and is designed for high-quality instruction following, multilingual understanding, and...
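GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly...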
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...
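Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144...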
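GLM 4 32B is a cost-effective foundation language model. It can efficiently perform complex tasks and has significantly enhanced capabilities in tool use, online search, and code-related intelligent tasks. It...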
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over...
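UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Built by ByteDance, it builds upon the UI-TARS framework with reinforcement...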
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...
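Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned Mixture-of-Experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...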
Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you...
Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for...
Devstral Medium is a high-performance code generation and agentic reasoning model developed jointly by Mistral AI and All Hands AI. Positioned as a step up from Devstral Small, it achieves...
Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI. Finetuned from Mistral Small 3.1 and...
Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving...
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark...
Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transformations. The model requires the prompt to be in the following format: {instruction} {initial_code}...
Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The model requires the prompt to be in the following format: {instruction} {initial_code} {edit_snippet}...
ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data...
ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE 4.5 series. It activates 47B parameters per token and supports text generation in...
Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...
MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom "lightning attention" mechanism, allowing it...
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, math, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, math, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...
The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently...
May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active...
Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...
Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...
Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...
Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances state-of-the-art reasoning and multimodal performance with 8× lower cost...
Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL and fine‑tuned by Arcee AI for tight image‑text grounding tasks. It offers a 32 k‑token context window, enabling rich multimodal...
Maestro Reasoning is Arcee's flagship analysis model: a 32 B‑parameter derivative of Qwen 2.5‑32 B tuned with DPO and chain‑of‑thought RL for step‑by‑step logic. Compared to the earlier 7 B...
Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70 B peers, it retains the 128 k...
Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora. It supports a 32k context window, enabling multi‑file...
Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...
Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique...
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...
Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...
Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...
Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and...
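Several of the Mixture-of-Experts entries in this list quote both a total and an active parameter count. As a rough illustration only, the share of parameters active per forward pass can be computed from those quoted figures. The sketch below uses only numbers stated in the descriptions; the names `MOE_MODELS`, `active_share`, and `rank_by_sparsity` are hypothetical helpers, not part of any provider's API, and the ratio says nothing by itself about quality, latency, or price.

```python
# Parameter counts in billions as (total, active), copied from the
# descriptions in this list. Helper names are illustrative assumptions.
MOE_MODELS = {
    "gpt-oss-120b": (117.0, 5.1),
    "gpt-oss-20b": (21.0, 3.6),
    "Qwen3-30B-A3B-Instruct-2507": (30.5, 3.3),
    "Qwen3-235B-A22B": (235.0, 22.0),
    "Kimi K2 Instruct": (1000.0, 32.0),
    "Hunyuan-A13B": (80.0, 13.0),
    "ERNIE-4.5-VL-424B-A47B": (424.0, 47.0),
    "ERNIE-4.5-300B-A47B": (300.0, 47.0),
    "DeepSeek R1 (May 28th update)": (671.0, 37.0),
}


def active_share(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters active per forward pass."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active_b / total_b


def rank_by_sparsity(models: dict) -> list:
    """Return (name, share) pairs sorted from sparsest to densest."""
    pairs = [(name, active_share(t, a)) for name, (t, a) in models.items()]
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, share in rank_by_sparsity(MOE_MODELS):
        print(f"{name:32s} {share:6.1%}")
```

Running the script prints one line per model, ordered from the smallest active share to the largest.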







