AI Models

20 models, Free & Paid. Updated: 7 hours ago

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including…

by |April 2026 |262K context |$0.13/M input |$0.40/M output
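The "only 3.8B of 25.2B parameters activate per token" claim above translates directly into a per-token compute ratio. A rough first-order sketch, using the figures from this listing and the common 2·N FLOPs-per-token rule of thumb (an approximation, not a vendor figure):

```python
# Rough illustration: per-token compute of a sparse MoE model scales with
# the parameters that actually activate, not the total parameter count.
# Figures come from the Gemma 4 26B A4B IT listing above; the "2 * params"
# FLOPs-per-token estimate is a rule of thumb, not a published spec.
TOTAL_PARAMS_B = 25.2   # total parameters (billions)
ACTIVE_PARAMS_B = 3.8   # parameters activated per token (billions)

flops_per_token_dense = 2 * TOTAL_PARAMS_B * 1e9  # if the model were dense
flops_per_token_moe = 2 * ACTIVE_PARAMS_B * 1e9   # with MoE routing

ratio = flops_per_token_moe / flops_per_token_dense
print(f"MoE activation uses ~{ratio:.0%} of the equivalent dense compute per token")
```

This is the sense in which the entry can claim "a fraction of the compute cost": roughly 15% of the dense-model FLOPs per generated token, before accounting for routing overhead and memory traffic.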

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding,…

by |April 2026 |262K context |$0.14/M input |$0.40/M output
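The per-million-token rates in these listings turn into a request cost estimate with one line of arithmetic. A minimal sketch; the rates are the Gemma 4 31B Instruct figures from this listing, and the token counts are hypothetical:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in dollars, given per-million-token rates ($/M)."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Gemma 4 31B Instruct rates from the listing: $0.14/M input, $0.40/M output.
# A hypothetical request with 50K input tokens and 2K output tokens:
cost = request_cost(50_000, 2_000, input_rate=0.14, output_rate=0.40)
print(f"${cost:.4f}")
```

Note the asymmetry typical of these listings: output tokens cost roughly 3x input tokens, so long generations dominate the bill even when the prompt is much larger.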

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers major gains in agentic coding, front-end development, and overall reasoning,…

by |April 2026 |1M context |Free input |Free output

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding, and task execution, and works seamlessly with agents to complete…

by |April 2026 |203K context |$1.20/M input |$4.00/M output

Trinity Large Thinking is a powerful open-source reasoning model from the team at Arcee AI. It shows strong performance on PinchBench, agentic workloads, and reasoning tasks. It is free in OpenClaw for the first five days. Launch video:…

by |April 2026 |262K context |$0.22/M input |$0.85/M output

Grok 4.20 Multi-Agent is a variant of xAI's Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents work in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks. Reasoning effort behavior: - low / medium:…

by |March 2026 |2M context |$2.00/M input |$6.00/M output

Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool-calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering consistently precise and truthful responses. Reasoning can be enabled/disabled using the…

by |March 2026 |2M context |$2.00/M input |$6.00/M output

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images. These models…

by |March 2026 |1M context |Free input |Free output

30-second clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images.…

by |March 2026 |1M context |Free input |Free output
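Lyria 3 is billed per generated item rather than per token, so batch cost is a simple multiplication. A sketch using the two prices from the listings above (the batch sizes are hypothetical):

```python
# Per-item prices from the Lyria 3 listings above.
SONG_PRICE = 0.08   # one full-length song
CLIP_PRICE = 0.04   # one 30-second clip

def lyria_cost(songs: int = 0, clips: int = 0) -> float:
    """Total cost in dollars for a batch of Lyria 3 generations."""
    return songs * SONG_PRICE + clips * CLIP_PRICE

# e.g. a batch of 10 full songs plus 25 short clips:
print(f"${lyria_cost(songs=10, clips=25):.2f}")  # 10*0.08 + 25*0.04 = 1.80
```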

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions, with a focus on large-scale production environments, multi-system coordination, and…

by |March 2026 |256K context |$0.30/M input |$1.20/M output

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding, video analysis, object detection, and agentic tool-use.

by |March 2026 |16K context |$0.10/M input |$0.10/M output

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step planning, tool use, and code execution - making it well-suited…

by |March 2026 |262K context |$0.40/M input |$2.00/M output

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the…

by |March 2026 |1M context |$1.00/M input |$3.00/M output

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent collaboration, enabling it to plan, execute, and refine complex tasks and…

by |March 2026 |205K context |$0.30/M input |$1.20/M output

GPT-5.4 nano is the lightest and most cost-efficient variant of the GPT-5.4 family, optimized for high-volume tasks where speed matters. It supports text and image input and is designed for low-latency use cases such as classification, data extraction, ranking, and sub-agent…

by |March 2026 |400K context |$0.20/M input |$1.25/M output

GPT-5.4 mini delivers the core capabilities of GPT-5.4 in a faster, more efficient model optimized for high-throughput workloads. It supports text and image input, with strong performance across reasoning, coding, and tool use while reducing latency and cost for large-scale tasks…

by |March 2026 |400K context |$0.75/M input |$4.50/M output

Mistral Small 4 is the next major release in the Mistral Small family, consolidating the capabilities of several flagship Mistral models into a single system. It combines the strong reasoning of Magistral, the multimodal understanding of Pixtral, and agentic coding capabilities.

by |March 2026 |262K context |$0.15/M input |$0.60/M output

GLM-5 Turbo is Z.ai's new model designed for fast inference and strong performance in agent-centric environments such as OpenClaw scenarios. It is optimized for real-world agent workflows with long execution chains, featuring improved decomposition of complex instructions and tool…

by |March 2026 |203K context |$1.20/M input |$4.00/M output

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it delivers over 50%…

by |March 2026 |262K context |Free input |Free output

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it delivers over 50%…

by |March 2026 |262K context |$0.10/M input |$0.50/M output