Reasoning Models

Use models with advanced reasoning capabilities (Chain-of-Thought).

Some models support reasoning: the model "thinks" before answering, which produces more accurate answers on complex problems.

Models That Support Reasoning

Model	Provider	Notes
`openai/o1`	OpenAI	Strongest reasoning; suited to math and logic
`openai/o1-mini`	OpenAI	Faster reasoning, cheaper than o1
`openai/o3-mini`	OpenAI	Generation-3 reasoning; balances speed and quality
`deepseek/deepseek-r1`	DeepSeek	Open-source reasoning model
`anthropic/claude-sonnet-4`	Anthropic	Extended thinking via the `thinking` parameter
`google/gemini-2.5-flash-preview`	Google	Fast, low-cost reasoning

Usage

Reasoning models are called the same way as Chat Completion; some models additionally return a separate `reasoning_content` field. Example request:

json

{
  "model": "deepseek/deepseek-r1",
  "messages": [
    {
      "role": "user",
      "content": "If 3 workers build 1 wall in 4 hours, how long do 6 workers take to build 2 walls?"
    }
  ],
  "temperature": 0
}
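
The request body above can be sent like any other Chat Completion call. A minimal Python sketch follows, assuming an OpenAI-compatible `/chat/completions` endpoint with Bearer authentication. The base URL `https://api.example.com/v1`, the `API_KEY` environment variable, and the helper name `build_reasoning_request` are placeholders for illustration, not values taken from this documentation.

```python
import json
import os
import urllib.request

# Hypothetical values: replace with your provider's real base URL and key.
BASE_URL = "https://api.example.com/v1"
API_KEY = os.environ.get("API_KEY", "sk-placeholder")


def build_reasoning_request(
    question: str, model: str = "deepseek/deepseek-r1"
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for a reasoning model."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": 0,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the endpoint uses Bearer token authentication.
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = build_reasoning_request(
    "If 3 workers build 1 wall in 4 hours, how long do 6 workers take to build 2 walls?"
)

# To actually send it (reasoning models can be slow, so allow a long timeout):
# with urllib.request.urlopen(request, timeout=300) as resp:
#     data = json.load(resp)
```

Building the request separately from sending it makes the body easy to inspect or log before any tokens are spent.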

Response with Reasoning

Some models (DeepSeek R1, Claude with extended thinking) return `reasoning_content` inside the message:

json

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "reasoning_content": "Step 1: 3 workers build 1 wall in 4 h → 1 wall = 3 × 4 = 12 worker-hours\nStep 2: 2 walls = 24 worker-hours\nStep 3: 6 workers → 24 / 6 = 4 hours",
        "content": "6 workers take **4 hours** to build 2 walls."
      },
      "finish_reason": "stop"
    }
  ]
}
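
Since only some models return `reasoning_content`, client code should treat the field as optional. The sketch below shows one way to separate the reasoning from the final answer; the function name `split_reasoning` is illustrative, and the sample dictionary is a shortened version of the response above.

```python
from typing import Optional, Tuple


def split_reasoning(response: dict) -> Tuple[Optional[str], str]:
    """Return (reasoning, answer) from the first choice of a chat completion response."""
    message = response["choices"][0]["message"]
    # reasoning_content is optional: only some models return it.
    reasoning = message.get("reasoning_content")
    answer = message.get("content") or ""
    return reasoning, answer


sample = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "reasoning_content": "Step 1: 1 wall = 12 worker-hours",
                "content": "6 workers take **4 hours** to build 2 walls.",
            },
            "finish_reason": "stop",
        }
    ]
}

reasoning, answer = split_reasoning(sample)
print(answer)
if reasoning is not None:
    print("--- reasoning ---")
    print(reasoning)
```

Returning `None` when the field is absent lets the same code handle models with and without visible reasoning.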

ℹ️ Reasoning tokens: Reasoning tokens are counted in `completion_tokens` under `usage`. Cost is higher because the model generates more tokens than appear in the visible output.
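
Because reasoning tokens are billed as part of `completion_tokens`, a cost estimate should start from the `usage` totals rather than from the length of the visible answer. The sketch below assumes an OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens`; the token counts and per-million-token prices are made-up placeholders, not real rates.

```python
def estimate_cost(
    usage: dict, input_price_per_m: float, output_price_per_m: float
) -> float:
    """Estimate the cost of one request from its usage object.

    completion_tokens already includes the reasoning tokens, so no
    separate term for reasoning is needed.
    """
    prompt_cost = usage["prompt_tokens"] * input_price_per_m / 1_000_000
    completion_cost = usage["completion_tokens"] * output_price_per_m / 1_000_000
    return prompt_cost + completion_cost


# Illustrative numbers only: a short visible answer can still carry
# a large completion_tokens count because of the reasoning.
usage = {"prompt_tokens": 40, "completion_tokens": 1200, "total_tokens": 1240}
cost = estimate_cost(usage, input_price_per_m=1.0, output_price_per_m=4.0)
print(f"Estimated cost: {cost:.6f}")
```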

Extended Thinking (Claude)

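With Claude Sonnet 4, you can enable extended thinking with the `thinking` parameter. The example below shows the rest of the request body, using `temperature: 1` and a large `max_tokens`: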

json

{
  "model": "anthropic/claude-sonnet-4",
  "messages": [
    { "role": "user", "content": "Analyze this optimization problem in detail..." }
  ],
  "temperature": 1,
  "max_tokens": 16000
}
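
The request above does not show the `thinking` field itself. As a sketch only, the snippet below adds it in the shape used by Anthropic's native Messages API (`{"type": "enabled", "budget_tokens": N}`). Whether this endpoint accepts the same shape is an assumption, and the function name and the default `budget_tokens` of 8000 are arbitrary illustrative choices.

```python
import json


def build_extended_thinking_body(
    prompt: str, budget_tokens: int = 8000, max_tokens: int = 16000
) -> dict:
    """Build a request body with extended thinking enabled (assumed field shape)."""
    if budget_tokens >= max_tokens:
        # The thinking budget has to leave room for the visible answer.
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "model": "anthropic/claude-sonnet-4",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1,
        "max_tokens": max_tokens,
        # Assumption: mirrors Anthropic's native `thinking` parameter.
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


body = build_extended_thinking_body("Analyze this optimization problem in detail...")
print(json.dumps(body, indent=2))
```

Check your provider's reference for the accepted field names before relying on this shape.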

Best Practices for Reasoning

Avoid overly detailed system prompts: reasoning models reason better when given room to work things out themselves.
Set `temperature: 0` for logic, math, and code problems (the Claude extended thinking example above is an exception and uses `temperature: 1`).
Increase `max_tokens`: reasoning consumes more tokens than a normal reply.
Be patient: reasoning models need time to think, so latency is higher than with standard chat.
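
The practices above can be folded into a small helper. This is a sketch under assumptions: the function name `reasoning_request`, the default `max_tokens` of 8000, and the 300-second timeout are illustrative choices, not values recommended by this documentation.

```python
from typing import Optional

# Illustrative defaults; tune them for your own workload.
DEFAULT_MAX_TOKENS = 8000       # leave room for reasoning tokens
DEFAULT_TIMEOUT_SECONDS = 300   # reasoning models respond more slowly


def reasoning_request(
    model: str, question: str, system_hint: Optional[str] = None
) -> dict:
    """Return a request body plus a client timeout that follow the practices above."""
    messages = []
    if system_hint:
        # Keep any system prompt short; it is optional and omitted by default.
        messages.append({"role": "system", "content": system_hint})
    messages.append({"role": "user", "content": question})
    return {
        "body": {
            "model": model,
            "messages": messages,
            "temperature": 0,  # for logic, math, and code problems
            "max_tokens": DEFAULT_MAX_TOKENS,
        },
        "timeout": DEFAULT_TIMEOUT_SECONDS,
    }


req = reasoning_request("deepseek/deepseek-r1", "Is 221 a prime number?")
print(req["body"]["max_tokens"], req["timeout"])
```

Keeping the timeout next to the body makes it harder to forget that a reasoning call needs a longer wait than a standard chat call.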
