Reflexiones sobre los modelos abiertos

Algunos modelos abiertos admiten un modo de "pensamiento" que les permite razonar paso a paso antes de dar una respuesta final. Esto es útil para tareas que requieren una lógica transparente, como demostraciones matemáticas, depuración de código complejo o planificación de agentes de varios pasos.

Información específica de modelos

En las siguientes secciones se ofrecen directrices específicas de los modelos para reflexionar.

DeepSeek R1 0528

En DeepSeek R1 0528, el razonamiento se incluye entre etiquetas <think></think> en el campo content. No hay ningún campo reasoning_content.

Petición de ejemplo:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/endpoints/openapi/chat/completions -d '{
  "model": "deepseek-ai/deepseek-r1-0528-maas",
  "messages": [{
    "role": "user",
    "content": "Who are you?"
  }]
}'

Respuesta de ejemplo:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "<think>\nHmm, the user just asked \u201cWho are you?\u201d -
        a simple but fundamental question. \n\nFirst, let's consider the
        context. This is likely their first interaction, so they're probably
        testing the waters or genuinely curious about what I am. The phrasing is
        neutral - no urgency or frustration detected. \n\nI should keep my
        response warm and informative without being overwhelming. Since they
        didn't specify language preference, I'll default to English but note
        they might be multilingual. \n\nKey points to cover:\n- My identity
        (DeepSeek-R1)\n- My capabilities \n- My purpose (being helpful)\n- Tone:
        friendly with emojis to seem approachable\n- Ending with an open
        question to continue conversation\n\nThe smiley face feels appropriate
        here - establishes friendliness. Mentioning \u201cAI assistant\u201d
        upfront avoids confusion. Listing examples of what I can do gives
        concrete value. Ending with \u201cHow can I help?\u201d turns it into an
        active conversation starter. \n\nNot adding too much technical detail
        (like model specs) since that might overwhelm a first-time user. The
        \u201calways learning\u201d phrase subtly manages expectations about my
        limitations.\n</think>\nI'm DeepSeek-R1, your friendly AI assistant!
        \ud83d\ude0a \nI'm here to help you with questions, ideas, problems, or
        just a chat\u2014whether it's about homework, writing, coding, career
        advice, or fun trivia. I can read files (like PDFs, Word, Excel, etc.),
        browse the web for you (if enabled), and I'm always learning to be more
        helpful!\n\nSo\u2026 how can I help you today? \ud83d\udcac\u2728",
        "role": "assistant"
      }
    }
  ],
  "created": 1758229877,
  "id": "dXXMaKrsKqaQm9IPsLeTkAQ",
  "model": "deepseek-ai/deepseek-r1-0528-maas",
  "object": "chat.completion",
  "system_fingerprint": "",
  "usage": {
    "completion_tokens": 329,
    "prompt_tokens": 9,
    "total_tokens": 338
  }
}

DeepSeek v3.1

DeepSeek-V3.1 admite el razonamiento híbrido, que te permite activar o desactivar el razonamiento. De forma predeterminada, el razonamiento está desactivado. Para habilitarlo, incluye "chat_template_kwargs": { "thinking": true } en tu solicitud.

Cuando el razonamiento está habilitado, el texto de reflexión aparece al principio del campo content, seguido de </think> y, a continuación, el texto de la respuesta (por ejemplo, THINKING_TEXT</think>ANSWER_TEXT). Ten en cuenta que no hay ninguna etiqueta <think> inicial.

Petición de ejemplo:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-west2-aiplatform.googleapis.com/v1/projects/test-project/locations/us-west2/endpoints/openapi/chat/completions -d '{
  "model": "deepseek-ai/deepseek-v3.1-maas",
  "messages": [{
    "role": "user",
    "content": "What are the first 3 even prime numbers?"
  }],
  "chat_template_kwargs": {
    "thinking": true
  }
}'

Respuesta de ejemplo:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "matched_stop": 1,
      "message": {
        "content": "SNIPPED</think>The only even prime number is 2. By
        definition, a prime number is a natural number greater than 1 that has
        no positive divisors other than 1 and itself. Since all even numbers
        other than 2 are divisible by 2 and themselves, they have at least three
        divisors and are not prime. Therefore, there are no second or third even
        prime numbers. The concept of \"first three even prime numbers\" is not
        applicable beyond the first one.",
        "reasoning_content": null,
        "role": "assistant",
        "tool_calls": null
      }
    }
  ],
  "created": 1758229458,
  "id": "ynPMaIWPN-KeltsPh87s0Qg",
  "model": "deepseek-ai/deepseek-v3.1-maas",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 1525,
    "prompt_tokens": 17,
    "prompt_tokens_details": null,
    "total_tokens": 1542
  }
}

Qwen3-Next Thinking

En el caso de los modelos Qwen3 - Next Thinking, los tokens de razonamiento se encuentran en el campo reasoning_content y el texto de respuesta normal, en el campo content.

Petición de ejemplo:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://aiplatform.googleapis.com/v1/projects/test-project/locations/global/endpoints/openapi/chat/completions -d '{
  "model": "qwen/qwen3-next-80b-a3b-thinking-maas",
  "messages": [{
    "role": "user",
    "content": "Who are you?"
  }]
}'

Respuesta de ejemplo:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "matched_stop": 151645,
      "message": {
        "content": "Hello! I'm Qwen, a large language model developed by Alibaba
        Cloud. I can help answer questions, write stories, emails, scripts,
        perform logical reasoning, code, and more. How can I assist you today?
        \ud83d\ude0a",
        "reasoning_content": "Okay, the user asked, Who are you?
        I need to respond appropriately. First, I should state my name clearly:
        I am Qwen, a large language model developed by Alibaba Cloud. Then,
        maybe mention my capabilities, like answering questions, writing
        stories, emails, scripts, logical reasoning, programming, etc. Also,
        highlight that I can express views and play games. But keep it concise
        since the user might just want a brief intro.\n\nWait, the user might
        not know much about me, so I should explain what I can do in simple
        terms. Maybe add something about being trained on a large amount of
        data, so I can handle various topics. But don't get too technical. Also,
        maybe mention that I'm here to help. Should I ask if they have any
        specific questions? That might be good to invite further
        interaction.\n\nLet me check the structure. Start with a greeting,
        introduce myself, list key abilities, and offer assistance. Keep it
        friendly and approachable. Avoid jargon. Maybe say something like \"I'm
        Qwen, a large language model created by Alibaba Cloud. I can help with
        answering questions, writing, coding, and more. How can I assist you
        today?\" That seems good.\n\nWait, the original response in Chinese
        might be different, but the user asked in English, so the response
        should be in English. Let me confirm. The user's query is in English, so
        I should respond in English. Yes.\n\nAlso, check for any specific
        details. Maybe mention that I have 128K context length, but maybe that's
        too technical. Maybe just say \"I have a large context window\" or not
        necessary. For a basic introduction, maybe stick to the main points
        without too many specs. The user might not need the technical details
        right away.\n\nSo, the response should be: Hello! I'm Qwen, a large
        language model developed by Alibaba Cloud. I can help answer questions,
        write stories, emails, scripts, perform logical reasoning, code, and
        more. How can I assist you today? That's concise and covers the main
        points.\n\nWait, in the initial problem statement, the user's question
        is \"Who are you?\" So the standard response would be to introduce
        myself. Yes. Also, maybe add that I'm an AI language model, but the name
        Qwen is important. So the key elements are name, developer,
        capabilities, and offer help.\n\nYes. Let me make sure there's no
        mistake. Alibaba Cloud is the correct developer. Yes. So the response
        should be accurate. Alright, that seems good.\n",
        "role": "assistant",
        "tool_calls": null
      }
    }
  ],
  "created": 1758229652,
  "id": "knTMaJC0EJfM5OMP7I3xkAk",
  "metadata": {
    "weight_version": "default"
  },
  "model": "qwen/qwen3-next-80b-a3b-thinking-maas",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 573,
    "prompt_tokens": 14,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0,
    "total_tokens": 587
  }
}

GPT OSS

En los modelos de OSS de GPT, el texto de reflexión se encuentra en el campo reasoning_content y el texto de respuesta normal, en el campo content. Estos modelos también admiten el parámetro reasoning_effort.

Petición de ejemplo:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/endpoints/openapi/chat/completions -d '{
  "model": "openai/gpt-oss-120b-maas",
  "messages": [{
    "role": "user",
    "content": "Who are you?"
  }],
  "reasoning_effort": "high"
}'

Respuesta de ejemplo:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "matched_stop": 200002,
      "message": {
        "content": "I\u2019m ChatGPT, a large language model created by OpenAI
        (based on the GPT\u20114 architecture). I\u2019m here to help answer
        your questions, brainstorm ideas, explain concepts, and assist with a
        wide variety of topics. Let me know what you\u2019d like to talk
        about!",
        "reasoning_content": "The user asks: \"Who are you?\" It's a simple
        question. They want to know who the assistant is. According to policy,
        we should respond with a brief, friendly description: \"I am ChatGPT, a
        large language model trained by OpenAI.\" Possibly mention the version
        (GPT-4). Also mention we are here to assist. There's no request for
        disallowed content. So a straightforward answer. Possibly ask if they
        need help. Provide a short intro.\n\nThus answer: \"I am ChatGPT, a
        large language model trained by OpenAI. I'm here to help answer your
        questions...\"\n\nWe should avoid disallowed content. It's fine.\n\nThus
        final answer.",
        "role": "assistant",
        "tool_calls": null
      }
    }
  ],
  "created": 1758229799,
  "id": "JnXMaJq6HLGzquEP4OD18Qg",
  "metadata": {
    "weight_version": "default"
  },
  "model": "openai/gpt-oss-120b-maas",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 201,
    "prompt_tokens": 71,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0,
    "total_tokens": 272
  }
}

Siguientes pasos