Structured Outputs ignored by openai/gpt-oss-120b

Summary
response_format: { type: "json_schema", ... } (including strict: true) is ignored by openai/gpt-oss-120b. Instead of returning JSON that conforms to the schema, the model returns free-form text. This used to work, so it looks like a regression (possibly introduced around the caching changes).

Environment / Endpoint
Chat Completions API (/openai/v1/chat/completions), model openai/gpt-oss-120b.

Docs reference
Reproduced with the minimal example from your Structured Outputs docs: Structured Outputs - GroqDocs.

Steps to Reproduce

  1. Send the following request:
{
  "model": "openai/gpt-oss-120b",
  "messages": [
    { "role": "system", "content": "Extract product review information from the text." },
    { "role": "user", "content": "what time is now?" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "product_review",
      "schema": {
        "type": "object",
        "properties": {
          "product_name": { "type": "string" },
          "rating": { "type": "number" },
          "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
          "key_features": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["product_name", "rating", "sentiment", "key_features"],
        "additionalProperties": false,
        "strict": true
      }
    }
  }
}
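
For completeness, the same request can be sent with a minimal Python script (a sketch, assuming the standard Groq base URL https://api.groq.com/openai/v1 and a GROQ_API_KEY environment variable; the payload is identical to the JSON above):

import json
import os

import requests

# Assumed endpoint; matches the /openai/v1/chat/completions path mentioned above.
URL = "https://api.groq.com/openai/v1/chat/completions"

# Exact request body from the reproduction above.
payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "Extract product review information from the text."},
        {"role": "user", "content": "what time is now?"},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "product_review",
            "schema": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "rating": {"type": "number"},
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                    "key_features": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["product_name", "rating", "sentiment", "key_features"],
                "additionalProperties": False,
                "strict": True,
            },
        },
    },
}

resp = requests.post(
    URL,
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json=payload,
)
print(json.dumps(resp.json(), indent=2))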

Actual Result

{
  "id": "chatcmpl-88021918-680b-4483-9d22-3a8eda9cd7b4",
  "object": "chat.completion",
  "created": 1760997747,
  "model": "openai/gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I’m not able to access real-time information, so I can’t tell you the current time. You can check a clock, phone, or computer for the most accurate time.",
        "reasoning": "…"
      },
      "finish_reason": "stop"
    }
  ],
  "x_groq": { "id": "req_01k81ta988ffkv52194qcebypp" },
  "service_tier": "on_demand"
}

Expected Result
Either a valid JSON object that matches the schema, e.g.:

{
  "product_name": "",
  "rating": 0,
  "sentiment": "neutral",
  "key_features": []
}

— or an explicit error/refusal stating that the model cannot produce output matching the provided JSON Schema. In either case, there should be no free-form prose outside the JSON.

Notes

  • Reproduces 100% of the time with the minimal example from your Structured Outputs documentation.
  • response_format.json_schema appears to be ignored by openai/gpt-oss-120b (including strict: true and additionalProperties: false).
  • Presence of message.reasoning suggests a response mode incompatible with Structured Outputs.

Impact
Breaks downstream parsing and validation; machine-readable output cannot be relied upon.
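
To illustrate (not part of the original report): a typical downstream consumer assumes message.content is schema-valid JSON, so the free-form reply above fails at the very first parse:

import json

# Trimmed copy of the "Actual Result" above; content is prose, not JSON.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "I'm not able to access real-time information, so I can't tell you the current time.",
            }
        }
    ]
}

content = response["choices"][0]["message"]["content"]
try:
    review = json.loads(content)            # succeeds when Structured Outputs is honored
    print(review["sentiment"])
except json.JSONDecodeError as exc:
    print(f"Not machine-readable: {exc}")   # what actually happens with the prose reply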

Request
Could you please take a look and let me know if this is a known issue? I’d really appreciate any guidance or a suggested workaround—e.g., a model that reliably enforces JSON Schema or a parameter/flag that ensures response_format is honored. I’m happy to share additional logs, run targeted tests on my side, or help validate a fix. Thanks a lot for your help!


Thank you for the reproduction. This is a problem in our harness, and I’ll ask the team to take a look.

We’ll soon be rolling out constrained decoding, which will fix some of these issues. But even with constrained decoding in place, a rogue prompt like “what time is it” may still get a rogue answer if the model doesn’t know how to follow the prompt.

I’m curious though, what do you think the expected output should be here, an empty object? Or should the LLM make something up that fits the schema and try to answer the original system prompt and the question?

If I were building out this app for myself, instead of waiting for our constrained decoding to roll out, I’d probably add a small “router” model that discards prompts that don’t fit the use case (e.g. asking for the time when the task is product review extraction), and then another harness at the end that checks whether the output matches the schema (with zod/pydantic) and, if not, retries or logs and throws an error. For that to work, though, I’d need an expected result to check against, and I’m not really sure what that is in this case.
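
A rough sketch of the schema-checking half of that harness, assuming Python with pydantic v2 and the OpenAI-compatible SDK pointed at Groq (the ProductReview class, retry count, and extract_review helper are illustrative, not an existing API):

import os
from typing import List, Literal

from openai import OpenAI
from pydantic import BaseModel, ValidationError

# Mirrors the product_review JSON schema from the original report.
class ProductReview(BaseModel):
    product_name: str
    rating: float
    sentiment: Literal["positive", "negative", "neutral"]
    key_features: List[str]

client = OpenAI(base_url="https://api.groq.com/openai/v1", api_key=os.environ["GROQ_API_KEY"])

def extract_review(text: str, max_retries: int = 3) -> ProductReview:
    """Request a review object and validate it; retry (or raise) on schema failures."""
    last_error = None
    for _ in range(max_retries):
        completion = client.chat.completions.create(
            model="openai/gpt-oss-120b",
            messages=[
                {"role": "system", "content": "Extract product review information from the text."},
                {"role": "user", "content": text},
            ],
            response_format={
                "type": "json_schema",
                "json_schema": {"name": "product_review", "schema": ProductReview.model_json_schema()},
            },
        )
        try:
            return ProductReview.model_validate_json(completion.choices[0].message.content)
        except ValidationError as exc:
            last_error = exc  # log and retry
    raise RuntimeError(f"Model never produced a schema-valid review: {last_error}")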

Hello!
Thank you for your reply. I’m very glad to hear you’re working on improving decoding.

I used an intentionally exaggerated example. In real life it’s much worse: you pass a schema to the model and expect a response that follows it, but the model sometimes returns an invalid JSON object. As an application developer this can be a dead end — you end up adding complex extra models and logic, which increases latency and degrades the user experience.

As for how the model should behave, I think it’s straightforward: I expect the behavior described in OpenAI’s documentation:

https://platform.openai.com/docs/guides/structured-outputs

“Structured Outputs is the evolution of JSON mode. While both ensure valid JSON is produced, only Structured Outputs ensure schema adherence.”

If you run the example on OpenAI models, you get the expected behavior.

Example request:

{
  "model": "gpt-4.1-mini",
  "messages": [
    { "role": "system", "content": "Extract product review information from the text." },
    { "role": "user", "content": "what time is now?" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "product_review",
      "schema": {
        "type": "object",
        "properties": {
          "product_name": { "type": "string" },
          "rating": { "type": "number" },
          "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
          "key_features": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["product_name", "rating", "sentiment", "key_features"],
        "additionalProperties": false,
        "strict": true
      }
    }
  }
}

Result that matches my expectations and OpenAI’s docs:

{
    "id": "chatcmpl-CTDBk9vfXY1f6VTutnE7YM4NFGQ04",
    "object": "chat.completion",
    "created": 1761078244,
    "model": "gpt-4.1-mini-2025-04-14",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "{\"product_name\":\"\",\"rating\":0,\"sentiment\":\"neutral\",\"key_features\":[]}",
                "refusal": null,
                "annotations": []
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 92,
        "completion_tokens": 17,
        "total_tokens": 109,
        "prompt_tokens_details": {
            "cached_tokens": 0,
            "audio_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0,
            "audio_tokens": 0,
            "accepted_prediction_tokens": 0,
            "rejected_prediction_tokens": 0
        }
    },
    "service_tier": "default",
    "system_fingerprint": "fp_c064fdde7c"
}

I hope you can achieve the same behavior.

And thank you for all your work!


I’ve been playing around and experimenting a bit. I’ve found that by adding “If the user prompt is unanswerable, return an empty product object that follows the json schema” to the system prompt of the toy example, I get 100% adherence (across 40 tests).

I even switched to gpt-oss-20b without degradation.

Content is: "content": "{\"product_name\":\"\",\"rating\":0,\"sentiment\":\"neutral\",\"key_features\":[]}", which I think is expected.

Here’s the full JSON:

{
  "model": "openai/gpt-oss-20b",
  "messages": [
    { "role": "system", "content": "Extract product review information from the text. If the user prompt is unanswerable, return and empty product object that follows the json schema" },
    { "role": "user", "content": "what time is now?" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "product_review",
      "schema": {
        "type": "object",
        "properties": {
          "product_name": { "type": "string" },
          "rating": { "type": "number" },
          "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
          "key_features": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["product_name", "rating", "sentiment", "key_features"],
        "additionalProperties": false,
        "strict": true
      }
    }
  }
}
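
For reference, a small loop of the kind that could back an adherence check like this, assuming Python with the jsonschema package and the request body above saved as request.json (the file name and the 40-run count are just placeholders):

import json
import os

import requests
from jsonschema import ValidationError, validate

URL = "https://api.groq.com/openai/v1/chat/completions"  # assumed Groq endpoint

# The JSON body shown above, saved locally.
with open("request.json") as f:
    payload = json.load(f)
schema = payload["response_format"]["json_schema"]["schema"]

ok = 0
for _ in range(40):
    resp = requests.post(
        URL,
        headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
        json=payload,
    )
    content = resp.json()["choices"][0]["message"]["content"]
    try:
        validate(instance=json.loads(content), schema=schema)
        ok += 1
    except (json.JSONDecodeError, ValidationError) as exc:
        print(f"Schema violation: {exc}")
print(f"{ok}/40 responses matched the schema")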