Structured Outputs ignored by openai/gpt-oss-120b

Summary
response_format: { type: "json_schema", ... } (including strict: true) is ignored by openai/gpt-oss-120b. Instead of returning JSON that conforms to the schema, the model returns free-form text. This used to work, so it looks like a regression (possibly introduced around the caching changes).

Environment / Endpoint
Chat Completions API (/openai/v1/chat/completions), model openai/gpt-oss-120b.

Docs reference
Reproduced with the minimal example from your Structured Outputs docs: Structured Outputs - GroqDocs.

Steps to Reproduce

  1. Send the following request:
{
  "model": "openai/gpt-oss-120b",
  "messages": [
    { "role": "system", "content": "Extract product review information from the text." },
    { "role": "user", "content": "what time is now?" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "product_review",
      "schema": {
        "type": "object",
        "properties": {
          "product_name": { "type": "string" },
          "rating": { "type": "number" },
          "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
          "key_features": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["product_name", "rating", "sentiment", "key_features"],
        "additionalProperties": false,
        "strict": true
      }
    }
  }
}
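
For completeness, the same request can be sent with a minimal Python script (a sketch, assuming the standard Groq base URL https://api.groq.com/openai/v1 and a GROQ_API_KEY environment variable; the payload is identical to the JSON above):

import json
import os

import requests

# Assumed endpoint; matches the /openai/v1/chat/completions path mentioned above.
URL = "https://api.groq.com/openai/v1/chat/completions"

# Exact request body from the reproduction above.
payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "Extract product review information from the text."},
        {"role": "user", "content": "what time is now?"},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "product_review",
            "schema": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "rating": {"type": "number"},
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                    "key_features": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["product_name", "rating", "sentiment", "key_features"],
                "additionalProperties": False,
                "strict": True,
            },
        },
    },
}

resp = requests.post(
    URL,
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json=payload,
)
print(json.dumps(resp.json(), indent=2))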

Actual Result

{
  "id": "chatcmpl-88021918-680b-4483-9d22-3a8eda9cd7b4",
  "object": "chat.completion",
  "created": 1760997747,
  "model": "openai/gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I’m not able to access real-time information, so I can’t tell you the current time. You can check a clock, phone, or computer for the most accurate time.",
        "reasoning": "…"
      },
      "finish_reason": "stop"
    }
  ],
  "x_groq": { "id": "req_01k81ta988ffkv52194qcebypp" },
  "service_tier": "on_demand"
}

Expected Result
Either a valid JSON object that matches the schema, e.g.:

{
  "product_name": "",
  "rating": 0,
  "sentiment": "neutral",
  "key_features": []
}

— or an explicit error/refusal stating that the model cannot produce output matching the provided JSON Schema. In either case, there should be no free-form prose outside the JSON.

Notes

  • Reproduces 100% of the time with the minimal example from your Structured Outputs documentation.
  • response_format.json_schema appears to be ignored by openai/gpt-oss-120b (including strict: true and additionalProperties: false).
  • Presence of message.reasoning suggests a response mode incompatible with Structured Outputs.

Impact
Breaks downstream parsing and validation; machine-readable output cannot be relied upon.
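
To illustrate (not part of the original report): a typical downstream consumer assumes message.content is schema-valid JSON, so the free-form reply above fails at the very first parse:

import json

# Trimmed copy of the "Actual Result" above; content is prose, not JSON.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "I'm not able to access real-time information, so I can't tell you the current time.",
            }
        }
    ]
}

content = response["choices"][0]["message"]["content"]
try:
    review = json.loads(content)            # succeeds when Structured Outputs is honored
    print(review["sentiment"])
except json.JSONDecodeError as exc:
    print(f"Not machine-readable: {exc}")   # what actually happens with the prose reply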

Request
Could you please take a look and let me know if this is a known issue? I’d really appreciate any guidance or a suggested workaround—e.g., a model that reliably enforces JSON Schema or a parameter/flag that ensures response_format is honored. I’m happy to share additional logs, run targeted tests on my side, or help validate a fix. Thanks a lot for your help!


Thank you for the reproduction. This is a problem in our harness, and I’ll ask the team to take a look.

We’ll soon be rolling out constrained decoding, which will fix some of these issues. But even with constrained decoding in place, a rogue prompt like “what time is it” may still get a rogue answer if the model doesn’t know how to follow the prompt.

I’m curious though, what do you think the expected output should be here, an empty object? Or should the LLM make something up that fits the schema and try to answer the original system prompt and the question?

If I were building out this app for myself, instead of waiting for our constrained decoding to roll out, I’d probably add a small “router” model that discards prompts that don’t fit the use case (e.g. asking for the time when the task is product review extraction), and then another harness at the end that checks whether the output matches the schema (with zod/pydantic) and, if not, retries or logs and throws an error. For that to work, though, I’d need an expected result to check against, and I’m not really sure what that is in this case.
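
A rough sketch of the schema-checking half of that harness, assuming Python with pydantic v2 and the OpenAI-compatible SDK pointed at Groq (the ProductReview class, retry count, and extract_review helper are illustrative, not an existing API):

import os
from typing import List, Literal

from openai import OpenAI
from pydantic import BaseModel, ValidationError

# Mirrors the product_review JSON schema from the original report.
class ProductReview(BaseModel):
    product_name: str
    rating: float
    sentiment: Literal["positive", "negative", "neutral"]
    key_features: List[str]

client = OpenAI(base_url="https://api.groq.com/openai/v1", api_key=os.environ["GROQ_API_KEY"])

def extract_review(text: str, max_retries: int = 3) -> ProductReview:
    """Request a review object and validate it; retry (or raise) on schema failures."""
    last_error = None
    for _ in range(max_retries):
        completion = client.chat.completions.create(
            model="openai/gpt-oss-120b",
            messages=[
                {"role": "system", "content": "Extract product review information from the text."},
                {"role": "user", "content": text},
            ],
            response_format={
                "type": "json_schema",
                "json_schema": {"name": "product_review", "schema": ProductReview.model_json_schema()},
            },
        )
        try:
            return ProductReview.model_validate_json(completion.choices[0].message.content)
        except ValidationError as exc:
            last_error = exc  # log and retry
    raise RuntimeError(f"Model never produced a schema-valid review: {last_error}")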

Hello!
Thank you for your reply. I’m very glad to hear you’re working on improving decoding.

I used an intentionally exaggerated example. In real life it’s much worse: you pass a schema to the model and expect a response that follows it, but the model sometimes returns an invalid JSON object. As an application developer this can be a dead end — you end up adding complex extra models and logic, which increases latency and degrades the user experience.

As for how the model should behave, I think it’s straightforward: I expect the behavior described in OpenAI’s documentation:

https://platform.openai.com/docs/guides/structured-outputs

“Structured Outputs is the evolution of JSON mode. While both ensure valid JSON is produced, only Structured Outputs ensure schema adherence.”

If you run the example on OpenAI models, you get the expected behavior.

Example request:

{
  "model": "gpt-4.1-mini",
  "messages": [
    { "role": "system", "content": "Extract product review information from the text." },
    { "role": "user", "content": "what time is now?" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "product_review",
      "schema": {
        "type": "object",
        "properties": {
          "product_name": { "type": "string" },
          "rating": { "type": "number" },
          "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
          "key_features": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["product_name", "rating", "sentiment", "key_features"],
        "additionalProperties": false,
        "strict": true
      }
    }
  }
}

Result that matches my expectations and OpenAI’s docs:

{
    "id": "chatcmpl-CTDBk9vfXY1f6VTutnE7YM4NFGQ04",
    "object": "chat.completion",
    "created": 1761078244,
    "model": "gpt-4.1-mini-2025-04-14",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "{\"product_name\":\"\",\"rating\":0,\"sentiment\":\"neutral\",\"key_features\":[]}",
                "refusal": null,
                "annotations": []
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 92,
        "completion_tokens": 17,
        "total_tokens": 109,
        "prompt_tokens_details": {
            "cached_tokens": 0,
            "audio_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0,
            "audio_tokens": 0,
            "accepted_prediction_tokens": 0,
            "rejected_prediction_tokens": 0
        }
    },
    "service_tier": "default",
    "system_fingerprint": "fp_c064fdde7c"
}

I hope you can achieve the same behavior.

And thank you for all your work!


I’ve been playing around and experimenting a bit. I’ve found that by adding “If the user prompt is unanswerable, return an empty product object that follows the json schema” to the system prompt of the toy example, I get 100% adherence (across 40 tests).

I even switched to gpt-oss-20b without degradation.

Content is: "content": "{\"product_name\":\"\",\"rating\":0,\"sentiment\":\"neutral\",\"key_features\":[]}", which I think is expected.

Here’s the full JSON:

{
  "model": "openai/gpt-oss-20b",
  "messages": [
    { "role": "system", "content": "Extract product review information from the text. If the user prompt is unanswerable, return and empty product object that follows the json schema" },
    { "role": "user", "content": "what time is now?" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "product_review",
      "schema": {
        "type": "object",
        "properties": {
          "product_name": { "type": "string" },
          "rating": { "type": "number" },
          "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
          "key_features": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["product_name", "rating", "sentiment", "key_features"],
        "additionalProperties": false,
        "strict": true
      }
    }
  }
}
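
For reference, a small loop of the kind that could back an adherence check like this, assuming Python with the jsonschema package and the request body above saved as request.json (the file name and the 40-run count are just placeholders):

import json
import os

import requests
from jsonschema import ValidationError, validate

URL = "https://api.groq.com/openai/v1/chat/completions"  # assumed Groq endpoint

# The JSON body shown above, saved locally.
with open("request.json") as f:
    payload = json.load(f)
schema = payload["response_format"]["json_schema"]["schema"]

ok = 0
for _ in range(40):
    resp = requests.post(
        URL,
        headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
        json=payload,
    )
    content = resp.json()["choices"][0]["message"]["content"]
    try:
        validate(instance=json.loads(content), schema=schema)
        ok += 1
    except (json.JSONDecodeError, ValidationError) as exc:
        print(f"Schema violation: {exc}")
print(f"{ok}/40 responses matched the schema")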