Guaranteed structured output is not working (bug report)

The documentation for Structured Outputs claims that the models gpt-oss-120b and gpt-oss-20b support guaranteed structured output using constrained decoding.

I am having reliability issues with structured output in my application: around 10% of requests fail with error 400 (json_validate_failed).

I created a simple script to test the Groq API. The example is taken from Groq's Structured Outputs documentation; the only change I made was to intentionally rewrite the prompt to mislead the LLM into generating different JSON than the one required by response_format.

Script:

test-groq-strict.mjs:

import "dotenv/config";
import Groq from "groq-sdk";

const groq = new Groq();

const response = await groq.chat.completions.create({
  model: "openai/gpt-oss-20b",
  messages: [
    {
      role: "system",
      content:
        'You are a weather forecasting API. You MUST respond with ONLY this JSON format, no exceptions: {"forecast": [{"day": "Monday", "temp_celsius": 22, "conditions": "sunny"}]}.',
    },
    {
      role: "user",
      content:
        "What's the weather forecast for New York this week? Give me all 7 days.",
    },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "product_review",
      strict: true,
      schema: {
        type: "object",
        properties: {
          product_name: { type: "string" },
          rating: { type: "number" },
          sentiment: {
            type: "string",
            enum: ["positive", "negative", "neutral"],
          },
          key_features: {
            type: "array",
            items: { type: "string" },
          },
        },
        required: ["product_name", "rating", "sentiment", "key_features"],
        additionalProperties: false,
      },
    },
  },
});

// With a strict json_schema response_format, the content should always
// parse into the schema; instead, the create() call above throws a 400.
const result = JSON.parse(response.choices[0].message.content || "{}");
console.log(result);

To reproduce:

  • Create a .env file next to the script with GROQ_API_KEY
  • Run npm install dotenv groq-sdk
  • Run node test-groq-strict.mjs
  • Observe that the script fails with: BadRequestError: 400 {"error":{"message":"Generated JSON does not match the expected schema. Please adjust your prompt. See 'failed_generation' for more details. Error: jsonschema: '' does not validate with /required: missing properties: 'product_name', 'rating', 'sentiment', 'key_features'","type":"invalid_request_error","code":"json_validate_failed","failed_generation":"{\"forecast\":[{\"day\":\"Monday\",\"temp_celsius\":22,\"conditions\":\"sunny\"}]}"}}

Actual result: Error 400

Expected result: The model fits its response into the defined JSON structure
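
When the 400 occurs, the failed_generation field in the error body shows exactly what the model produced. Here is a small helper for pulling it out of the parsed error body; the shape { error: { code, failed_generation } } is an assumption taken from the payload shown above, not a documented API, so adjust if your SDK surfaces the body differently:

```javascript
// Extract the model's raw output from a json_validate_failed 400 body.
// The body shape is assumed from the error payload in the reproduction.
function getFailedGeneration(errorBody) {
  const err = errorBody?.error;
  if (err?.code !== "json_validate_failed") return null;
  return err.failed_generation ?? null;
}

// The body from the reproduction above:
const body = {
  error: {
    code: "json_validate_failed",
    failed_generation:
      '{"forecast":[{"day":"Monday","temp_celsius":22,"conditions":"sunny"}]}',
  },
};

// The model followed the misleading prompt, not the schema.
console.log(JSON.parse(getFailedGeneration(body)));
```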

I tried the same thing against the OpenAI API: exactly the same code, except I replaced the groq-sdk library with openai and the openai/gpt-oss-20b model with gpt-4.1-mini. With this setup, the expected result is achieved with a 100% success rate.

This strongly suggests that constrained decoding is not actually being applied on the Groq side.


In my real application the prompts do match the schema; this example is deliberately extreme so that the error reproduces reliably. In practice, the failure rate is between 1% and 10%, depending on task difficulty and schema complexity.
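
Until this is fixed server-side, a possible mitigation at a 1-10% failure rate is to retry requests that fail with json_validate_failed. A minimal sketch; reading the code from err.error.code is an assumption about how the SDK attaches the parsed 400 body to the thrown error:

```javascript
// Retry a request when it fails with json_validate_failed; rethrow
// anything else. The err.error.code path is an assumption about how
// the SDK exposes the parsed 400 body on the thrown error.
async function withSchemaRetry(makeRequest, maxAttempts = 3) {
  let lastErr;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await makeRequest();
    } catch (err) {
      if (err?.error?.code !== "json_validate_failed") throw err;
      lastErr = err;
    }
  }
  throw lastErr;
}
```

Usage would be withSchemaRetry(() => groq.chat.completions.create({ ... })). This only papers over the bug; with working constrained decoding no retries should be needed.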