I’m using the Groq Responses API with GPT-OSS models and noticed that reasoning tokens are not being counted in the usage statistics, even when reasoning is enabled and working correctly.
Environment:
- API: Responses API (beta)
- Models tested: openai/gpt-oss-20b and openai/gpt-oss-120b
- Endpoint: https://api.groq.com/openai/v1/responses
Issue:
When making requests with reasoning enabled (reasoning: { "effort": "high" }), the API:
- Successfully generates reasoning content (visible in the output array with type: "reasoning")
- Counts total tokens in usage.input_tokens and usage.output_tokens
- Always returns 0 for usage.input_tokens_details.reasoning_tokens and usage.output_tokens_details.reasoning_tokens
Example Request:
{
  "model": "openai/gpt-oss-120b",
  "input": "What is 2 + 2? Think step by step.",
  "instructions": "You are a helpful assistant.",
  "reasoning": { "effort": "high" },
  "temperature": 0.7
}
Actual Response (abbreviated):
{
  "output": [
    {
      "type": "reasoning",
      "content": [{ "type": "reasoning_text", "text": "Let me calculate…" }]
    }
  ],
  "usage": {
    "input_tokens": 599,
    "output_tokens": 40,
    "total_tokens": 639,
    "input_tokens_details": {
      "cached_tokens": 0,
      "reasoning_tokens": 0   // ← should be non-zero
    },
    "output_tokens_details": {
      "cached_tokens": 0,
      "reasoning_tokens": 0   // ← should be non-zero
    }
  }
}
Expected Behavior:
The reasoning_tokens fields should report the actual number of tokens consumed by reasoning, separately from the regular response tokens.
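Until those counts are populated, I'm estimating them client-side. A rough sketch, assuming tiktoken's o200k_base encoding is a close-enough proxy for the model's actual tokenizer (this is only an approximation, not the real accounting):

# Rough client-side estimate of reasoning tokens, summed over the
# reasoning items in the output array. o200k_base is an assumption,
# not necessarily the exact tokenizer used by the gpt-oss models.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

def estimate_reasoning_tokens(response_json: dict) -> int:
    total = 0
    for item in response_json.get("output", []):
        if item.get("type") == "reasoning":
            for part in item.get("content", []):
                if part.get("type") == "reasoning_text":
                    total += len(enc.encode(part.get("text", "")))
    return total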
Questions:
- Is this a known limitation of the current beta implementation?
- Are there plans to implement proper reasoning token counting?
- Is there anything I need to configure differently to get reasoning tokens counted?
Thank you for your help!