Groq API Bug Report: Missing finish_reason in Streaming Responses
API: https://api.groq.com/openai/v1/chat/completions
Date: November 19, 2025
Severity: High (breaks OpenAI compatibility)
Status: Confirmed by SDK testing
Summary
The Groq API does not send finish_reason in streaming responses for certain models, violating OpenAI’s streaming specification and breaking OpenAI-compatible clients.
Confirmed by SDK Investigation
The Groq Go SDK team has confirmed this is an API issue, not an SDK bug:
The SDK correctly handles finish_reason when the API sends it
A test was added proving the SDK works correctly
The Python SDK shows the same behavior (the API doesn't send finish_reason)
Diagnostic tools confirm API behavior
Reference: GitHub - ZaguanLabs/groq-go: Unofficial Groq SDK in Go (see FINISH_REASON_ANALYSIS.md)
Affected Models
Confirmed affected models:
moonshotai/kimi-k2-instruct-0905
qwen/qwen3-32b (Qwen-Code)
Likely affects other models as well.
OpenAI Specification Requirement
According to OpenAI’s streaming specification, the final chunk must contain:
{
"id": "chatcmpl-xxx",
"object": "chat.completion.chunk",
"created": 1234567890,
"model": "model-name",
"choices": [{
"index": 0,
"delta": {},
"finish_reason": "stop" // ← REQUIRED
}]
}
Followed by:
data: [DONE]
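The spec requirement above can be checked mechanically. The following is a minimal Go sketch that decodes a chunk and reports whether it carries the required finish_reason; the struct types are illustrative for this report, not the SDK's actual types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Choice and Chunk mirror the minimal chunk shape shown above
// (illustrative types, not the Groq SDK's).
type Choice struct {
	Index        int             `json:"index"`
	Delta        json.RawMessage `json:"delta"`
	FinishReason string          `json:"finish_reason"`
}

type Chunk struct {
	ID      string   `json:"id"`
	Object  string   `json:"object"`
	Model   string   `json:"model"`
	Choices []Choice `json:"choices"`
}

// hasFinishReason reports whether a decoded chunk carries a non-empty
// finish_reason on any choice, as the spec requires of the final chunk.
func hasFinishReason(data []byte) bool {
	var c Chunk
	if err := json.Unmarshal(data, &c); err != nil {
		return false
	}
	for _, ch := range c.Choices {
		if ch.FinishReason != "" {
			return true
		}
	}
	return false
}

func main() {
	final := []byte(`{"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}`)
	fmt.Println(hasFinishReason(final)) // true for a spec-compliant final chunk
}
```

A client can run every incoming chunk through a check like this; a compliant stream must yield true for exactly one chunk before [DONE].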
Actual Groq API Behavior
The Groq API sends:
// Last content chunk
{
"id": "chatcmpl-xxx",
"model": "moonshotai/kimi-k2-instruct-0905",
"choices": [{
"index": 0,
"delta": {"content": "final word"},
"finish_reason": "" // ← Empty
}]
}
// Empty chunk (no finish_reason)
{
"id": "chatcmpl-xxx",
"model": "",
"choices": [] // ← No choices, no finish_reason
}
data: [DONE]
Missing: A chunk with finish_reason: "stop" before [DONE]
Production Evidence
Raw Logs from Zaguan CoreX
{"time":"2025-11-19T16:53:36.135620192+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"moonshotai/kimi-k2-instruct-0905","choices_count":1,"has_usage":false}
{"time":"2025-11-19T16:53:36.142579114+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"moonshotai/kimi-k2-instruct-0905","choices_count":1,"has_usage":false}
{"time":"2025-11-19T16:53:36.149212213+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"moonshotai/kimi-k2-instruct-0905","choices_count":1,"has_usage":false}
{"time":"2025-11-19T16:53:36.370208389+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"","choices_count":0,"has_usage":false}
{"time":"2025-11-19T16:53:36.370226223+01:00","level":"DEBUG","msg":"Skipping empty chunk (no choices, no usage)"}
{"time":"2025-11-19T16:53:36.371322536+01:00","level":"DEBUG","msg":"Skipping nil chunk from Groq SDK stream"}
{"time":"2025-11-19T16:53:36.371336872+01:00","level":"DEBUG","msg":"Stream ended without finish reason, sending final chunk"}
Note: No chunk with finish_reason ever received from API.
Impact
Broken Clients
OpenWebUI: “API Error: Model stream ended without a finish reason”
Langchain: May fail validation
LlamaIndex: May fail validation
Any OpenAI-compatible client: Expects finish_reason
Workarounds Required
All OpenAI-compatible proxies must implement workarounds:
// Zaguan CoreX workaround (v0.37.0-beta7)
sawFinishReason := false
for {
    chunk, err := stream.Next(ctx)
    if err != nil {
        if errors.Is(err, io.EOF) && !sawFinishReason {
            // Synthesize finish_reason since the API didn't send it
            outputChan <- ChatResponseChunk{
                Model: modelName,
                Choices: []Choice{{
                    Index:        0,
                    FinishReason: "stop",
                    Delta:        ChatMessage{},
                }},
            }
        }
        return // stop on EOF or any other stream error
    }
    for _, choice := range chunk.Choices {
        if choice.FinishReason != "" {
            sawFinishReason = true
        }
    }
}
Reproduction
cURL Test
curl -X POST https://api.groq.com/openai/v1/chat/completions \
-H "Authorization: Bearer $GROQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "moonshotai/kimi-k2-instruct-0905",
"messages": [{"role": "user", "content": "Say hello"}],
"stream": true
}' \
--no-buffer
Expected: Final chunk with finish_reason: "stop" before [DONE]
Actual: Empty chunk, then [DONE], no finish_reason
Go SDK Test
cd groq-go/groq/examples/debug_finish_reason
go run main.go
Output: “Stream ended without finish_reason”
Comparison with OpenAI API
OpenAI Behavior (Correct)
data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":""}]}
data: {"choices":[{"delta":{"content":"!"},"finish_reason":""}]}
data: {"choices":[{"delta":{},"finish_reason":"stop"}]} ← finish_reason sent
data: [DONE]
Groq Behavior (Incorrect)
data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":""}]}
data: {"choices":[{"delta":{"content":"!"},"finish_reason":""}]}
data: {"model":"","choices":[]} ← No finish_reason
data: [DONE]
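The difference between the two transcripts above can be classified automatically. The following is a rough Go sketch (a hypothetical helper, not part of any SDK, using simple substring matching rather than full JSON parsing) that scans SSE data lines and reports whether a finish_reason arrived before the [DONE] sentinel:

```go
package main

import (
	"fmt"
	"strings"
)

// streamIsCompliant scans SSE "data:" lines and reports whether any chunk
// carried a non-empty finish_reason before the [DONE] sentinel.
func streamIsCompliant(lines []string) bool {
	terminalReasons := []string{"stop", "length", "tool_calls", "content_filter"}
	for _, line := range lines {
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			return false // reached the end without seeing a finish_reason
		}
		for _, r := range terminalReasons {
			if strings.Contains(payload, `"finish_reason":"`+r+`"`) {
				return true
			}
		}
	}
	return false
}

func main() {
	openai := []string{
		`data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":""}]}`,
		`data: {"choices":[{"delta":{},"finish_reason":"stop"}]}`,
		`data: [DONE]`,
	}
	groq := []string{
		`data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":""}]}`,
		`data: {"model":"","choices":[]}`,
		`data: [DONE]`,
	}
	fmt.Println(streamIsCompliant(openai), streamIsCompliant(groq)) // true false
}
```

Run against the two transcripts above, this classifies the OpenAI stream as compliant and the Groq stream as non-compliant.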
Expected Fix
The Groq API should send a final chunk before [DONE]:
{
"id": "chatcmpl-xxx",
"object": "chat.completion.chunk",
"created": 1234567890,
"model": "moonshotai/kimi-k2-instruct-0905",
"choices": [{
"index": 0,
"delta": {},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 20,
"total_tokens": 30
}
}
This matches OpenAI’s behavior and satisfies the specification.
Finish Reason Values
According to OpenAI spec, finish_reason can be:
"stop"- Natural completion"length"- Max tokens reached"content_filter"- Content filtered"tool_calls"- Function/tool call made
The API should send the appropriate value based on how the generation ended.
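For clients that need to distinguish a genuinely terminal chunk from Groq's current empty-string value, the mapping above can be captured in a small helper. A sketch (the descriptions and function name are illustrative, not from any spec or SDK):

```go
package main

import "fmt"

// validFinishReasons enumerates the terminal values the OpenAI spec defines.
var validFinishReasons = map[string]string{
	"stop":           "natural completion",
	"length":         "max tokens reached",
	"content_filter": "content filtered",
	"tool_calls":     "function/tool call made",
}

// describeFinishReason returns a human-readable description, flagging the
// empty string (what Groq currently sends) as non-terminal.
func describeFinishReason(r string) string {
	if r == "" {
		return "non-terminal chunk (no finish reason)"
	}
	if desc, ok := validFinishReasons[r]; ok {
		return desc
	}
	return "unknown finish reason: " + r
}

func main() {
	fmt.Println(describeFinishReason("stop"))
	fmt.Println(describeFinishReason(""))
}
```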
Testing Checklist
To verify the fix:
- Test with moonshotai/kimi-k2-instruct-0905
- Test with qwen/qwen3-32b
- Test with llama-3.3-70b-versatile
- Test with function calling (finish_reason: "tool_calls")
- Test with a max_tokens limit (finish_reason: "length")
- Verify finish_reason arrives BEFORE [DONE]
- Verify usage data is included in the final chunk
- Test with both the Go and Python SDKs
Related Documentation
- OpenAI Streaming Spec: https://platform.openai.com/docs/api-reference/streaming
- Groq SDK Analysis: groq-go/FINISH_REASON_ANALYSIS.md at main · ZaguanLabs/groq-go · GitHub
- Zaguan CoreX Workaround: v0.37.0-beta7
Priority
Critical: This breaks OpenAI compatibility and requires workarounds in all client implementations.
Proposed Timeline
- Immediate: Acknowledge issue
- Short-term: Fix API to send finish_reason for affected models
- Long-term: Ensure all models send finish_reason correctly
Contact
Reported by: Zaguan CoreX team
SDK Confirmation: ZaguanLabs/groq-go maintainers
Production Impact: Affecting all OpenAI-compatible clients
Appendix: SDK Team Response
From the Groq Go SDK team investigation (November 19, 2025):
Conclusion: This is an API Issue, Not an SDK Bug
After thorough investigation, we’ve determined:
- The SDK correctly handles finish_reason when the API sends it
- The Groq API is not sending finish_reason for certain models
- The Python SDK has the same behavior
Why the SDK Should NOT Synthesize finish_reason
- Transparency: The SDK should return exactly what the API sends
- Debugging: Users need to know the actual API behavior
- Correctness: The SDK can’t know what finish_reason should be
We agree with this assessment. The fix must be at the API level.
Thank you for your attention to OpenAI compatibility! This fix will benefit all Groq API users.