Groq API Bug Report: Missing finish_reason in Streaming Responses

Zaguan_AI · November 19, 2025, 7:01pm

Groq API Bug Report: Missing finish_reason in Streaming Responses

API: https://api.groq.com/openai/v1/chat/completions
Date: November 19, 2025
Severity: High (breaks OpenAI compatibility)
Status: Confirmed by SDK testing

Summary

The Groq API does not send finish_reason in streaming responses for certain models, violating OpenAI’s streaming specification and breaking OpenAI-compatible clients.

Confirmed by SDK Investigation

The Groq Go SDK team has confirmed this is an API issue, not an SDK bug:

SDK correctly handles finish_reason when API sends it
Test added proving SDK works correctly
Python SDK shows same behavior (API doesn’t send finish_reason)
Diagnostic tools confirm API behavior

Reference: GitHub - ZaguanLabs/groq-go: Unofficial Groq SDK in Go (see FINISH_REASON_ANALYSIS.md)

Affected Models

Confirmed affected models:

moonshotai/kimi-k2-instruct-0905
qwen/qwen3-32b (Qwen-Code)

Likely affects other models as well.

OpenAI Specification Requirement

According to OpenAI’s streaming specification, the final chunk must contain:

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion.chunk",
  "created": 1234567890,
  "model": "model-name",
  "choices": [{
    "index": 0,
    "delta": {},
    "finish_reason": "stop"  // ← REQUIRED
  }]
}

Followed by:

data: [DONE]

Actual Groq API Behavior

The Groq API sends:

// Last content chunk
{
  "id": "chatcmpl-xxx",
  "model": "moonshotai/kimi-k2-instruct-0905",
  "choices": [{
    "index": 0,
    "delta": {"content": "final word"},
    "finish_reason": ""  // ← Empty
  }]
}

// Empty chunk (no finish_reason)
{
  "id": "chatcmpl-xxx",
  "model": "",
  "choices": []  // ← No choices, no finish_reason
}

data: [DONE]

Missing: A chunk with finish_reason: "stop" before [DONE]

Production Evidence

Raw Logs from Zaguan CoreX

{"time":"2025-11-19T16:53:36.135620192+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"moonshotai/kimi-k2-instruct-0905","choices_count":1,"has_usage":false}
{"time":"2025-11-19T16:53:36.142579114+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"moonshotai/kimi-k2-instruct-0905","choices_count":1,"has_usage":false}
{"time":"2025-11-19T16:53:36.149212213+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"moonshotai/kimi-k2-instruct-0905","choices_count":1,"has_usage":false}
{"time":"2025-11-19T16:53:36.370208389+01:00","level":"DEBUG","msg":"Groq SDK stream chunk received","model":"","choices_count":0,"has_usage":false}
{"time":"2025-11-19T16:53:36.370226223+01:00","level":"DEBUG","msg":"Skipping empty chunk (no choices, no usage)"}
{"time":"2025-11-19T16:53:36.371322536+01:00","level":"DEBUG","msg":"Skipping nil chunk from Groq SDK stream"}
{"time":"2025-11-19T16:53:36.371336872+01:00","level":"DEBUG","msg":"Stream ended without finish reason, sending final chunk"}

Note: No chunk with finish_reason ever received from API.

Impact

Broken Clients

OpenWebUI: “API Error: Model stream ended without a finish reason”
Langchain: May fail validation
LlamaIndex: May fail validation
Any OpenAI-compatible client: Expects finish_reason

Workarounds Required

All OpenAI-compatible proxies must implement workarounds:

// Zaguan CoreX workaround (v0.37.0-beta7)
sawFinishReason := false
for {
    chunk, err := stream.Next(ctx)
    if err != nil {
        if errors.Is(err, io.EOF) {
            if !sawFinishReason {
                // Synthesize finish_reason since API didn't send it
                outputChan <- ChatResponseChunk{
                    Model: modelName,
                    Choices: []Choice{{
                        Index:        0,
                        FinishReason: "stop",
                        Delta:        ChatMessage{},
                    }},
                }
            }
            return
        }
    }
    
    for _, choice := range chunk.Choices {
        if choice.FinishReason != "" {
            sawFinishReason = true
        }
    }
}

Reproduction

cURL Test

curl -X POST https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshotai/kimi-k2-instruct-0905",
    "messages": [{"role": "user", "content": "Say hello"}],
    "stream": true
  }' \
  --no-buffer

Expected: Final chunk with finish_reason: "stop" before [DONE]
Actual: Empty chunk, then [DONE], no finish_reason

Go SDK Test

cd groq-go/groq/examples/debug_finish_reason
go run main.go

Output: “Stream ended without finish_reason”

Comparison with OpenAI API

OpenAI Behavior (Correct)

data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":""}]}

data: {"choices":[{"delta":{"content":"!"},"finish_reason":""}]}

data: {"choices":[{"delta":{},"finish_reason":"stop"}]}  ← finish_reason sent

data: [DONE]

Groq Behavior (Incorrect)

data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":""}]}

data: {"choices":[{"delta":{"content":"!"},"finish_reason":""}]}

data: {"model":"","choices":[]}  ← No finish_reason

data: [DONE]

Expected Fix

The Groq API should send a final chunk before [DONE]:

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion.chunk",
  "created": 1234567890,
  "model": "moonshotai/kimi-k2-instruct-0905",
  "choices": [{
    "index": 0,
    "delta": {},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}

This matches OpenAI’s behavior and satisfies the specification.

Finish Reason Values

According to OpenAI spec, finish_reason can be:

"stop" - Natural completion
"length" - Max tokens reached
"content_filter" - Content filtered
"tool_calls" - Function/tool call made

The API should send the appropriate value based on how the generation ended.

Testing Checklist

To verify the fix:

Test with moonshotai/kimi-k2-instruct-0905
Test with qwen/qwen3-32b
Test with llama-3.3-70b-versatile
Test with function calling (finish_reason: “tool_calls”)
Test with max_tokens limit (finish_reason: “length”)
Verify finish_reason comes BEFORE [DONE]
Verify usage data is included in final chunk
Test with both Go and Python SDKs

Priority

Critical: This breaks OpenAI compatibility and requires workarounds in all client implementations.

Proposed Timeline

Immediate: Acknowledge issue
Short-term: Fix API to send finish_reason for affected models
Long-term: Ensure all models send finish_reason correctly

Contact

Reported by: Zaguan CoreX team
SDK Confirmation: ZaguanLabs/groq-go maintainers
Production Impact: Affecting all OpenAI-compatible clients

Appendix: SDK Team Response

From the Groq Go SDK team investigation (November 19, 2025):

Conclusion: This is an API Issue, Not an SDK Bug

After thorough investigation, we’ve determined:

The SDK correctly handles finish_reason when the API sends it

The Groq API is not sending finish_reason for certain models

The Python SDK has the same behavior

Why the SDK Should NOT Synthesize finish_reason

Transparency: The SDK should return exactly what the API sends

Debugging: Users need to know the actual API behavior

Correctness: The SDK can’t know what finish_reason should be

We agree with this assessment. The fix must be at the API level.

Thank you for your attention to OpenAI compatibility! This fix will benefit all Groq API users.

yawnxyz · November 19, 2025, 7:05pm

Hi! Thank you for reporting this. I’m trying to repro your curl

curl --request POST \
    --url https://api.groq.com/openai/v1/chat/completions \
    --header 'authorization: Bearer MYKEY' \
    --header 'content-type: application/json' \
    --data '{
    "messages": [
        {
            "role": "user",
            "content": "say hello"
        }
    ],
    "model": "moonshotai/kimi-k2-instruct-0905",
    "stream": true
}'

and I consistently get


data: {"id":"chatcmpl-50b2f9bb-1e48-4d78-8837-409f01270736","object":"chat.completion.chunk","created":1763578973,"model":"moonshotai/kimi-k2-instruct-0905","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-50b2f9bb-1e48-4d78-8837-409f01270736","object":"chat.completion.chunk","created":1763578973,"model":"moonshotai/kimi-k2-instruct-0905","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}],"x_groq":{"id":"req_01kaeqz36fekwa2fzrthrerz2x","usage":{"queue_time":0.02321992,"prompt_tokens":28,"prompt_time":0.009536857,"completion_tokens":3,"completion_time":0.006776633,"total_tokens":31,"total_time":0.01631349}},"usage":{"queue_time":0.02321992,"prompt_tokens":28,"prompt_time":0.009536857,"completion_tokens":3,"completion_time":0.006776633,"total_tokens":31,"total_time":0.01631349}}

where I DO seem to get "finish_reason":"stop"

could you please run your suite again and test? I’m not sure why you didn’t get finish_reason, but I’m having a hard time reproducing

Zaguan_AI · November 19, 2025, 7:15pm

I’m using Qwen-Code with Kimi K2 and when it tries to call a tool it just stops when connecting through Zaguan.

I’ve also tried to connect directly to api.groq.com and am receiving an error there as well.

yawnxyz · November 19, 2025, 7:31pm

are you running the same curl as I did, but you’re getting no finish_reason still? can you run my curl now and paste the output?
(I’m trying to cut out SDKs so I can get to the root cause)

Zaguan_AI · November 19, 2025, 7:40pm

yawnxyz:

curl --request POST \
    --url https://api.groq.com/openai/v1/chat/completions \
    --header 'authorization: Bearer MYKEY' \
    --header 'content-type: application/json' \
    --data '{
    "messages": [
        {
            "role": "user",
            "content": "say hello"
        }
    ],
    "model": "moonshotai/kimi-k2-instruct-0905",
    "stream": true
}'

The issue is with streams and with several tool calls in a row. A single call works great and I’ve now tested both your curl command and a separate culr command with a tool call.
The responses are equal in both instances.

This does not, however, tell us why Qwen-Code just stops mid-call.

Zaguan_AI · November 19, 2025, 7:43pm

% curl --request POST \                                                                                                                               25-11-19 - 20:33:36
    --url https://api.groq.com/openai/v1/chat/completions \
    --header "authorization: Bearer $GROQ_API_KEY" \
    --header 'content-type: application/json' \
    --data '{
    "messages": [
        {
            "role": "user",
            "content": "say hello"
        }
    ],
    "model": "moonshotai/kimi-k2-instruct-0905",
    "stream": true
}'
data: {"id":"chatcmpl-542678e8-1eee-4fea-8726-bd1b64d53555","object":"chat.completion.chunk","created":1763580827,"model":"moonshotai/kimi-k2-instruct-0905","system_fingerprint":"fp_3312304636","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"x_groq":{"id":"req_01kaesqnjqfpz8fec1mgf75etq","seed":24420746}}

data: {"id":"chatcmpl-542678e8-1eee-4fea-8726-bd1b64d53555","object":"chat.completion.chunk","created":1763580827,"model":"moonshotai/kimi-k2-instruct-0905","system_fingerprint":"fp_3312304636","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-542678e8-1eee-4fea-8726-bd1b64d53555","object":"chat.completion.chunk","created":1763580827,"model":"moonshotai/kimi-k2-instruct-0905","system_fingerprint":"fp_3312304636","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-542678e8-1eee-4fea-8726-bd1b64d53555","object":"chat.completion.chunk","created":1763580827,"model":"moonshotai/kimi-k2-instruct-0905","system_fingerprint":"fp_3312304636","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}],"x_groq":{"id":"req_01kaesqnjqfpz8fec1mgf75etq","usage":{"queue_time":0.162589862,"prompt_tokens":28,"prompt_time":0.009505263,"completion_tokens":3,"completion_time":2.7e-7,"total_tokens":31,"total_time":0.009505533}},"usage":{"queue_time":0.162589862,"prompt_tokens":28,"prompt_time":0.009505263,"completion_tokens":3,"completion_time":2.7e-7,"total_tokens":31,"total_time":0.009505533}}

data: [DONE]

yawnxyz · November 21, 2025, 1:16am

Your example still displays finish_reason: stop though — are you able to give me a reproducible example I can use to find the root cause?

Topic		Replies	Views
Groq Python SDK Support for Responses API Feature Requests	4	68	August 12, 2025
Bug] GPT-OSS-120B: Reasoning tokens and gibberish output appearing in responses despite configuration to hide reasoning Forum	9	193	November 20, 2025
Gpt-oss-120b ignoring tools Forum	47	1904	December 7, 2025
GPT-oss-120b Reasoning Tokens Not Counted in Responses API Usage Statistics Forum	2	157	September 15, 2025
OpenAI GPT-OSS Models doesn't support structure output Feature Requests	7	601	November 10, 2025

Groq API Bug Report: Missing finish_reason in Streaming Responses

Groq API Bug Report: Missing finish_reason in Streaming Responses

Summary

Confirmed by SDK Investigation

Affected Models

OpenAI Specification Requirement

Actual Groq API Behavior

Production Evidence

Raw Logs from Zaguan CoreX

Impact

Broken Clients

Workarounds Required

Reproduction

cURL Test

Go SDK Test

Comparison with OpenAI API

OpenAI Behavior (Correct)

Groq Behavior (Incorrect)

Expected Fix

Finish Reason Values

Testing Checklist

Related Documentation

Priority

Proposed Timeline

Contact

Appendix: SDK Team Response

Related topics