Groq latency fluctuates between 300ms and 20s

Groq is pretty unstable sometimes. Is there a way to fix this? My latency is sometimes five seconds and sometimes 400 milliseconds. It's really not stable enough for me to push into production.

Please don't get me wrong, I love Groq. When Groq is behaving well, I'm seriously impressed by the speed. It's just not stable or robust enough. I would love to find a way to mitigate this.

Hi @jetsonearth, I’ll trace your requests and see what’s going on. Could you please send me:

  • the request IDs
  • the models you used
  • the system/message/tool call prompts so I can reproduce on our side
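
If these are not being logged yet, here is a minimal Python sketch for capturing them per call. It assumes the request ID comes back in an `x-request-id` response header (an assumption, check the headers you actually receive), and `send` is a hypothetical callable wrapping whatever HTTP client you use:

```python
import time
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Tuple


@dataclass
class CallRecord:
    model: str
    latency_s: float
    request_id: Optional[str]
    status: int


def timed_call(
    model: str,
    send: Callable[[], Tuple[int, Mapping[str, str]]],
    request_id_header: str = "x-request-id",  # assumed header name
) -> CallRecord:
    """Run one request and record its wall-clock latency and request ID.

    `send` is a hypothetical wrapper around your HTTP client; it must
    return (status_code, response_headers).
    """
    start = time.perf_counter()
    status, headers = send()
    latency_s = time.perf_counter() - start
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return CallRecord(model, latency_s, lowered.get(request_id_header.lower()), status)
```

Logging one `CallRecord` per request makes it straightforward to pull out the slow calls and share their IDs.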

If it’s sensitive, you’re welcome to email me instead: jzheng@groq.com

Best,

Jan

We have the same issue here… We’d love to keep using Groq with Kimi, but for now we switched back to Azure & Anthropic’s API. I’ve sent you an email, @jan

We’re seeing the same behaviour, it’s a significant blocker to moving to production as well. I’m happy to share traces in private, please reach out to luca@learnwise.ai

Same for me. I’ve been using Groq for over a year, and today I got a very significant delay using gpt-oss 120b. I don’t have more info right now, but I can try reproducing it and sharing traces here.

Sorry about this. I’ll follow up with both of you separately to trace the perf degradations.

I’ll dig into oss-120b perf degradation, sorry about that. Our usage can sometimes be spiky which causes requests to be queued up, but I’m checking if there was any performance degradation in the model.
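
In the meantime, one common client-side way to soften queueing spikes is a per-attempt timeout with retries and jittered backoff. This is a minimal sketch, not an official recommendation: `call` is a hypothetical function that performs one request with the given timeout and raises `TimeoutError` when it expires, and the timeout values are placeholders to tune against your own latency budget.

```python
import random
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[float], T],
    timeouts_s: Sequence[float] = (2.0, 5.0, 10.0),  # placeholder values
    base_backoff_s: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `call` once per entry in `timeouts_s`, retrying only on timeouts.

    `call(timeout_s)` is a hypothetical wrapper around your client that
    raises TimeoutError if the request exceeds `timeout_s`.
    """
    last_error: Optional[BaseException] = None
    for attempt, timeout_s in enumerate(timeouts_s):
        try:
            return call(timeout_s)
        except TimeoutError as exc:
            last_error = exc
            if attempt < len(timeouts_s) - 1:
                # Exponential backoff with jitter so retries do not
                # arrive in lockstep and add to the spike.
                sleep(base_backoff_s * (2 ** attempt) * random.uniform(0.5, 1.5))
    raise TimeoutError(f"all {len(timeouts_s)} attempts timed out") from last_error
```

Note that retries trade tail latency for extra requests, so they can count against rate limits and make a spike worse if the attempt budget is too generous.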

For a quick check, could you pull up the Metrics dashboard in GroqCloud to see if you have a larger number of request failures?