Kimi K2 deprecation - what's the point of building on Groq if models keep disappearing?

Just got the email that Kimi K2-0905 is being deprecated April 15th. This is becoming a pattern.

Since the Nvidia acquisition, it feels like Groq has been in a cycle of pulling models one by one - with no compelling replacements lined up. The suggested migration path to GPT OSS 120B is not equivalent, and “check our model list” isn’t a transition plan.

The entire value proposition of Groq was speed + reliability. The speed is still unmatched, but what good is fast inference if you can’t count on a model being available 6 months from now?

Nobody wants to re-engineer their prompts, evals, and pipelines every quarter because the underlying model vanished.

For those of us running production workloads on Groq: what alternatives are you looking at?

Specifically interested in providers that can come close to Groq’s latency while offering more model stability. OpenRouter is the obvious one, but its throughput and latency aren’t in the same league.

A few things I’d want from any alternative:

  • Sub-second TTFB on mid-size models (see the measurement sketch after this list)
  • Commitment to model availability windows (minimum 12 months)
  • Transparent deprecation policy with adequate migration timelines
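
For concreteness, this is the kind of TTFB check I mean: a rough sketch that times the first streamed token over the OpenAI-compatible endpoint using the `openai` Python client. The endpoint, API key, and model ID are placeholders; point it at whichever provider you’re evaluating.

```python
# Rough TTFB check: time from sending the request to the first streamed
# token. Endpoint, key, and model ID are placeholders; swap them per
# provider you're comparing.
import time

from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # or another provider's endpoint
    api_key="YOUR_API_KEY",
)

start = time.perf_counter()
stream = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct-0905",  # whichever model you're testing
    messages=[{"role": "user", "content": "Say hi."}],
    stream=True,
    max_tokens=16,
)
for chunk in stream:
    # First chunk carrying actual content = time to first token.
    if chunk.choices and chunk.choices[0].delta.content:
        print(f"Time to first token: {(time.perf_counter() - start) * 1000:.0f} ms")
        break
```

Run it a few times per provider and compare medians; anything consistently under a second on a mid-size model would meet the bar above.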

And to the Groq team, if you’re reading: your hardware is incredible. But the model churn is making it impossible to recommend Groq for anything beyond prototyping. A public model lifecycle policy would go a long way.

11 Likes

“In line with our commitment to bringing you cutting-edge models, on March 23, 2026, we emailed users to announce the deprecation of moonshotai/kimi-k2-instruct-0905 in favor of openai/gpt-oss-120b.” Ah yes, the famously cutting-edge gpt-oss-120b versus the inferior K2 or K2.5.

How about axing the old K2-Instruct instead to free up capacity, as was planned for October 2025?

4 Likes

Agreed. We run a massive enterprise application that requires exceptionally low latency and a model as robust as Kimi K2 Instruct. Not only is the deprecation unwarranted, but Groq’s suggestion to use GPT-OSS-120B is asinine and insulting. Do they even understand the models they’re hosting? If there’s another service offering similarly low latency, I’m ready to make the switch. Obviously, Groq isn’t up to enterprise standards as a provider anymore.

5 Likes

I’m using K2-0905, and the speed is why I chose Groq.
I’ve benchmarked every model on OpenRouter for my service, and K2 (including Kimi K2.5) was the best for me. I switched to Groq for JSON schema support: for some reason, Groq via OpenRouter doesn’t support JSON schemas, so I had to move to Groq directly. Then I saw this deprecation notice.
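
For anyone wondering what I mean by JSON schema support, this is roughly the request shape: a sketch assuming the OpenAI-compatible structured-output format on Groq’s endpoint. The schema name and fields here are made-up placeholders, not anything from my actual service.

```python
# Minimal example of schema-constrained output via the OpenAI-compatible
# endpoint. The response_format shape follows OpenAI-style structured
# outputs; the schema name and fields are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct-0905",
    messages=[
        {"role": "user", "content": "Extract the city: 'Flying to Paris on Monday.'"}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    },
)
print(resp.choices[0].message.content)  # e.g. {"city": "Paris"}
```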