Just got the email that Kimi K2-0905 is being deprecated April 15th. This is becoming a pattern.
Since the Nvidia acquisition, it feels like Groq has been in a cycle of pulling models one by one, with no compelling replacements lined up. The suggested migration path, GPT OSS 120B, is not an equivalent model, and “check our model list” isn’t a transition plan.
The entire value proposition of Groq was speed + reliability. The speed is still unmatched, but what good is fast inference if you can’t count on a model being available 6 months from now?
Nobody wants to re-engineer their prompts, evals, and pipelines every quarter because the underlying model vanished.
For those of us running production workloads on Groq: what alternatives are you looking at?
Specifically interested in providers that can come close to Groq’s latency while offering more model stability. OpenRouter is the obvious candidate, but its throughput and latency aren’t in the same league.
A few things I’d want from any alternative:
- Sub-second time to first token on mid-size models
- Commitment to model availability windows (minimum 12 months)
- Transparent deprecation policy with adequate migration timelines
And to the Groq team, if you’re reading: your hardware is incredible. But the model churn is making it impossible to recommend Groq for anything beyond prototyping. A public model lifecycle policy would go a long way.