What models do you want to see on Groq?

An LLM, text-to-text only, with high rate limits for that model.

Please add GLM 4.7, it would be great on your fast hardware.

1 Like

We’re all really desperate for a new model on Groq. Top choices right now:
- GLM 4.7
- Kimi K2.5

Pleeeeeeeease :slight_smile:

8 Likes

I would love to see Kimi k2.5!!

6 Likes

Definitely looking forward to Kimi K2.5 to test agent swarms.

6 Likes

As some others have indicated, additional/advanced ASR models would be really fantastic—especially those which can handle diarization (i.e. properly splitting out multiple speakers) and timestamps.

Models which seem most promising:

Kimi K2.5 please. Would love to see this model running.

2 Likes

Kimi K2.5 is absolutely needed, or any DeepSeek chat model.

1 Like

Allow people to host and run their own models, then charge per token as usual. Why wait for people to tell you what they want when they can do it themselves?

Kimi K2.5 please! No new models have been added in six months! Please add the latest Kimi models.

Really all of the big ones, like Kimi, GLM, DeepSeek, Qwen, plus the best OCR, TTS, and STT models.

Open-source AI is flourishing and you are lagging badly behind, after a very good start.

Mistral's newer models, Codestral and Devstral, are pretty good and a great value.
DeepSeek 3.2, and 4 when it comes out. GLM 4.5, 4.7, and 5 are all really good for the price, and with a proper harness they're truly impressive.
MiniMax 2.5.

I would appreciate new models from Hugging Face like Kimi K2.5, MiniMax M2.5, etc.

Kimi K2.5 is good, though.

2 Likes

Title: Add Moshi / J-Moshi speech-to-speech model support

What problem are you trying to solve?
Real-time full-duplex voice conversation with low latency (~200ms)

What would you like instead?
Support for Kyutai’s Moshi (kyutai/moshika-pytorch-bf16)
and J-Moshi (nu-dialogue/j-moshi-ext) for Japanese voice dialogue

Any workarounds you’re using now?
Running locally on 24GB GPU or Colab L4, which is expensive

Anything else we should know?
Moshi is Apache-2.0 licensed, has 7B parameters, and fits Groq's LPUs well

Qwen TTS, Kimi 2.5 would be amazing.

2 Likes

Kimi k2.5 please and soon :rocket: :rocket: :rocket:

2 Likes

I would like a coding-capable model. For instance, I can't use Groq inside my Kilocode VS Code extension, because neither GPT-OSS, Kimi, nor Qwen works well with it.
There are probably open-source code-oriented models on Hugging Face.

Among the models, there must be one with vision! It would be a huge oversight if there weren’t.

1 Like

It would be really nice if you could add the models from the latest Qwen3.5 model family, like Qwen3.5-35B-A3B. They are rather small and MoE-based, which guarantees low latency, but they are reported to perform better than gpt-oss-120b and even larger models. They have built-in reasoning, which can be disabled. So these models could outperform and potentially even replace the current gpt-oss-120b and gpt-oss-20b models.

Also, desperately waiting for the Qwen-Embedding-8B model to be deployed, since as far as I know there are no decent production-grade low-latency deployments of this or similar models from other cloud providers. And such a model is crucial for low-latency RAG systems such as voice assistants and others.
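To illustrate the retrieval step such an embedding model would enable: a minimal, self-contained sketch of embedding-based ranking. Note that `embed()` here is a hypothetical placeholder returning toy vectors so the logic runs as-is; a real deployment would call whatever embeddings endpoint eventually serves Qwen-Embedding-8B, and the latency of that call is exactly what makes it viable (or not) for voice assistants.

```python
import math

def embed(text: str) -> list[float]:
    # Placeholder for a real embeddings API call (hypothetical endpoint).
    # Returns a deterministic toy 8-dim vector derived from the text.
    seed = sum(ord(c) for c in text)
    return [((seed * k) % 97) / 97.0 for k in range(1, 9)]

def cosine(a: list[float], b: list[float]) -> float:
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Documents are embedded once, offline; only the query is embedded per request,
# which is why per-call embedding latency dominates RAG response time.
docs = ["reset your password", "billing and invoices", "voice assistant setup"]
doc_vecs = [embed(d) for d in docs]

query_vec = embed("how do I set up the voice assistant")
best = max(range(len(docs)), key=lambda i: cosine(query_vec, doc_vecs[i]))
print(docs[best])
```

With toy vectors the ranking itself is meaningless; the point is the shape of the pipeline: one embedding call per query on the hot path, so a sub-100ms embedding model is what keeps the whole voice-assistant loop responsive.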

1 Like