It would be really nice if you could add the models from the latest Qwen3.5 family, such as Qwen3.5-35B-A3B. They are fairly small and MoE-based, which keeps latency low, yet they are reported to perform better than gpt-oss-120b and even larger models. They also have built-in reasoning, which can be disabled when it isn't needed. These models could therefore outperform, and potentially even replace, the current gpt-oss-120b and gpt-oss-20b models.
Also, I'm desperately waiting for the Qwen-Embedding-8B model to be deployed, since as far as I know there are no decent production-grade, low-latency deployments of this model (or similar ones) from other cloud providers. Such a model is crucial for low-latency RAG systems like voice assistants.
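To illustrate the use case, here is a minimal sketch of the retrieval step such a deployment would enable, assuming the model were exposed behind an OpenAI-compatible /embeddings endpoint. The base URL, API key, and model name below are placeholders for whatever the actual deployment would use, not an existing API.

```python
# Hypothetical retrieval step of a low-latency RAG pipeline, assuming
# Qwen-Embedding-8B is served behind an OpenAI-compatible /embeddings endpoint.
import numpy as np
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                          # placeholder credential
)

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts and return L2-normalized vectors."""
    resp = client.embeddings.create(model="Qwen-Embedding-8B", input=texts)
    vecs = np.array([d.embedding for d in resp.data], dtype=np.float32)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

# Tiny in-memory document "index", just for illustration.
docs = [
    "Our office is open Monday to Friday, 9am to 5pm.",
    "Refunds are processed within 5 business days.",
    "The voice assistant supports English and German.",
]
doc_vecs = embed(docs)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query by cosine similarity."""
    q_vec = embed([query])[0]
    scores = doc_vecs @ q_vec
    top = np.argsort(-scores)[:k]
    return [docs[i] for i in top]

print(retrieve("When can I get my money back?"))
```

For a voice assistant, this embed-and-rank round trip sits directly on the response path, which is exactly why a hosted, low-latency embedding model matters so much here.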