DeepSeek's new model, which can switch between think/no-think, is convenient.
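For anyone curious how the switch works in practice: with DeepSeek's OpenAI-compatible API, thinking mode on V3.1 is selected by model ID (`deepseek-reasoner` thinks, `deepseek-chat` doesn't). A minimal sketch, assuming the currently published IDs; the key and prompts are placeholders:

```python
# Sketch: toggling DeepSeek's think/no-think modes via its
# OpenAI-compatible API. On DeepSeek V3.1, "deepseek-chat" runs the
# model in non-thinking mode and "deepseek-reasoner" in thinking mode;
# treat the exact IDs as subject to change.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_KEY",          # placeholder
    base_url="https://api.deepseek.com",
)

def ask(prompt: str, think: bool) -> str:
    resp = client.chat.completions.create(
        model="deepseek-reasoner" if think else "deepseek-chat",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(ask("Is 2^31 - 1 prime?", think=True))    # slower, reasons first
print(ask("Say hi in French.", think=False))    # fast, direct answer
```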
Wait, are y’all talking about this model? deepseek-coder-v2
I'd love it if you hosted some of the ServiceNow AI models, like Apriel v1.6 15B Thinker. I think it would be an excellent model to provide.
Yeah, if it's possible, can't we get the 236B one (I think), since that one is crazy for autonomous coding? I want it so bad xD
No, I'm saying DeepSeek V3.1 or V3.2. The DeepSeek V2 series is the old version.
I also have some suggestions I'd like to discuss with the community. First, I think gpt-oss is a model that performs well in benchmarks but is quite average in actual use. Its code has many errors that are hard to look at, and its very strict content policy restrictions hurt the actual experience. Maybe those models could be taken down to free up server resources for models that are really useful. I also recommend a model called GLM-4.6. Its previous generation, GLM-4.5, has good reviews for its coding ability, which is on par with Claude Sonnet 4, and its cost is not high.
We've been evaluating the lot. gpt-oss is a decently fast model that balances code writing, tool use, and data extraction, and many of us use it day to day! It's definitely more corporate-leaning, but it's still very popular!
v2 coding not v2 general xD
They are good, just not coding-specific tho.
I'm using OSS right now for my AI coding project, and it's good. Only thing is, I want a model that's specialized for coding, since it'd have fewer errors, better syntax, cleaner code (you get it).
Anyways, thanks for listening to the community ;=)
(sorry if I replied several times, I got so confused with the system)
Yeah Discourse takes a bit to get used to. We’re cooking a coding model, announcements soon!
All the models on Groq suck for building apps, besides maybe Kimi K2 as a proof of concept…
Gemini 3 Flash just launched, and Opus 4.5 and Composer-1 from Cursor are way too good at coding. Stop wasting your time on coding; the war is lost.
They all use Google's TPUs, so the tokens/sec is fast enough now, possibly faster than Groq.
Where inference is slow and users are willing to pay today is in image and video inference. The open-source models there are amazing, and LoRAs for styles, skin fixing, upscaling, and all sorts of stuff are taking off.
If you added an image model like FLUX 2 Pro and it was fast, a video model like Wan 2.2-2.6, a video editing model like SCAIL… now that would be an amazing API as a dev.
Opus, Sonnet, and Composer are still some of the best models for writing code; Composer-1 is a fine-tuned open source model optimized for Cursor and works really well in that environment.
They don't all use TPUs exclusively. Anthropic does use a few TPU clusters, but that's to beef up their overall availability (it's actually quite interesting how they have to juggle the differences between GPUs and TPUs).
For coding, open-source models definitely fall behind Opus and Sonnet, but they make up for it in price and speed. If you're doing stuff like scanning for vulnerabilities, Opus/Sonnet will very quickly burn a hole in your pocket (and Composer can't be used via API); that's where open-source models can step in.
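To make the price/speed point concrete, here's a minimal sketch of the kind of bulk scan where a cheap open model wins, against Groq's OpenAI-compatible endpoint. The model ID, prompt, and repo layout are assumptions for illustration, not a recommendation:

```python
# Sketch: bulk vulnerability triage with a cheap open-source model on
# Groq's OpenAI-compatible endpoint. Model ID and prompt are
# illustrative; swap in whatever coding model is actually hosted.
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GROQ_KEY",                      # placeholder
    base_url="https://api.groq.com/openai/v1",
)

def scan(source: str) -> str:
    resp = client.chat.completions.create(
        model="openai/gpt-oss-120b",              # assumed hosted model ID
        messages=[
            {"role": "system",
             "content": "List likely security vulnerabilities in this code. Be terse."},
            {"role": "user", "content": source},
        ],
    )
    return resp.choices[0].message.content

# Sweeping a whole repo like this is where per-token price dominates:
for path in Path("src").rglob("*.py"):
    print(path, "->", scan(path.read_text())[:200])
```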
Video and image gen models are interesting; right now, though, from a business perspective they're not on our radar (we're mostly going for big "boring" business use cases right now).
Z-image as the image model?
Hope it goes well!
Would love to see the following models (it might be a long list):
- Text-to-Speech: Chatterbox Turbo, Kokoro-82M, VibeVoice (0.5B, 1.5B, Large)
- Speech-to-Text: zai-org/GLM-ASR-Nano-2512, nvidia/parakeet-tdt-0.6b-v3
Some Text-to-Image models as well
Please, please, please add GLM 4.7. I love Groq, everything is awesome, but there's a lack of capable models here. OSS-120B is a good general model, but it doesn't match the quality of the GLM models. If you're testing it, please make it experimental so we can experiment with it too.
Some embedding models please. I keep having to get multiple subscriptions; it would be nice if I could just stick to Groq.
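For what it's worth, here's the call shape that would enable, following the standard OpenAI-compatible embeddings API. Groq doesn't host embedding models today, so the model ID below is hypothetical (the base URL is the one that works for chat):

```python
# Hypothetical: what an embeddings call on Groq could look like if they
# hosted one, using the standard OpenAI-compatible embeddings API.
# The model ID is made up for illustration.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GROQ_KEY",                      # placeholder
    base_url="https://api.groq.com/openai/v1",
)

resp = client.embeddings.create(
    model="some-embedding-model",                 # hypothetical ID
    input=["vector databases are neat", "groq is fast"],
)
vectors = [d.embedding for d in resp.data]
print(len(vectors), "vectors of dim", len(vectors[0]))
```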
Could you please consider supporting this open-source model:
https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash
MiMo-V2-Flash uses a GPT-OSS-style decoder with hybrid attention (5 SWA layers per full-attention layer), which significantly reduces KV-cache size and improves decoding speed. The model shows competitive overall quality (see https://artificialanalysis.ai/ and the MiMo-V2-Flash (free) listing on OpenRouter).
With MTP enabled on GPUs, it already reaches ~150 tokens/s, and Groq can definitely achieve much higher speeds! We believe Groq can also benefit from MTP, and MiMo-V2-Flash would be a strong model to showcase Groq's inference throughput advantages.
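To put a rough number on the KV-cache claim: with 5 sliding-window layers per full-attention layer, only one layer in six caches the whole context. A back-of-envelope sketch, where the 4k window and 128k context are assumptions (only the 5:1 ratio is from the description above):

```python
# Back-of-envelope: KV-cache footprint of hybrid attention vs. all
# full-attention layers. The 5:1 SWA-to-full layer ratio comes from the
# MiMo-V2-Flash description; window and context sizes are assumed.
def kv_cache_ratio(context_len: int, window: int,
                   swa_per_full: int = 5) -> float:
    """Fraction of per-token KV entries kept, hybrid vs. full attention."""
    layers = swa_per_full + 1                     # one repeating block
    # SWA layers cache at most `window` positions; the full layer caches all.
    hybrid = swa_per_full * min(window, context_len) + context_len
    full = layers * context_len
    return hybrid / full

# Example: 128k context, 4k sliding window (assumed), 5 SWA : 1 full.
print(f"{kv_cache_ratio(128_000, 4_096):.2%} of the full-attention cache")
# -> roughly 19%, consistent with "significantly reduces KV-cache size".
```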
Qwen3 VL family of models (text generation, embeddings, rerankers)
Seedream (Image gen)