Support for Gemma 3

Hi @Groq Team,

We're currently building with Groq and the high inference speed is truly a significant enabler for our use case.

I wanted to ask whether there are plans to add Gemma 3 to your supported models anytime soon. I’d like to provide a few arguments from our perspective:

  • As we're based in the EU, we currently don't have access to any multimodal models for image analysis on your platform, which significantly limits our capabilities.
  • Gemma 2 9B was recently deprecated – wouldn't Gemma 3 12B be a natural replacement rather than falling back to the older Llama 3 8B? This seems especially relevant since Llama 3 8B runs slower than the newer Llama 4 models (which we, again, can't access due to regional restrictions).
  • Gemma 3 genuinely performs well in our testing. Having both the 12B and 27B variants (or even 4B and 27B) would allow us to intelligently route requests based on complexity (see the sketch after this list).
  • Your catalog increasingly focuses on reasoning models, but some production tasks are actually handled as well or better by traditional, non-reasoning models.
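
To make the routing point concrete, here is a minimal sketch using the Groq Python SDK; the Gemma 3 model IDs are hypothetical (they aren't on the platform today), and the word-count check is a deliberately naive stand-in for a real complexity classifier:

```python
# A minimal sketch of complexity-based routing via the Groq Python SDK.
# The Gemma 3 model IDs below are hypothetical -- they do not exist on
# the platform today.
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

SMALL_MODEL = "gemma-3-12b-it"  # hypothetical model ID
LARGE_MODEL = "gemma-3-27b-it"  # hypothetical model ID


def route(prompt: str) -> str:
    """Send simple prompts to the small model, complex ones to the large one."""
    # A real router would classify complexity properly (e.g. with a cheap
    # classifier call); word count is just a placeholder heuristic here.
    model = SMALL_MODEL if len(prompt.split()) < 200 else LARGE_MODEL
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


print(route("Extract the city name from: 'Flights to Berlin on Friday.'"))
```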

I hope you'll consider this request. While day-one support for the latest models is exciting, having access to solid, reliable workhorses like Gemma 3 would provide tremendous value for teams building real applications. Sometimes the proven performers are worth their weight in gold.

Thanks for considering this!

Hi @enrico_stauss, thank you for the model request, we’ll consider adding the model!

Technically you can use the Llama 4 series from the EU, because Groq is a US company. The Llama (Meta) license is clear: you can’t deploy it in the EU, but you can use it for inference.

By the way, google/gemma-3n-E4B-it is very strong, please add it!

I’m not so sure about it (https://www.llama.com/llama4/use-policy/):

With respect to any multimodal models included in Llama 4, the rights granted under Section 1(a) of the Llama 4 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models.

Maybe if I were the end-user of the Groq API, but if I use Groq as an inference provider for a service, then I’m not the end-user.

That paragraph refers to deploying the model on your own machines (which isn’t available in the EU yet, only in the USA). As a consumer of an inference API, even for commercial purposes, you can use it without any problems in Europe. I say this with absolute certainty: we are using it for our start-up, and I have received confirmation from our lawyers.

Usage note: With respect to any multimodal models included in Llama 4, the rights granted under Section 1(a) of the Llama 4 Community License Agreement are not being granted to you by Meta if you are an individual domiciled in, or a company with a principal place of business in, the European Union.

From what I understand, the Llama 4 license technically/legally shouldn’t extend to any EU users, but if your legal team has given the go-ahead, then you should defer to them! This is what our legal team has told us.

In the EU we can use it; we just can’t host it on EU servers. There is also a limitation for services with 700M+ monthly active users; in that case you need a separate agreement with Meta.

By the way, any possibility of adding google/gemma-3n-E4B-it for vision? It’s very strong and cheap.

Ah interesting, thank you for the clarification!

Re: Gemma 3 — we’re probably leaning towards adding a Qwen vision model over Gemma, but still evaluating all options. Stay tuned!

Ohh amazing! Any idea of the possible cost per input/output token? Small or large model?

Honestly, your guess is as good as mine; a ton of choices like that go into the final decision of which model(s) get launched, and when, by the core engineering team. Believe me when I say I’m probably just as excited as you to see us launch another multimodal model soon!!

Thanks for the update @yawnxyz. If you find Qwen to provide higher quality, then I’m on board. I just wanted to raise awareness again that my request was not only for multimodality, but also for a new SLM that can offer very high throughput at decent quality for structured prediction. For a slightly advanced tool signature, Llama3.1-8B did not work reliably, which is why I switched to Qwen3-32B without reasoning; by now, though, neither of the two is the fastest in the flock. This is why I suggested two variants from the Gemma family (or Mistral, in the other thread). But a combination of a small(er) Qwen variant plus a larger multimodal one sounds just as good.
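
For context, this is roughly the kind of structured-prediction call I mean; a minimal sketch assuming JSON mode through Groq’s OpenAI-compatible API, with an illustrative model ID:

```python
# A minimal sketch of a high-throughput structured-prediction call, assuming
# JSON mode via the OpenAI-compatible `response_format` parameter.
# The model ID is illustrative and may differ from the current catalog.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

SCHEMA_HINT = (
    "Reply with JSON only, matching this shape: "
    '{"tool": string, "arguments": {"query": string, "limit": number}}'
)

response = client.chat.completions.create(
    model="qwen/qwen3-32b",  # illustrative model ID
    messages=[
        {"role": "system", "content": SCHEMA_HINT},
        {"role": "user", "content": "Find the three most recent invoices."},
    ],
    response_format={"type": "json_object"},
)

parsed = json.loads(response.choices[0].message.content)
print(parsed["tool"], parsed["arguments"])
```

(Constrained decoding would guarantee the output shape rather than just prompting for it, which is exactly why it matters for this use case.)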

Thanks for the suggestion — yeah very fast structured prediction is a big use case for us, and we’re evaluating a bunch of these models now, as well as working hard on constrained decoding. No ETA unfortunately, but we’re cooking!

Just signed up and was hoping to run Gemma 3 12B on your platform for a production project. This led me here.

For our specific use case, Gemma outperforms all other similarly sized models, so there’s definitely a need.

This one, dropped by DeepSeek yesterday, could be the SOTA for OCR.

Oh that’s so interesting, in our tests/evals/vibes Gemma 3 seems fairly mediocre, so we haven’t really been pushing it very hard internally.

If you’re able to share, I’d love to hear a bit more about your general use case for Gemma 3 - e.g. what field it’s in and what jobs it’s doing (e.g. writing code)?

I’ve been looking at it and it’s so interesting (from the information compression side; an image is worth a few hundred tokens??)

To me it seems more like a research project on compression and a data-generation tool, rather than something meant for pure OCR? The performance does look good, though.

I’m not sure the team is keen on launching just an OCR model, though.

Image recognition. More specifically, extracting ingredients from pictures of food and estimating their weight.

Gemma performs very well for this task, even the 4B version.
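
For reference, a minimal sketch of how that call could look against a vision model on Groq’s OpenAI-compatible API (the model ID is illustrative):

```python
# A minimal sketch of the ingredient-extraction task against a vision model
# on Groq's OpenAI-compatible API. The model ID is illustrative; the image
# is sent inline as a base64 data URL.
import base64
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

with open("meal.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",  # illustrative vision model ID
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "List the ingredients visible in this dish and "
                            "estimate the weight of each in grams.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```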

Oh, good to know; we’re definitely prioritizing a strong new image model, but it’s not clear yet which one we’ll settle on and release, and unfortunately it doesn’t look like it will be Gemma.

I would pick any other vision model as long as it’s efficient and supports LoRA. Right now, anything is better than nothing. 🙂

We’re on it!

(We do have llama-4 right now for vision though…)