Support for Gemma 3

Hi @Groq Team,

We're currently building with Groq and the high inference speed is truly a significant enabler for our use case.

I wanted to ask whether there are plans to add Gemma 3 to your supported models anytime soon. I’d like to provide a few arguments from our perspective:

  • As we're based in the EU, we currently don't have access to any multimodal models for image analysis on your platform, which significantly limits our capabilities.
  • Gemma 2 9B was recently deprecated – wouldn't Gemma 3 12B be a natural replacement rather than falling back to the older Llama 3 8B? This seems especially relevant since Llama 3 8B runs slower than the newer Llama 4 models (which we, again, can't access due to regional restrictions).
  • Gemma 3 genuinely performs well in our testing. Having both the 12B and 27B variants (or even 4B and 27B) would allow us to intelligently route requests based on complexity (see the sketch after this list).
  • Your catalog increasingly focuses on reasoning models, but some production tasks are actually handled as well or better by traditional, non-reasoning models.
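
To make the routing point concrete, here is a minimal sketch using the Groq Python SDK; the Gemma 3 model IDs are hypothetical (they aren't on the platform today), and the word-count check is a deliberately naive stand-in for a real complexity classifier:

```python
# A minimal sketch of complexity-based routing via the Groq Python SDK.
# The Gemma 3 model IDs below are hypothetical -- they do not exist on
# the platform today.
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

SMALL_MODEL = "gemma-3-12b-it"  # hypothetical model ID
LARGE_MODEL = "gemma-3-27b-it"  # hypothetical model ID


def route(prompt: str) -> str:
    """Send simple prompts to the small model, complex ones to the large one."""
    # A real router would classify complexity properly (e.g. with a cheap
    # classifier call); word count is just a placeholder heuristic here.
    model = SMALL_MODEL if len(prompt.split()) < 200 else LARGE_MODEL
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


print(route("Extract the city name from: 'Flights to Berlin on Friday.'"))
```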

I hope you'll consider this request. While day-one support for the latest models is exciting, having access to solid, reliable workhorses like Gemma 3 would provide tremendous value for teams building real applications. Sometimes the proven performers are worth their weight in gold.

Thanks for considering this!

Hi @enrico_stauss, thank you for the model request, we’ll consider adding the model!

Technically you can use the Llama 4 series from the EU, because Groq is a US company. The Llama (Meta) license is clear: you can’t deploy it in the EU, but you can use it for inference.

By the way, google/gemma-3n-E4B-it is very strong, please add it!

I’m not so sure about it (https://www.llama.com/llama4/use-policy/):

With respect to any multimodal models included in Llama 4, the rights granted under Section 1(a) of the Llama 4 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models.

Maybe if I were the end-user of the Groq API, but if I use Groq as an inference provider for a service, then I’m not the end-user.

That paragraph refers to deploying the model on your own machines (which isn’t available in the EU yet, only in the USA). As a consumer of an inference API, even for commercial purposes, you can use it without any problems in Europe. I say this with absolute certainty: we are using it for our start-up, and I have received confirmation from our lawyers.

Usage note: With respect to any multimodal models included in Llama 4, the rights granted under Section 1(a) of the Llama 4 Community License Agreement are not being granted to you by Meta if you are an individual domiciled in, or a company with a principal place of business in, the European Union.

From what I understand, the Llama 4 license technically/legally shouldn’t extend to any EU users, but if your legal team has given the go-ahead, then you should defer to them! This is what our legal team has told us.

In the EU we can use it; we just can’t host it on EU servers. There is also a limitation for services with 700M+ monthly active users; in that case you need a separate agreement with Meta.

By the way, any possibility of adding google/gemma-3n-E4B-it for vision? It’s very strong and cheap.

Ah interesting, thank you for the clarification!

Re: Gemma 3 — we’re probably leaning towards adding a Qwen vision model over Gemma, but still evaluating all options. Stay tuned!

Ohh amazing! Any idea of the possible cost per input/output token? Small or large model?

Honestly, your guess is as good as mine; a ton of choices like that go into the final decision of which model(s) get launched, and when, by the core engineering team. Believe me when I say I’m probably just as excited as you to see us launch another multimodal model soon!!

Thanks for the update @yawnxyz. If you find Qwen to provide higher quality, then I’m on board. I just wanted to raise awareness again that my request was not only for multimodality, but also for a new SLM that can offer very high throughput at decent quality for structured prediction. For a slightly advanced tool signature, Llama3.1-8B did not work reliably, which is why I switched to Qwen3-32B without reasoning; by now, though, neither of the two is the fastest in the flock. This is why I suggested two variants from the Gemma family (or Mistral, in the other thread). But a combination of a small(er) Qwen variant plus a larger multimodal one sounds just as good.
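
For context, this is roughly the kind of structured-prediction call I mean; a minimal sketch assuming JSON mode through Groq’s OpenAI-compatible API, with an illustrative model ID:

```python
# A minimal sketch of a high-throughput structured-prediction call, assuming
# JSON mode via the OpenAI-compatible `response_format` parameter.
# The model ID is illustrative and may differ from the current catalog.
import json
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

SCHEMA_HINT = (
    "Reply with JSON only, matching this shape: "
    '{"tool": string, "arguments": {"query": string, "limit": number}}'
)

response = client.chat.completions.create(
    model="qwen/qwen3-32b",  # illustrative model ID
    messages=[
        {"role": "system", "content": SCHEMA_HINT},
        {"role": "user", "content": "Find the three most recent invoices."},
    ],
    response_format={"type": "json_object"},
)

parsed = json.loads(response.choices[0].message.content)
print(parsed["tool"], parsed["arguments"])
```

(Constrained decoding would guarantee the output shape rather than just prompting for it, which is exactly why it matters for this use case.)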

Thanks for the suggestion — yeah very fast structured prediction is a big use case for us, and we’re evaluating a bunch of these models now, as well as working hard on constrained decoding. No ETA unfortunately, but we’re cooking!

Just signed up and was hoping to run Gemma 3 12B on your platform for a production project. This led me here.

For our specific use case, Gemma outperforms all other similarly sized models, so there’s definitely a need.

This one, dropped by DeepSeek yesterday, could be the SOTA for OCR.

Oh that’s so interesting, in our tests/evals/vibes Gemma 3 seems fairly mediocre, so we haven’t really been pushing it very hard internally.

If you’re able to share, I’d love to hear a bit more about your general use case for Gemma 3 - e.g. what field it’s in and what jobs it’s doing (e.g. writing code)?

I’ve been looking at it and it’s so interesting (from the information compression side; an image is worth a few hundred tokens??)

To me it seems more like a research project on compression and a data-generation tool, rather than something meant for pure OCR? The performance does look good, though.

I’m not sure the team is keen on launching just an OCR model, though.

Image recognition. More specifically, extracting ingredients from pictures of food and estimating their weight.

Gemma performs very well for this task, even the 4B version.
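
For reference, a minimal sketch of how that call could look against a vision model on Groq’s OpenAI-compatible API (the model ID is illustrative):

```python
# A minimal sketch of the ingredient-extraction task against a vision model
# on Groq's OpenAI-compatible API. The model ID is illustrative; the image
# is sent inline as a base64 data URL.
import base64
import os

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

with open("meal.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",  # illustrative vision model ID
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "List the ingredients visible in this dish and "
                            "estimate the weight of each in grams.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```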

Oh, good to know; we’re definitely prioritizing a strong new image model, but it’s not clear yet which one we’ll settle on and release, and unfortunately it doesn’t look like it will be Gemma.

I would pick any other vision model as long as it’s efficient and supports LoRA. Right now, anything is better than nothing. 🙂

We’re on it!

(We do have llama-4 right now for vision though…)