Nemotron 3 series

Groq is famous for combining high intelligence with extreme speed, historically relying on the GPT‑OSS series. But open‑source AI moves fast, and newer models have already surpassed that older architecture.

The NVIDIA Nemotron‑3 Nano is the clear successor. It is significantly smarter and more knowledgeable than the GPT‑OSS‑20B, yet it runs twice as fast and is cheaper to operate. It hits the Groq sweet spot perfectly: maximum speed, zero loss in quality.

The rest of the Nemotron family is just as impressive. The larger models outperform our current 120B options, and they all offer a massive 1M‑token context window, totally eclipsing the old 128k limit.

Finally, this partnership just makes sense. NVIDIA builds the hardware and training stacks that power modern AI. Running their flagship open models on Groq’s superior inference engine is a natural fit.

Bottom line: Nemotron‑3 is faster, smarter, cheaper, and ready for the future. It belongs on Groq.
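If Nemotron‑3 did land on Groq, calling it would presumably look like any other model on Groq's OpenAI‑compatible chat completions endpoint. A minimal sketch below — note the model ID `nvidia/nemotron-3-nano` is my guess, since no such model is listed yet; the real ID would come from Groq's model docs. The script just builds and prints the request payload, with the actual HTTP call shown commented out:

```python
import json
import os

# Hypothetical model ID -- Nemotron 3 Nano is not (yet) served on Groq;
# the real identifier would come from Groq's published model list.
MODEL_ID = "nvidia/nemotron-3-nano"

# Groq exposes an OpenAI-compatible chat completions API, so the request
# body follows the familiar model/messages shape.
payload = {
    "model": MODEL_ID,
    "messages": [
        {
            "role": "user",
            "content": "Summarize the Nemotron 3 family in one sentence.",
        }
    ],
    "temperature": 0.7,
}

print(json.dumps(payload, indent=2))

# Actually sending it would look like this (needs GROQ_API_KEY set and
# `pip install requests`):
#
# import requests
# resp = requests.post(
#     "https://api.groq.com/openai/v1/chat/completions",
#     headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
#     json=payload,
# )
# print(resp.json()["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI‑compatible, swapping in a new model should be a one‑line change to `MODEL_ID` for anyone already on Groq today.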


Just read up on their release - really interesting model, can’t wait to play with it!!

Which models are you referring to?

I think they’re talking about Nemotron 3 Super and Ultra, which I haven’t tried yet.

If the 100B model (a.k.a. Super) performs better as well, then it’s interesting indeed. @yawnxyz I assume the constraint on Groq’s end is LPU memory? That would mean you can’t serve many different models, because each model needs a lot of LPUs allocated to it?

Well, we compile the models into our data centers, and they do take up “real estate” on our chips, so we do try to just serve the best/fastest models (rather than all of them)

Looks like I was right, considering this new $20B deal with NVIDIA. I’m excited to see how this shapes the future of model performance on Groq, specifically for models built by NVIDIA.
