We’d love to hear what kinds of models you’d like to see on Groq — from coding models to text-to-speech and speech-to-text, to embeddings, diffusion, and other model types.
While we can’t accommodate everyone’s wishes, we’d like to keep the conversation going here with new model drops, benchmarks, and performance updates in this thread!
Considering cost-benefit, I’d say for growing applications: (excellent pricing and good quality) or https://murf.ai/ (good price, better quality). There are others, but for heavy usage I don’t think anyone beats those two on cost-benefit.
We are on the lookout for an elastic, scalable environment for our speech-to-text inference workloads. We use the NB-Whisper models from NbAiLab ( NB-Whisper - a NbAiLab Collection ) — based on OpenAI’s Whisper and further trained on Norwegian speech data.
We are currently serving these from a dedicated H100 environment. Due to client consumption growth we are looking for a more compute-efficient solution, and we would love to try out Groq. But we are entirely dependent on these Norwegian-specific models, and would like to host them from the Nordics (read: Helsinki).
That’s really interesting — I didn’t know about a Norwegian-specific Whisper! This is probably too niche for us to host in the immediate future, but as a fellow Nordic (Swede!) it’s really cool to see a Norwegian-tuned Whisper. Probably every language (and dialect) will have its own speech-to-text model in the future!
I brought it up before in another thread but for the sake of visibility, I’ll post it again here.
I’m desperately missing EU-compliant VLMs (and non-reasoning LLMs), and I’m also desperately missing capable SLMs with very low latency and high throughput. I’m currently still on Llama3.1-8B, but it is apparently slower than the Llama4 series, and it also does not produce JSON-structured output reliably.
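To illustrate the JSON reliability issue: when a small model won’t consistently emit clean JSON, a common client-side workaround is to post-process the reply before parsing. This is a minimal sketch of such a helper — `extract_json` is my own illustrative function, not part of any Groq SDK — that handles replies wrapped in markdown fences or surrounding prose.

```python
import json

def extract_json(reply: str):
    """Best-effort extraction of a JSON object from a model reply.

    Models that don't reliably honour JSON mode often wrap the object in
    markdown fences or prose, so on a parse failure we fall back to
    slicing between the first '{' and the last '}'.
    """
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        start, end = reply.find("{"), reply.rfind("}")
        if start != -1 and end > start:
            return json.loads(reply[start:end + 1])
        raise

# Works on a clean reply and on a fenced one:
print(extract_json('{"ok": true}'))                  # → {'ok': True}
print(extract_json('```json\n{"ok": true}\n```'))    # → {'ok': True}
```

Of course, this only papers over the problem — a model with dependable structured-output support removes the need for such fallbacks entirely.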
I therefore suggest the Mistral and/or Gemma3 family of models. A combination of Mistral Small 3 and Mistral Medium 3 should be rock solid.