Add speech-to-speech model support

yawnxyz · May 30, 2025, 4:19am

(Posted by Aly on the Groq Discord: https://discord.com/channels/1207099205563457597/1376241264101425273)

Hi Groq Team,

I'd like to request support for speech-to-speech (or voice-to-voice) models like Ultravox on the Groq inference engine.

Thanks!

ljbred08 · August 15, 2025, 10:20pm

Just to clarify, Ultravox is a speech-to-text LLM, not a speech-to-speech model. However, speech-to-text LLMs such as Ultravox and Microsoft Phi-4 Multimodal would still be helpful (at this point probably more so than speech-to-speech, not to mention that I am not aware of many open-source speech-to-speech models).

yawnxyz · August 24, 2025, 2:40am

Ah yes, you’re right. We’re considering those. As far as I’m aware there are no STS (speech-to-speech) models yet, but that would be so cool.

firewolf · January 8, 2026, 4:33pm

I would like to throw my hat in the ring and suggest Mistral’s Voxtral as a speech-to-text model.
