Model Request: Parakeet v2 0.6b Model

(Posted by peacock on the Groq Discord: https://discord.com/channels/1207099205563457597/1370606104966594680)

Parakeet v2 0.6b can transcribe 1 hour of audio in just 1 second.

https://huggingface.co/spaces/nvidia/parakeet-tdt-0.6b-v2

https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2

Only English is supported.

Is this on the roadmap?

I second this. The Parakeet model is indeed impressive; it blew me away when I tested it. It's an amazing model for local use, with very good accuracy while being many, many times faster than Whisper. It was like running Whisper Large with instant inference on CPU! This would enable extremely fast transcription on Groq at an unbeatable price point.

We’re constantly on the lookout for new STT models!

My only gripe with this (and other STT / TTS models) is that it's only available in English.

@yawnxyz :

Both the new Canary-1B-v2 and parakeet-tdt-0_6b-v3 models are multilingual. Please implement them.

We’re considering them along with a few others!

Yes, please. Whisper is great, but as I understand it, the Parakeet models are far superior for European languages.

I've been looking at my options because Groq doesn't offer this. I would have to set up all my own infrastructure, which is obviously a big headache. And even then, the real-time factor would be far inferior to Groq's.


That makes sense, we’re not launching Parakeet right now but we ARE working to launch something new soon. Stay tuned!!

Hi, how can I stay updated on the release of the “something new”? What is the best channel to remain in the loop as soon as it comes out?