Model Request: Parakeet v2 0.6b Model

(Posted by peacock on the Groq Discord: https://discord.com/channels/1207099205563457597/1370606104966594680)

Parakeet v2 0.6b can transcribe 1 hour of audio in just 1 second.
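
For a sense of scale, that claim works out to a real-time factor of roughly 3600x. Here is a minimal sketch of the arithmetic, assuming the 1 hour / 1 second figures are taken at face value. The function name `real_time_factor` and the 24-hour workload are illustrative only and not part of any Groq or NVIDIA API.

```python
def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of compute (higher is faster)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Figures from the claim above: 1 hour of audio in 1 second.
rtf = real_time_factor(audio_seconds=3600, processing_seconds=1)
print(f"Real-time factor: {rtf:.0f}x")

# Hypothetical workload: time needed for 24 hours of audio at that rate.
day_of_audio = 24 * 3600
print(f"24 hours of audio: {day_of_audio / rtf:.0f} seconds")
```

Actual throughput depends on hardware and batch size, so treat this as a back-of-the-envelope estimate.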

https://huggingface.co/spaces/nvidia/parakeet-tdt-0.6b-v2

https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2

Only English is supported.

Is this on the roadmap?

I second this. The Parakeet model is indeed impressive; it blew me away when I tested it. It's an amazing model for local use, with very good accuracy while being many times faster than Whisper. It was like running Whisper Large with instant inference on CPU! This would enable extremely fast transcription on Groq at an unbeatable price point.

We’re constantly on the lookout for new STT models!

My only gripe with this (and other STT / TTS models) is that they're only available in English.

@yawnxyz:

Both the new Canary-1B-v2 and parakeet-tdt-0_6b-v3 models are multilingual. Please implement them.

We’re considering them along with a few others!