Thank you for the product feedback! We’ve been exploring features like streaming responses back from the Whisper API while it’s still transcribing large files. We don’t have an ETA for when that will launch, though.
This would be absolutely killer for realtime audio processing agents. If I could upload an audio stream and begin downloading Whisper’s output at the same time, you could build a VoIP agent that starts tool calling the second a specific keyword leaves a person’s mouth. Please make this!
Bump. Groq is ruled out for us if there’s no streaming. I think it’s a major oversight (unless there’s a good reason for it). Also, the minimum 10-second charge per recording makes it less worthwhile to roll your own “streaming” for small recordings where latency is key.
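
To put rough numbers on the 10-second minimum point, here is a back-of-the-envelope sketch. It assumes the minimum applies to every request and that each chunk of a do-it-yourself “stream” is sent as its own request; the function names and chunk sizes are made up for illustration and are not part of any API.

```python
# Assumption: every request is billed for at least this many seconds of audio,
# per the 10-second minimum mentioned in the post above.
MIN_BILLED_SECONDS = 10.0


def billed_seconds(total_seconds: float, chunk_seconds: float,
                   min_billed: float = MIN_BILLED_SECONDS) -> float:
    """Billed audio seconds when a recording is split into fixed-size chunks,
    each sent as a separate request (hypothetical helper)."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    full_chunks = int(total_seconds // chunk_seconds)
    remainder = total_seconds - full_chunks * chunk_seconds
    # Each full chunk is billed at its own length or the minimum, whichever is larger.
    billed = full_chunks * max(chunk_seconds, min_billed)
    # A trailing partial chunk is also a request, so it hits the minimum too.
    if remainder > 1e-9:
        billed += max(remainder, min_billed)
    return billed


def overhead_factor(total_seconds: float, chunk_seconds: float) -> float:
    """How many times more audio is billed than was actually recorded."""
    return billed_seconds(total_seconds, chunk_seconds) / total_seconds


if __name__ == "__main__":
    for chunk in (2.0, 5.0, 10.0):
        print(f"{chunk:>4}s chunks of a 30s call: "
              f"{billed_seconds(30.0, chunk):.0f}s billed, "
              f"{overhead_factor(30.0, chunk):.1f}x overhead")
```

Under those assumptions, 2-second chunks (the kind you would want for low latency) bill five times the actual audio, and the overhead only disappears once chunks reach 10 seconds, which defeats the purpose.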