Chunked Audio Upload for Speech-to-Text Processing

acraft · July 14, 2025, 5:59pm

Hello,

I have a feature request:
Chunked Audio Upload for Speech-to-Text Processing

Current Limitation:
The API currently requires the full audio file to be uploaded before processing, leading to increased latency.

Proposed Feature:
Add an API endpoint to support chunked audio uploads during recording, allowing processing to begin as audio is being sent.

Benefit:
Reduces upload latency, improving user experience.
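
For illustration, the client side of such a flow could look roughly like the sketch below. It is only a guess at how chunking might work, not an existing API: `iter_audio_chunks` and the 32,000-byte chunk size (about one second of 16 kHz, 16-bit mono PCM) are assumptions, and the upload call itself is left out because no such endpoint exists yet.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(stream: BinaryIO, chunk_bytes: int = 32_000) -> Iterator[bytes]:
    """Yield successive fixed-size chunks from a recording stream.

    Hypothetical helper: the final chunk may be shorter than chunk_bytes.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:  # an empty read means the stream is exhausted
            break
        yield chunk


# In-memory stand-in for a live recording: 2.5 seconds of silent PCM audio.
recording = io.BytesIO(b"\x00" * 80_000)

# Each chunk would be sent to the (not yet existing) chunked-upload endpoint here.
sizes = [len(chunk) for chunk in iter_audio_chunks(recording)]
```

Each chunk could be sent as soon as it is read, so transcription would not have to wait for the full recording.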

This feature would significantly improve real-time processing capabilities and user satisfaction.

Please provide feedback on this feature request.

Thanks!

yawnxyz · July 14, 2025, 9:00pm

Hi there,

Thank you for the product feedback! We’ve been exploring features such as streaming responses back from the Whisper API while it’s transcribing large files. We don’t have an ETA for launch, though.

Best,

Jan

Robin_Diddams · November 12, 2025, 3:38pm

This would be absolutely killer for real-time audio processing agents, if I could upload an audio stream and begin downloading Whisper’s output at the same time. You could build a VoIP agent that starts tool calling the second a specific keyword leaves a person’s mouth. Please make this!

yawnxyz · November 12, 2025, 6:47pm

Can’t say any more yet but we’re working on these and extremely excited!!

Dovie_Weinstock · February 5, 2026, 3:10am

Bump. Groq is ruled out for us if there’s no streaming. I think it’s a major oversight (unless there’s a good reason for it). Also, the 10-second minimum charge per recording makes it less worthwhile to roll your own “streaming” for small recordings where latency is key.
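
To put a rough number on the cost of rolling your own “streaming”, here is a small sketch. It assumes each uploaded chunk is billed separately and rounded up to the 10-second minimum mentioned above; `billed_seconds` is a hypothetical helper, and the actual billing rules may differ.

```python
from typing import Iterable


def billed_seconds(chunk_durations: Iterable[float], minimum: float = 10.0) -> float:
    """Total billed seconds when every request is charged at least `minimum`.

    Assumption: each chunk is its own request and is rounded up to the minimum.
    """
    return sum(max(duration, minimum) for duration in chunk_durations)


# A 30-second recording sent as fifteen 2-second chunks for low latency...
chunked_cost = billed_seconds([2.0] * 15)

# ...versus the same recording uploaded once after it finishes.
single_cost = billed_seconds([30.0])

overhead = chunked_cost / single_cost
```

Under these assumptions the chunked approach is billed for 150 seconds against 30 for a single upload, a fivefold overhead.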
