Whisper taking incredibly long (over 60s), retrying takes less than 10s

Hello,

I have a transcription service that uses Whisper to transcribe audio, and I’m really happy with the service so far. However, sometimes the API will spend over 60s transcribing (at which point the request is aborted), while retrying completes the job in about 5s; see the attached logs. All times are in GMT+2.

Affected request IDs:

req_01jynty635fhwrw808qwwcge08
req_01jynty5f9e1vsfryhbthev5bz
req_01jynty2y4e1nrz7yyk67pwmde
req_01jynty2x0fhra7njk896aztwt

Some requests a few days ago:

req_01jxmetn46fkgr342pnv2f9v9k
req_01jxmbmg4weg0ant4cs9q3w56j (this one had a high TTFT)

As you can see, all audio files are under 20 minutes long. They are all compressed in 16kbps Opus format, so they are at most 33.5MB in size. I haven’t been able to reproduce this error with my own audio files, so unfortunately I can’t share any, but do let me know if you need some metadata.


Thank you for reporting this, I’ll take a look and get back to you

Hi, is there an update?

I’m having trouble reproducing this; are you still running into these errors, and do you know if it’s triggered by a specific codec format / file size / language?

Up until July 4th, yes. After that we reduced the timeout to 20s, which may be a bit aggressive, but it has only slightly increased the occurrence of errors.
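For readers following along, the timeout-and-retry behaviour described in this thread could be sketched roughly as below. This is only an illustration, not the poster's actual client code: `send_request` is a hypothetical callable that performs one transcription request with the given deadline and raises `TimeoutError` when it is exceeded, and the defaults (20s, 3 attempts) are assumptions based on the numbers mentioned here.

```python
import time


def transcribe_with_retry(send_request, timeout_s=20.0, max_attempts=3, backoff_s=0.0):
    """Call send_request(timeout_s) until it returns or attempts run out.

    send_request is a hypothetical callable: it performs ONE transcription
    request and raises TimeoutError if no response arrives within timeout_s.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send_request(timeout_s)
        except TimeoutError:
            # Give up after the last attempt and let the caller see the error.
            if attempt == max_attempts:
                raise
            # Optional pause before retrying (0 by default, i.e. retry at once).
            time.sleep(backoff_s * attempt)
```

A shorter per-attempt timeout trades a few extra retries for a much lower worst-case latency, which matches the observation that a retry usually finishes within seconds.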

The audios are all transformed by the following command:

/opt/ffmpeg-layer/bin/ffmpeg -ss ${start} -to ${end} -i /tmp/input \
    -vn \
    -map_metadata -1 \
    -ac 1 \
    -c:a libopus \
    -b:a 240k \
    -application voip \
    -compression_level 0 \
    -threads 0 \
    -y \
    /tmp/output.ogg

We split each audio file into chunks of 20 minutes (+15s of leeway) and then upload each of them separately to Groq. As you can see, the format is always the same: Opus at a 240 kbps bitrate, giving a consistent . The final chunk, which is shorter than 20 minutes, has not had any problems. I will try lowering the length to about 15 minutes and the quality to 128kbps to see if that helps.
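As a rough illustration of the chunking described above, the sketch below computes the `start`/`end` offsets that would be passed to `-ss` and `-to`, plus a nominal size estimate per chunk. It assumes the 15s of leeway is overlap into the following chunk, which the post does not state explicitly; `plan_chunks` and `estimated_size_mb` are hypothetical helper names, not part of the poster's pipeline.

```python
def plan_chunks(total_s, chunk_s=20 * 60, leeway_s=15):
    """Return (start, end) offsets in seconds that cover [0, total_s].

    Assumption: each chunk runs chunk_s seconds plus leeway_s seconds of
    overlap into the next chunk; the last chunk is whatever remains.
    """
    chunks = []
    start = 0
    while start < total_s:
        end = min(start + chunk_s + leeway_s, total_s)
        chunks.append((start, end))
        if end >= total_s:
            break  # the remaining audio is already covered
        start += chunk_s
    return chunks


def estimated_size_mb(duration_s, bitrate_kbps):
    """Nominal payload size in MB (10**6 bytes), ignoring container overhead."""
    return duration_s * bitrate_kbps * 1000 / 8 / 1_000_000


# Example: a 50-minute recording at the bitrate used in the command above.
for start, end in plan_chunks(50 * 60):
    print(start, end, round(estimated_size_mb(end - start, 240), 2))
```

At a nominal 240 kbps, a full 20 min 15 s chunk works out to about 36 MB, while a 15-minute chunk at 128 kbps would be about 14 MB. Actual files may differ, since the target bitrate is nominal and libopus encodes with a variable bitrate by default.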

In case it helps the investigation, the audio we receive is usually recorded with a phone, far away from the speaker.
