429 Rate limits with a single tool call

binabik · August 17, 2025, 12:42pm

Hi! I am testing tool calling with both gpt oss and kimi. After a simple tool call (no more than a few hundred tokens in and out), the second call immediately gets a 429. Waiting anywhere from 10 to 30 seconds helps it get through, but I don’t see which limits I am hitting. As I see it, I would need to make hundreds or thousands of tool calls per minute.

Any ideas what I might be doing wrong? I am using Groq’s own Python library, with a completion limit of 2k or 4k tokens (tried both).

benank · August 19, 2025, 10:20pm

Hi there, are you using browser search? With reasoning_effort set to medium or high, it can often search a lot of webpages, filling up the context very quickly and causing a lot of tokens to be used.
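
While tracking down which limit is being hit, a retry wrapper can keep a tool-calling loop going through the occasional 429. Below is a minimal sketch, not Groq's API: `RateLimited` and `call_with_backoff` are hypothetical names, and it assumes the client surfaces a 429 as an exception that may carry a retry hint in seconds. Adapt the `except` clause to whatever error your client actually raises.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for whatever error your client raises on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after  # seconds to wait, if the server gave a hint


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(); on RateLimited, wait and retry, up to max_attempts calls in total."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            if err.retry_after is not None:
                # Prefer the server's hint when one is available.
                delay = min(err.retry_after, max_delay)
            else:
                # Otherwise back off exponentially, plus up to 10% random jitter.
                delay = min(base_delay * 2 ** attempt, max_delay)
                delay += random.uniform(0, delay * 0.1)
            sleep(delay)
```

Usage would look like `call_with_backoff(lambda: make_request())`, where `make_request` is your own function that sends the completion request and raises `RateLimited` on a 429. Retrying only hides the symptom, so it is still worth checking how many tokens each request actually consumes.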
