Chaining Models Support

Support for chaining multiple model requests into a single API call would be fantastic. Groq’s inference speed means the primary latency bottleneck is network I/O when making sequential model calls. Enabling request chaining would cut the number of network round trips, making workflows that need several model inferences in a row faster and more efficient!

That’s a great feature request! I’ll bring that up to the engineering team!

What kind of chaining are you looking for? For example, a first call to Scout extracts a summary, and a second call converts the first response’s output to JSON or translates it to French?

Thanks, I appreciate that!

I'd imagine there's a whole host of use cases, including the one you suggested. Mine is building a conversation bot for personal use. Right now I need to do:

I speak -> call whisper model -> send latency -> inference -> receive latency -> call LLM model -> send latency -> inference -> receive latency -> call text to speech model -> send latency -> inference -> receive latency -> play output
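For reference, the sequential version looks roughly like this in Python with the Groq SDK. The model names, the TTS voice, and the speech call are my assumptions, so treat it as a sketch of the three round trips rather than exact code:

```python
# Sketch of the current pipeline: three separate API calls, each paying
# its own send/receive latency on top of inference time.
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment


def voice_turn(input_wav: str, output_wav: str) -> None:
    # Round trip 1: speech -> text (whisper)
    with open(input_wav, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="whisper-large-v3",          # assumed model name
            file=(input_wav, f.read()),
        )

    # Round trip 2: text -> LLM reply
    reply = client.chat.completions.create(
        model="llama-3.3-70b-versatile",       # assumed model name
        messages=[{"role": "user", "content": transcript.text}],
    )

    # Round trip 3: reply text -> speech
    speech = client.audio.speech.create(
        model="playai-tts",                    # assumed model and voice
        voice="Fritz-PlayAI",
        input=reply.choices[0].message.content,
        response_format="wav",
    )
    speech.write_to_file(output_wav)
```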

Network I/O is the big bottleneck. What would be fantastic, and would remove that bottleneck, is something like:

I speak -> I call all three models in a chained API call -> send latency -> inference -> inference -> inference -> receive latency -> play output
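Purely to illustrate the idea (none of this exists today; the endpoint and payload shape are invented), a chained call might bundle the three steps into one request:

```python
# Hypothetical only: there is no chaining endpoint. This just sketches how the
# three steps above could travel in a single request so that all inference
# happens server-side with one upload and one download.
import json
import os

import requests

GROQ_API_KEY = os.environ["GROQ_API_KEY"]

# Each step consumes the previous step's output; the {{...}} placeholders are
# invented notation, not part of any real API.
pipeline = {
    "steps": [
        {"type": "transcription", "model": "whisper-large-v3"},
        {
            "type": "chat",
            "model": "llama-3.3-70b-versatile",
            "messages": [{"role": "user", "content": "{{steps[0].text}}"}],
        },
        {
            "type": "speech",
            "model": "playai-tts",
            "voice": "Fritz-PlayAI",
            "input": "{{steps[1].choices[0].message.content}}",
        },
    ]
}

with open("question.wav", "rb") as audio_in:
    resp = requests.post(
        "https://api.groq.com/openai/v1/chain",  # hypothetical endpoint
        headers={"Authorization": f"Bearer {GROQ_API_KEY}"},
        files={"file": audio_in},
        data={"pipeline": json.dumps(pipeline)},
    )

with open("answer.wav", "wb") as audio_out:
    audio_out.write(resp.content)  # final step's audio, in this sketch
```

The point is that the intermediate text never leaves Groq's side, so the only network cost is the single upload and the single download.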

There must be so many use cases that would benefit from this functionality!

Thanks for the detailed explanation — the engineers agree and are looking into adding chaining as a feature!

That's fantastic to hear, I appreciate that! Hopefully it will benefit a lot of workflows!


+1 for this. I’d love to see something like that.
