Hi everyone, thanks for your patience on this issue.
I’ve been investigating this over the last week, and it looks like we have a bug in our handling of tool_choice=required for the GPT-OSS models. On all other models, we return a 400 error when the model fails to call a tool despite one being required. On GPT-OSS models, we don’t return an error, but we should.
I understand that this doesn’t solve the underlying issue: the model sometimes doesn’t call tools even when tool_choice is required. A model’s ability to call tools comes from its post-training and from your prompting. On Groq’s side, we aren’t able to force a model to call a tool/function, even when you set tool_choice=required.
In the most recent example above (thank you so much for these reproductions - it makes it much easier to investigate), this is what the model sees:
- It sees the entire prompt given to it. This includes the system prompt, previous chat messages, tool calls and tool call results, and tool definitions.
- The system prompt carries the most weight here, but it doesn’t instruct the model on how to use tools or how to think about handling user input.
- The model sees that it already tried to use a tool to find the order state, but nothing was found
Given all this information, the model doesn’t see any reason to call another tool. It already called the only relevant tool and got a result. There are no prompts or instructions telling the model that it should either search again (call the same tool) or call the search_knowledge_base tool. If you expect the model to search the knowledge base, it needs to know when to do so. Right now, it has no signal that you might want it to search more or to retry the order lookup.
My recommendation for this specific case would be:
- Add instructions in the system prompt to tell the model what to do when the order status isn’t found. What are you expecting it to do? Should it call the tool again, or call a different tool? Be explicit.
- Add more details to your tool definitions. Usually 3 sentences is a good length. This will tell the model more about each tool - what it does, when to use it, and the inputs that are expected. For example, you might modify the knowledge search tool to include something like “Use this tool if you are unable to find order state.”
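As an illustration of the second point, here’s what a fuller definition for the knowledge search tool could look like. This uses the standard OpenAI-style tool schema; the description text and the `query` parameter are just examples to adapt to your actual tool:

```python
# Example tool definition with a fuller, three-sentence description.
# Illustrative only -- adapt the wording and parameters to your own tool.
search_knowledge_base_tool = {
    "type": "function",
    "function": {
        "name": "search_knowledge_base",
        "description": (
            "Searches the support knowledge base for articles matching a query. "
            "Use this tool if you are unable to find order state, or when the "
            "user asks a question that order data alone cannot answer. "
            "Provide a short, specific query describing what you need to find."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Short, specific search query",
                }
            },
            "required": ["query"],
        },
    },
}
```

The key part is the middle sentence of the description - it explicitly tells the model when to reach for this tool, which is exactly the information it’s missing right now.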
Please let me know if you have any other questions or example requests you’d like me to look into! I understand that this can be frustrating when you expect the model to call tools and it doesn’t. There’s a lot going on here under the hood - setting tool_choice=required is more like “please call a tool”, and our backend should validate that a tool was called, but will not force the model to call a tool. A lot of this comes down to the model’s capability to call tools and the prompting structure.
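If you want a hard guarantee on your side in the meantime, you can add a client-side check that the response actually contains a tool call, and retry or fail if it doesn’t. A minimal sketch, assuming an OpenAI-compatible message shape (the `require_tool_call` helper name is my own):

```python
def require_tool_call(message: dict) -> list:
    """Return the tool calls from an assistant message, raising if there are none.

    `message` is the assistant message dict from an OpenAI-compatible
    chat completion response (assumed shape for this sketch).
    """
    tool_calls = message.get("tool_calls") or []
    if not tool_calls:
        # Mirrors the 400 our backend should return: required but not called.
        raise ValueError("tool_choice=required was set but no tool was called")
    return tool_calls

# Example: a message that did call a tool passes the check.
msg = {
    "role": "assistant",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "search_knowledge_base", "arguments": "{}"},
        }
    ],
}
calls = require_tool_call(msg)
```

On a raised error you could retry the request, optionally with a stronger nudge in the system prompt - that mirrors what our backend validation will do once the GPT-OSS bug is fixed.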