Gpt-oss:20b calling tools unasked

Hey, I'm making a Python app which has a command system with hashtags in the response, but instead of using my system the model seems to try and use tools?

I get this error:
[Assistant]: [Error calling Groq chat completion: Error code: 400 - {'error': {'message': 'Tool choice is none, but model called a tool', 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '{"name": "cmdoutput", "arguments": {"command": "#lookup_spotify-Uprising-track"}}'}}]
I didn't have tools set up anywhere when I got it the first time; now I have this:

    completion = client.chat.completions.create(
        model=CHAT_MODEL,
        messages=history,
        max_tokens=max_tokens,
        temperature=temperature,
        tool_choice='none',
        disable_tool_validation=True,
    )

thinking it would disable the tools, but it didn't.

Could anyone help me prevent tool calls? (I already added a line to the system instructions and rules saying it's not allowed to use tools.)

The models sometimes misbehave and call tools if given tools; we’re adding constrained decoding soon which will solve most of these errors.

I’d be happy to show you how to avoid triggering this though; could you share a reproducible curl of your API call, and I’ll tweak it for you!
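
Something minimal and self-contained along these lines is what I mean (the model name, prompt, and settings here are just placeholders, not your actual setup; fill in whatever actually triggers the error for you):

    # Minimal repro sketch, assuming the Groq Python SDK.
    # Model, prompt, and settings are placeholders to be replaced.
    import os
    from groq import Groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])

    completion = client.chat.completions.create(
        model="openai/gpt-oss-20b",
        messages=[
            {"role": "system", "content": "Respond with a #hashtag command, never a tool call."},
            {"role": "user", "content": "Look up the track Uprising on Spotify."},
        ],
        tool_choice="none",  # no tools are passed, so none should ever be called
    )
    print(completion.choices[0].message.content)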

I am also experiencing this, using gpt-oss-120b on Groq. My implementation uses CrewAI with an LLM configuration of:

    self.agents_config = agents_config
    self.tasks_config = tasks_config
    self.api_key = os.getenv("GROQ_API_KEY")
    if not self.api_key:
        raise ValueError("GROQ_API_KEY not set in environment")

    self.llm = LLM(
        model="groq/openai/gpt-oss-120b",
        #model="groq/llama-3.3-70b-versatile",
        drop_params=True,
        additional_drop_params=["is_litellm"],
        temperature=0.7,
        top_p=1,
        reasoning_effort="medium",
        stream=False,
        stop=None,
        api_key=self.api_key,
        tool_choice="none",  # explicitly tell the API not to call any tools
        tools=[],            # and pass no tool definitions at all
    )

And my agent and task configurations:

            Agent(
                config=self.agents_config["author_agent"],
                llm=self.llm,
                verbose=True,
                max_rpm=200,
                tools=[],
            )

            Task(
                config=self.tasks_config["rewrite_task"],
                output_file="result-{rowid}.csv",
                agent=agents[0],
                tools=[],
            )

You can see I have tried to convey the “no tools” aspect but I consistently run into:

RuntimeError: An error occurred while running the crew: litellm.BadRequestError: GroqException - {"error":{"message":"Tool choice is none, but model called a tool","type":"invalid_request_error","code":"tool_use_failed"

I am not sure how to convert any of this to a reproducible curl of my API call, but I notice that with gpt-oss-120b or gpt-oss-20b this exception is encountered, while switching to another model like llama-3.3-70b-versatile never triggers it (but I need the quality of output that the gpt-oss models generate).

Well, it's been tricky, but I think by downgrading to CrewAI 1.4.1 my crew no longer calls any tools (as requested). I had neglected to mention that I was using CrewAI 1.5.0 earlier when the issue came up.


Oh interesting; I've been trying to repro the issue on the latest CrewAI with the exact same setup, so I'll play around with versions. I suspect CrewAI made some change to its system prompt that is getting the model to call tools like this. The OSS models are VERY susceptible to hallucinating or running tools even when told not to. If you write "DO NOT USE ANY TOOLS AT ALL" in caps in the system prompt while you have tool_choice="none", does that have an effect?
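
In CrewAI terms I mean something like the rough sketch below. The role/goal/backstory wording here is invented; the point is just that the "no tools" instruction ends up in the agent text that CrewAI builds its prompt from:

    # Rough sketch only; the role/goal/backstory wording is made up.
    # Adapt it to your own agents_config so the "no tools" instruction
    # lands in the text CrewAI turns into the system prompt.
    from crewai import Agent

    author_agent = Agent(
        role="Author",
        goal="Rewrite the input text. DO NOT USE ANY TOOLS AT ALL.",
        backstory="You write your answer directly and never call tools.",
        llm=llm,       # the LLM(...) configured with tool_choice="none"
        tools=[],
        verbose=True,
    )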

Well, I tried adding “DO NOT USE ANY TOOLS AT ALL” to my agent and task definitions and the problem -

RuntimeError: An error occurred while running the crew: litellm.BadRequestError: GroqException - {"error":{"message":"Tool choice is none, but model called a tool","type":"invalid_request_error","code":"tool_use_failed","failed_generation"

surprisingly occurred under CrewAI 1.4.1 (which was not happening before). And I left tool_choice=“none” in the LLM configuration.

Oh! But this is unexpected - I then upgraded to CrewAI 1.5.0 and left everything else the same (DO NOT USE…, tool_choice…) and this time I ran the same crew and no issues at all. :smiley:

Super glad you’re looking into it because it’s amazing when it works - and unfortunately depressing when it doesn’t. :slight_smile:

Oh, I thought I should mention that both of these tests were run only once each. But the app reads its input data from a database and kicks off the crew once for each record (50 records):

    crew = TimeBillcrew(
        agents_config=agents_config,
        tasks_config=tasks_config,
    ).build_crew()

    for row in fetch_rows():
        crew.kickoff(inputs=row)

Thanks!

By the way, I notice that you asked me to add the “DO NOT USE…” direction to the system prompt (versus the user prompt). But in CrewAI I'm not sure how those are differentiated, as its “prompts” are:

agent:
  role, goal, and backstory
task:
  description

I believe the agent role/goal/backstory form the system prompt and the task description forms the user prompt, but I don't have direct knowledge that that is the case.

For the test I put the “DO NOT USE…” direction in the agent “goal” and the task “description”.

… and I ran the crew a second time under 1.5.0 and again no issues. Maybe the direction you suggested “convinces” it to avoid tool calling in 1.5.0 (but not 1.4.1)? :slight_smile:

Thought I would provide an update…

Unfortunately my good fortune 3 days ago was just an example of non-deterministic model behavior. :slight_smile:

The next morning I tried running the same 1.5.0 crew (same prompts, same data) and sure enough it went back to trying to call a tool despite all the attempts to get it to understand that there are no tools available (or needed). Even though the night before it had run fine - twice.

And today CrewAI version 1.6.0 came out, so I tried that; it still tries to call a tool and ends up with the same RuntimeError.

However, I can reliably (!) downgrade my CrewAI to version 1.4.1 and it runs every time with no other changes. This has been reproducible for more than a week.

In the system prompt I added “DO NOT USE ANY TOOLS AT ALL” while having tool_choice=“none” for all these tests.

It seems like CrewAI does a lot of work to push models toward using tools, which I think is probably the “right way” for them to go: models often tend to avoid using tools, so CrewAI likely has a lot of system prompting and machinery to force tool use. Of course, that works against your situation, where you don't want any tools at all.

Could you try this:

In my message prompt, I reinforce that the model must always use a tool, and then I create a tool called catch_all whose description says to use it whenever no other tool matches the request.

This seems to work in my testing (K2 reliably calls it when the user request doesn't fit the existing tools), but I haven't tried it in CrewAI yet.

I’m really eager to see if this helps!!

{
    "model": "moonshotai/kimi-k2-instruct-0905",
    "messages": [
        {
            "role": "user",
            "content": "[ALWAYS USE A TOOL] What is $100 USD to CAD?"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": [
                                "celsius",
                                "fahrenheit"
                            ]
                        }
                    },
                    "required": [
                        "location"
                    ]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "catch_all",
                "description": "A general-purpose tool to handle requests that don't match any other available tools. Use this when no other tool is applicable or when the task is too complex or open-ended for a specific tool.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The user's query or request that needs to be handled"
                        }
                    },
                    "required": [
                        "query"
                    ]
                }
            }
        }
    ],
    "tool_choice": "auto"
}

Output:

{
    "id": "chatcmpl-eb077dd4-9cc5-4273-875b-0b54413c4069",
    "object": "chat.completion",
    "created": 1764119539,
    "model": "moonshotai/kimi-k2-instruct-0905",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "functions.catch_all:0",
                        "type": "function",
                        "function": {
                            "name": "catch_all",
                            "arguments": "{\"query\":\"Convert 100 USD to CAD\"}"
                        }
                    }
                ]
            },
            "logprobs": null,
            "finish_reason": "tool_calls"
        }
    ],
    "usage": {
        "queue_time": 0.050531887,
        "prompt_tokens": 248,
        "prompt_time": 0.029942389,
        "completion_tokens": 22,
        "completion_time": 0.120449116,
        "total_tokens": 270,
        "total_time": 0.150391505
    },
    "service_tier": "on_demand"
}

I tried creating a dummy tool and the LLM called it (a lot!), but I think the tool would have to actually process what it's passed and return something meaningful, because all I get is:

Received None or empty response from LLM call. 
An unknown error occurred. Please check the details below. 
Error details: Invalid response from LLM call - None or empty. 
An unknown error occurred. Please check the details below. 
Error details: Invalid response from LLM call - None or empty.

My dummy tool looked like:

from crewai.tools.base_tool import BaseTool

class NoOpTool(BaseTool):
    name: str = "noop"
    description: str = "A no-op tool that does nothing"

    def _run(self, *args, **kwargs) -> str:
        return "No operation performed."

And I configured my agent and task with tools=[NoOpTool].
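
If I revisit this, my guess is that the tool has to hand something back that the model can actually use, maybe a catch-all along the lines suggested above. This is just my untested guess (the class name and wording are made up):

    from crewai.tools.base_tool import BaseTool

    class CatchAllTool(BaseTool):
        name: str = "catch_all"
        description: str = (
            "Use this when no other tool applies. It returns the request "
            "so you can answer it directly in your final response."
        )

        def _run(self, query: str = "", **kwargs) -> str:
            # Echo the request back so the model gets a non-empty observation
            return f"No tool was needed. Answer this directly: {query}"

and then configure the agent and task with tools=[CatchAllTool()] instead.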

For background, I have never had any luck trying to use any tools with CrewAI - and luckily for me I haven’t needed any tools so far. I assume if I needed tools and I worked with it for a while it would work. I guess what I’m trying to say is my “tool” skills in CrewAI are poor.

Wow, that's fascinating. My usual reluctance about using CrewAI, LangChain, and these other kinds of frameworks in my agent flows and apps is that I don't really know what they're doing under the hood. Even though they provide some convenience, the tradeoff can be quite big: I end up with somewhat of a black box of a system, and I lose some control and understanding of what's going on in the background.

I think if CrewAI works well in your project, then keep using it, but otherwise, if you're not using tools, have you tried just doing it in a more vanilla way? Most of my loops are just a class with a couple of while loops, and I process commands by adding them to a queue and popping them off one by one, roughly like the sketch below.
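
A stripped-down version of what I mean (all the names here are made up, and the client is any OpenAI-compatible chat client; it's a sketch, not code from a real project):

    from collections import deque

    class CommandLoop:
        """Minimal agent loop: ask the model, parse commands, repeat."""

        def __init__(self, client, model):
            self.client = client   # an OpenAI-compatible chat client
            self.model = model
            self.queue = deque()   # pending prompts/commands to process

        def extract_commands(self, text):
            # Placeholder: pull your own markers (e.g. #hashtag commands)
            # out of the model's reply and return them as new queue items.
            return []

        def run(self, prompt):
            self.queue.append(prompt)
            while self.queue:
                task = self.queue.popleft()
                reply = self.client.chat.completions.create(
                    model=self.model,
                    messages=[{"role": "user", "content": task}],
                    tool_choice="none",
                )
                text = reply.choices[0].message.content
                for cmd in self.extract_commands(text):
                    self.queue.append(cmd)
                print(text)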

Well, I started my current journey by searching for “popular AI agent frameworks” in hopes of building up my skills while applying the framework to a client's challenge. So I selected CrewAI as it seemed highly rated, was open source, and was GitHub's #1 Repository of the Day (at least at some point). And I have learned “a bit” :slight_smile: during my CrewAI journey, but your feedback confirms the conclusion I've come to: there is a LOT going on under the hood, and it's difficult to understand what's happening and how to control it. There are a few techniques for shedding some light under the hood (tracing, telemetry, step_callbacks), but even those are very high level and the opaqueness remains.

But I do feel like I’m getting the education and familiarity I was looking for, though it’s painful at times - this tool calling thing that I’m having difficulty controlling is just the latest (and not the most painful).

And of course CrewAI is changing, often with breaking changes that are unexpected and seemingly unrelated to the published changes. And then of course there's the general non-deterministic nature of AI work in the first place: no two runs produce the same output given the same input (that in itself is a significant adjustment for someone who's spent a career in a deterministic world!).

As Einstein was reported to have said (on another topic), “it feels like spooky action-at-a-distance”… AI, and definitely CrewAI, feels a bit like that.

But so far, and especially since I'm able to find workarounds for the errant tool calling (downgrade to CrewAI 1.4.1 and cross my fingers… it works about 95% of the time), I'm planning to continue my CrewAI adventure.

But I continue looking around as well, and the example you posted in this thread looks interesting; it's something I plan on taking a closer look at, thanks!


I’m really excited to hear you’ve had generally great experiences with CrewAI! I’ve met their team before and they’re really great people.

If CrewAI works for your use case then absolutely keep using it. If you ever need more precision over the orchestration or output, you always have the option to build something custom directly via the SDK or API, but please continue with CrewAI if that works!