I would also like to add that Kimi K2 0905 is awesome & fast, however we get rampant tool call issues.
The tool definitions are all very precise and exact, but about 80% of the time it fails to call the tool and just hallucinates an answer.
Eg: get_date_and_time tool, prompt: “what’s the year?“
result: model doesnt call tool, just responds with “2025”. same thing happens for “what’s the day today?” - model just hallucinates a wrong date.
We’ve also had issues with the gpt-oss models calling tools that don’t exist. I believe this to be a platform issue rather than the individual models themselves. I think this is a large blocker for teams like mine migrating to Groq from OpenAI.