Hi,
I love being able to use kimi-k2 on Groq. The speed is fantastic!
I am using kimi-k2 via opencode and it works fine from a functional perspective. The speed is also considerably better than direct calls to moonshot.ai.
But there is a big problem I'm facing. When I started using it on Groq, I was surprised by the costs but didn't have the chance to investigate. I assumed something was wrong with the caching and didn't give it a second look.
I recently found the "Activity" tab on the usage page of Groq's dashboard, and now I can see my main issue.
While making a rather small change to my project, the cached token count exploded (~960k non-cached tokens vs. ~10.3M cached tokens).
Compared with using kimi-k2 on moonshot.ai, I see completely different token usage. Working with moonshot's API, I use far fewer tokens, even for much bigger changes.
This might be an opencode issue, but I wanted to ask if somebody has had the same problem and might have a tip for me. I use the same config/settings in opencode for moonshot.ai and Groq, so my immediate thought was that the caching on Groq might be doing something wrong, to be honest.