
Last week I was experimenting with Kimi2 and the results looked great. This week, I noticed that Kimi2 would periodically fail in a way I had not seen before: the model took much longer to respond (~60s vs ~10s previously), and when it did return, the output was the same set of tokens repeated over and over, often just a single token repeated for the entire response.

My question is: what may have changed to cause this, and how can I configure Kimi2 to prevent it from happening? Any help is much appreciated.
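For anyone hitting the same thing, here is a minimal sketch of how I've been catching this failure mode on my side, plus the sampling knobs that typically discourage repetition. The helper function and its threshold are my own (hypothetical) names, and the parameters use OpenAI-compatible naming; whether each one is honored by the specific endpoint is an assumption worth checking in the docs:

```python
from collections import Counter

def looks_degenerate(text: str, threshold: float = 0.5) -> bool:
    """Flag outputs where a single token dominates (the repeated-token failure)."""
    tokens = text.split()
    if len(tokens) < 10:
        return False  # too short to judge
    _, top_count = Counter(tokens).most_common(1)[0]
    return top_count / len(tokens) > threshold

# Request-side parameters that commonly discourage repetition
# (OpenAI-compatible names; support on a given endpoint is an assumption):
sampling_params = {
    "temperature": 0.6,
    "frequency_penalty": 0.5,  # penalize tokens already emitted
    "max_tokens": 1024,        # cap runaway generations
}
```

With a check like this you can retry the request (ideally with a higher `frequency_penalty`) instead of returning a degenerate response to the user.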

Hi! I’m having trouble reproducing these results; could you please provide your system/user prompt? Are you seeing these problems in the Playground too?
 

If these are sensitive prompts you can also email me jzheng@groq.com

