Last week I was experimenting with Kimi2 and the results were looking great. This week, I noticed that Kimi2 would periodically fail in a way I had not seen before: the model took much longer to respond (~60s vs ~10s previously), and when it did return, the output was the same set of tokens repeated over and over, often just a single token repeated for the entire response.
My question is twofold: what may have changed to cause this, and how can I configure Kimi2 to prevent it from happening? Any help is much appreciated.
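For reference, this is the kind of configuration I mean. I'm assuming an OpenAI-compatible chat endpoint here, and these parameter names and values are guesses on my part about what might curb the repetition, not something I've confirmed works for Kimi2:

```python
# Hypothetical request parameters for an OpenAI-compatible chat API.
# Parameter names and values are assumptions, not confirmed for Kimi2.
request_kwargs = {
    "model": "kimi2",           # placeholder model identifier
    "temperature": 0.7,         # some sampling randomness can discourage loops
    "frequency_penalty": 0.5,   # penalize tokens already emitted in the output
    "max_tokens": 1024,         # cap runaway generations so they fail fast
}
```

Is tuning parameters like these the right approach, or is the repetition likely caused by something on the serving side that no client-side setting can fix?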