If the 100B model (aka Super) is performing better as well, then that’s interesting indeed. @yawnxyz I assume the issue on Groq’s end is LPU memory? Meaning you can’t serve a lot of different models because each one needs a lot of LPUs allocated to it?