New Model Family: Devstral 2

The family includes two models: Devstral 2 (123B) under a modified MIT license, and Devstral Small (24B) under Apache 2.0.

These models score very well for their size, and they are currently outpacing GPT-OSS-120B on coding tasks. I'd like to see them adopted more widely.

I've been playing with Devstral on my local machine and it seems adequate, but not really mind-blowing — has anyone seen it outperform in any way? It's great for a model of its size (and I think it vibes a bit better than GPT-OSS-120B), but I think it still falls short of larger models like Kimi K2.