On HMMT Feb 25, a rigorous reasoning benchmark, Qwen3-Max-Thinking scored 98.0, edging out Gemini 3 Pro (97.5) and ...
Alibaba says Qwen3-Max-Thinking is its ‘best model so far’, while Moonshot calls Kimi K2.5 the world’s most powerful ...