Zhipu GLM-5.2 Tops DeepSWE Open-Source Benchmark With 44% Success Rate, Outperforms Mainstream Closed-Source Models

According to Beating (a monitoring account), Zhipu AI's open-source model GLM-5.2 achieved the highest success rate among open-source models on the DeepSWE benchmark for complex software engineering tasks, with a 44% one-shot success rate at maximum reasoning intensity. That puts it 13 percentage points ahead of Kimi K2.7 Code, which scored 31%.

At a cost of $3.92 per task, GLM-5.2 also outperforms several mainstream closed-source models at specific reasoning settings: Claude Sonnet 4.6 [high] at 30%, Gemini 3.5 Flash [medium] at 37%, and Claude Opus 4.8 [low] at 41%.
