GLM-4.7 Open-Source Model
Benchmark Evaluation 2025
Official Results
Coding Capability Breakthrough
The first open-source model to achieve state-of-the-art results across major coding, reasoning, and agentic benchmarks
AIME 2025 (Mathematical Reasoning)
  GLM-4.7      95.7
  Claude 3.5   88.2
  GPT-4o       83.6

SWE-bench Verified (Software Engineering)
  GLM-4.7      73.8%
  Claude 3.5   53.3%
  GPT-4o       48.2%

τ²-Bench (Agent Task Completion)
  GLM-4.7      87.4
  Claude 3.5   78.9
  GPT-4o       71.5
GLM-4.7 demonstrates that open-source models can compete at the frontier of coding intelligence, outperforming the next-best proprietary model by margins of +7.5 to +20.5 points across these benchmarks.
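The stated margins can be reproduced from the per-benchmark scores reported above; a minimal sketch (model names and scores are taken directly from this document, the dictionary layout is illustrative):

```python
# Benchmark scores as reported in this document (GLM-4.7 vs. proprietary baselines).
scores = {
    "AIME 2025":          {"GLM-4.7": 95.7, "Claude 3.5": 88.2, "GPT-4o": 83.6},
    "SWE-bench Verified": {"GLM-4.7": 73.8, "Claude 3.5": 53.3, "GPT-4o": 48.2},
    "Tau2-Bench":         {"GLM-4.7": 87.4, "Claude 3.5": 78.9, "GPT-4o": 71.5},
}

# Margin = GLM-4.7 score minus the best competing score on each benchmark.
margins = {
    bench: round(s["GLM-4.7"] - max(v for m, v in s.items() if m != "GLM-4.7"), 1)
    for bench, s in scores.items()
}
print(margins)  # {'AIME 2025': 7.5, 'SWE-bench Verified': 20.5, 'Tau2-Bench': 8.5}
```

The smallest margin (+7.5, AIME 2025) and the largest (+20.5, SWE-bench Verified) bound the "+7.5 to +20.5 points" range quoted above.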
ZHIPU AI