IFQ · DESIGN
V2 · 2026
DATA → benchmarks.json

  
Issue № 05 · AI Benchmarks · Q2 2026
FRONTIER REPORT

The Age of benchmarks.

Five frontier models, five numbers, one uncomfortable truth.

Leader · Q2: Claude 4.7 Sonnet, 1M ctx · Anthropic
SWE-bench: 77.2% (coding, verified split)
GPQA: 84.5 (diamond, graduate science)
Price · input: $3/M (per million tokens, typical)

Claude 4.7 Sonnet   77.2
GPT-5 Turbo         74.8
Gemini 3 Pro        71.3
GLM-5               68.9
Kimi k3             66.4

SOURCE SERIF 4 · ITALIC · OLDSTYLE FIGURES
DATA · TYPOGRAPHY