Model benchmarks are the scoreboard of the AI race. Here is how we use them — and what they reveal about who is winning right now.
Model benchmarks are the most objective measure of where each company stands in the AI race. They account for 30% of the SEVENAI Momentum Index, making them the highest-weighted dimension we track. But raw scores tell only half the story. We score each company on both absolute performance and week-over-week improvement because, in a race, momentum predicts the future better than current position does.
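To make the idea of scoring both position and momentum concrete, here is a minimal sketch that blends an absolute benchmark score with its weekly change into one component worth up to 30 points. The 70/30 split between level and momentum, the 5-point cap on weekly change, and the function name `benchmark_component` are all assumptions chosen for illustration; they are not the published SEVENAI weights.

```python
MAX_POINTS = 30.0    # benchmark dimension is worth 30% of the index (from the article)
LEVEL_SHARE = 0.7    # ASSUMPTION: share of the component given to absolute performance
MOMENTUM_CAP = 5.0   # ASSUMPTION: weekly change (in points) that earns full momentum credit


def benchmark_component(score_pct: float, weekly_change_pts: float) -> float:
    """Blend an absolute score (0-100) and its weekly change into 0..MAX_POINTS."""
    # Normalise the absolute score to the range 0..1.
    level = max(0.0, min(score_pct, 100.0)) / 100.0
    # Clip the weekly change to [-CAP, +CAP], then map it onto 0..1 (flat = 0.5).
    clipped = max(-MOMENTUM_CAP, min(weekly_change_pts, MOMENTUM_CAP))
    momentum = (clipped + MOMENTUM_CAP) / (2 * MOMENTUM_CAP)
    blended = LEVEL_SHARE * level + (1.0 - LEVEL_SHARE) * momentum
    return MAX_POINTS * blended


# Hypothetical companies: one lower but improving, one higher but static.
improving = benchmark_component(score_pct=85.0, weekly_change_pts=2.0)
static = benchmark_component(score_pct=91.0, weekly_change_pts=0.0)
print(f"improving: {improving:.2f}  static: {static:.2f}")
```

Under these assumed weights the improving company edges out the static one, but a different split or cap could reverse that ordering, which is why the weighting choice matters as much as the raw scores.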
"A company scoring 85% on MMLU and improving by 2 points weekly is more interesting than one scoring 91% and standing still."
SEVENAI Methodology Notes, May 2026
The four benchmarks we track
We track four evaluations, chosen because they are hard to game, are widely respected, and measure capabilities with direct commercial value.
Where each company stands — May 17, 2026
Benchmark component scores out of a maximum of 30 points, with the change from the previous week.
- Nvidia: 29.1 (▲ +0.4)
- Meta: 27.0 (▲ +1.8)
- Microsoft: 27.0 (▲ +0.6)
- Alphabet: 25.8 (0.0, unchanged)
- Tesla: 21.6 (▲ +0.3)
- Amazon: 20.4 (0.0, unchanged)
- Apple: 15.6 (▼ −0.6)
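Because momentum matters as much as position, it can help to re-sort the same table by weekly change. The sketch below does that using only the figures listed above; the variable and function names are illustrative.

```python
MAX_POINTS = 30.0

# (company, benchmark component out of 30, week-over-week change), as listed above
SCORES = [
    ("Nvidia", 29.1, 0.4),
    ("Meta", 27.0, 1.8),
    ("Microsoft", 27.0, 0.6),
    ("Alphabet", 25.8, 0.0),
    ("Tesla", 21.6, 0.3),
    ("Amazon", 20.4, 0.0),
    ("Apple", 15.6, -0.6),
]


def by_momentum(rows):
    """Return rows ordered from largest weekly gain to largest weekly loss."""
    # sorted() is stable, so companies with equal change keep their score order.
    return sorted(rows, key=lambda row: row[2], reverse=True)


leaders = by_momentum(SCORES)
for name, points, delta in leaders:
    # Show each score as a share of the 30-point maximum alongside its change.
    print(f"{name:<10} {points:>5.1f}/30 ({points / MAX_POINTS:.0%}) {delta:+.1f}")
```

Sorted this way, Meta moves to the top and Apple stays at the bottom, while Nvidia's lead in absolute score no longer determines its place in the list.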
The headline this week: Meta's Llama 5 HumanEval results have driven the largest single-week benchmark gain in our index (+1.8 points). Apple continues to slide; its on-device model constraint creates a structural ceiling that no engineering can fully overcome. Alphabet is flat, but a strong Gemini Ultra release could quickly close its 1.2-point gap with Microsoft.
Next week we publish the methodology for Dimension 2, AI Capital Expenditure, which accounts for 25% of the total score. It is the best leading indicator of competitive position six to twelve months from now.