value rankings

Intelligence Per Dollar

Benchmark points per blended dollar across every credibly ranked model. Scores are min-max normalized across 18 models with 4+ independent benchmarks; cost assumes a typical 3:1 input:output token mix. Leader = 100.

This is not a quality ranking. #1 here means the most benchmark points per dollar, so a mid-scoring budget model will outrank a frontier model that costs 100x more. Check the Score and % of top score columns for raw capability, and use the task rankings when output quality is what compounds in your workflow.

Value leaderboard

#ModelScore% of top scoreIn / Out per 1MBlended $/1MValue
1DeepSeek: DeepSeek V4 Flash BEST VALUEdeepseek/deepseek-v4-flash75.179%$0.09 / $0.18$0.11100.0
2OpenAI: gpt-oss-120bopenai/gpt-oss-120b42.344%$0.04 / $0.18$0.0785.3
3OpenAI: gpt-oss-20bopenai/gpt-oss-20b23.825%$0.03 / $0.14$0.0662.8
4Google: Gemma 4 26B A4B google/gemma-4-26b-a4b-it51.053%$0.06 / $0.33$0.1359.9
5Google: Gemma 4 31Bgoogle/gemma-4-31b-it56.459%$0.12 / $0.35$0.1847.6
6DeepSeek: DeepSeek V3deepseek/deepseek-chat79.183%$0.20 / $0.80$0.3533.8
7DeepSeek: DeepSeek V4 Prodeepseek/deepseek-v4-pro82.586%$0.43 / $0.87$0.5422.7
8Qwen: Qwen3.5 397B A17Bqwen/qwen3.5-397b-a17b63.166%$0.39 / $2.45$0.9010.5
9MoonshotAI: Kimi K2.6moonshotai/kimi-k2.677.481%$0.66 / $3.50$1.378.5
10Meta: Llama 4 Maverickmeta-llama/llama-4-maverick13.915%$0.15 / $0.60$0.267.9
11Z.ai: GLM 5.2z-ai/glm-5.289.794%$1.00 / $4.00$1.757.7
12Z.ai: GLM 5.1z-ai/glm-5.175.379%$0.98 / $3.08$1.507.5
13Google: Gemini 3.1 Pro Previewgoogle/gemini-3.1-pro-preview76.580%$2.00 / $12.00$4.502.5
14OpenAI: GPT-5.4openai/gpt-5.490.495%$2.50 / $15.00$5.622.4
15Anthropic: Claude Sonnet 4.6anthropic/claude-sonnet-4.682.286%$3.00 / $15.00$6.002.1
16Anthropic: Claude Opus 4.8anthropic/claude-opus-4.895.5100%$5.00 / $25.00$10.001.4
17Anthropic: Claude Opus 4.7anthropic/claude-opus-4.789.794%$5.00 / $25.00$10.001.3
18OpenAI: GPT-5.5openai/gpt-5.593.798%$5.00 / $30.00$11.251.2

Budget champions : 80+ score, cheapest first

#ModelScore% of top scoreIn / Out per 1MBlended $/1MValue
1DeepSeek: DeepSeek V4 Prodeepseek/deepseek-v4-pro82.586%$0.43 / $0.87$0.5422.7
2Z.ai: GLM 5.2z-ai/glm-5.289.794%$1.00 / $4.00$1.757.7
3OpenAI: GPT-5.4openai/gpt-5.490.495%$2.50 / $15.00$5.622.4
4Anthropic: Claude Sonnet 4.6anthropic/claude-sonnet-4.682.286%$3.00 / $15.00$6.002.1
5Anthropic: Claude Opus 4.8anthropic/claude-opus-4.895.5100%$5.00 / $25.00$10.001.4
6Anthropic: Claude Opus 4.7anthropic/claude-opus-4.789.794%$5.00 / $25.00$10.001.3
7OpenAI: GPT-5.5openai/gpt-5.593.798%$5.00 / $30.00$11.251.2

Assumptions

Value = blended benchmark score divided by blended price per million tokens, indexed to the leader. A 3:1 input:output ratio fits most chat and RAG workloads; estimate your exact mix with the cost calculator. Scoring details in the methodology. Models with fewer than 4 independent benchmarks are excluded rather than guessed.