
Top picks for Self-Hosted / Local (2026)

Open-weights models you can run yourself. Ranked from 352 live models on the OpenRouter catalog, weighted for low cost.

What this is: a capability-matched shortlist, not a benchmark-tested winner. Models are scored by how well their declared specs (structured output, reasoning, context, modality, price) fit the Self-Hosted / Local use case. Cross-check against benchmark sources such as Artificial Analysis or LMSys Arena before you ship.
| # | Model | Model ID | Score | In / 1M | Out / 1M | Context |
|---|-------|----------|-------|---------|----------|---------|
| 1 | Google: Gemma 4 26B A4B (free) | google/gemma-4-26b-a4b-it:free | 118 | Free | Free | 262,144 |
| 2 | Google: Gemma 4 31B (free) | google/gemma-4-31b-it:free | 118 | Free | Free | 262,144 |
| 3 | Google: Gemma 4 26B A4B | google/gemma-4-26b-a4b-it | 117 | $0.06 | $0.33 | 262,144 |
| 4 | Qwen: Qwen3.5-9B | qwen/qwen3.5-9b | 117 | $0.10 | $0.15 | 262,144 |
| 5 | Qwen: Qwen3.5-Flash | qwen/qwen3.5-flash-02-23 | 117 | $0.07 | $0.26 | 1,000,000 |
| 6 | ByteDance Seed: Seed 1.6 Flash | bytedance-seed/seed-1.6-flash | 117 | $0.07 | $0.30 | 262,144 |
| 7 | OpenAI: GPT-5 Nano | openai/gpt-5-nano | 117 | $0.05 | $0.40 | 400,000 |
| 8 | Google: Gemini 2.0 Flash Lite | google/gemini-2.0-flash-lite-001 | 117 | $0.07 | $0.30 | 1,048,576 |
| 9 | Google: Gemma 4 31B | google/gemma-4-31b-it | 117 | $0.13 | $0.38 | 262,144 |
| 10 | Mistral: Mistral Small 4 | mistralai/mistral-small-2603 | 117 | $0.15 | $0.60 | 262,144 |
| 11 | ByteDance Seed: Seed-2.0-Mini | bytedance-seed/seed-2.0-mini | 117 | $0.10 | $0.40 | 262,144 |
| 12 | xAI: Grok 4.1 Fast | x-ai/grok-4.1-fast | 117 | $0.20 | $0.50 | 2,000,000 |
| 13 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google/gemini-2.5-flash-lite-preview-09-2025 | 117 | $0.10 | $0.40 | 1,048,576 |
| 14 | xAI: Grok 4 Fast | x-ai/grok-4-fast | 117 | $0.20 | $0.50 | 2,000,000 |
| 15 | Google: Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite | 117 | $0.10 | $0.40 | 1,048,576 |
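
The In / 1M and Out / 1M columns are prices in USD per one million tokens, so the cost of a single request depends on its mix of input and output tokens. The sketch below works through that arithmetic for three rows of the table. The function names `request_cost` and `cheapest` and the token counts are illustrative assumptions, not part of the listing.

```python
# USD per 1M tokens as (input, output), copied from three rows of the table.
PRICES_PER_MILLION = {
    "google/gemma-4-26b-a4b-it": (0.06, 0.33),
    "qwen/qwen3.5-9b": (0.10, 0.15),
    "openai/gpt-5-nano": (0.05, 0.40),
}


def request_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-1M prices."""
    price_in, price_out = PRICES_PER_MILLION[model_id]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Return the model ID with the lowest cost for this token mix."""
    return min(
        PRICES_PER_MILLION,
        key=lambda model_id: request_cost(model_id, input_tokens, output_tokens),
    )
```

For example, a request with 10,000 input tokens and 2,000 output tokens on `google/gemma-4-26b-a4b-it` costs about $0.00126. Because the input and output prices differ per model, the cheapest option can change with the workload: an output-heavy request (2,000 in, 10,000 out) favors `qwen/qwen3.5-9b` among these three.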

How we ranked these

For Self-Hosted / Local, we weight models toward low cost. A higher score means a better fit. Scores combine each model's public metadata (context length, modality support, tool calling, structured output, reasoning capability) with live pricing. See the full methodology for details.
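
To make the idea of a cost-weighted fit score concrete, here is a minimal sketch. It is not the site's actual formula, which this page does not state: the weights, the `fit_score` function, the `ModelSpec` fields, and the example model IDs are all hypothetical, chosen only to show how capability flags, context length, and price could be combined so that cheaper models rank higher.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    model_id: str
    price_in: float    # USD per 1M input tokens (0.0 for free models)
    price_out: float   # USD per 1M output tokens
    context: int       # context window in tokens
    structured_output: bool
    tool_calling: bool
    reasoning: bool


# Hypothetical weights; the real methodology is not given on this page.
CAPABILITY_POINTS = 10.0   # points per supported capability flag
COST_WEIGHT = 50.0         # maximum points awarded for low price
CONTEXT_WEIGHT = 10.0      # maximum points awarded for context length
CONTEXT_CAP = 1_000_000    # context beyond this earns no extra points


def fit_score(m: ModelSpec) -> float:
    """Combine capability flags, context, and price; higher is better."""
    capability = CAPABILITY_POINTS * sum(
        [m.structured_output, m.tool_calling, m.reasoning]
    )
    blended_price = (m.price_in + m.price_out) / 2
    cost_term = COST_WEIGHT / (1.0 + blended_price)  # free models get the full weight
    context_term = CONTEXT_WEIGHT * min(m.context, CONTEXT_CAP) / CONTEXT_CAP
    return capability + cost_term + context_term


def rank(models: list[ModelSpec]) -> list[ModelSpec]:
    """Sort models from best to worst fit."""
    return sorted(models, key=fit_score, reverse=True)
```

Under these assumed weights, two models with identical capabilities and context differ only in the cost term, so the free one ranks first. That is consistent with the free Gemma variants leading the table, although the real scoring may differ.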

Related tasks