Best AI model for Real-Time Chat (2026)

Models suited to sub-second responses, ranked from 346 live models in the OpenRouter catalog and weighted for low latency and low cost.

| # | Model | Model ID | Score | Input / 1M tokens | Output / 1M tokens | Context (tokens) |
|---|-------|----------|-------|-------------------|--------------------|------------------|
| 1 | Google: Gemma 4 26B A4B (free) | google/gemma-4-26b-a4b-it:free | 118 | Free | Free | 262,144 |
| 2 | Google: Gemma 4 26B A4B | google/gemma-4-26b-a4b-it | 118 | $0.07 | $0.34 | 262,144 |
| 3 | Google: Gemma 4 31B (free) | google/gemma-4-31b-it:free | 118 | Free | Free | 262,144 |
| 4 | Qwen: Qwen3.5-9B | qwen/qwen3.5-9b | 118 | $0.10 | $0.15 | 262,144 |
| 5 | ByteDance Seed: Seed-2.0-Mini | bytedance-seed/seed-2.0-mini | 118 | $0.10 | $0.40 | 262,144 |
| 6 | Qwen: Qwen3.5-Flash | qwen/qwen3.5-flash-02-23 | 118 | $0.07 | $0.26 | 1,000,000 |
| 7 | ByteDance Seed: Seed 1.6 Flash | bytedance-seed/seed-1.6-flash | 118 | $0.07 | $0.30 | 262,144 |
| 8 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google/gemini-2.5-flash-lite-preview-09-2025 | 118 | $0.10 | $0.40 | 1,048,576 |
| 9 | OpenAI: GPT-5 Nano | openai/gpt-5-nano | 118 | $0.05 | $0.40 | 400,000 |
| 10 | Google: Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite | 118 | $0.10 | $0.40 | 1,048,576 |
| 11 | OpenAI: GPT-4.1 Nano | openai/gpt-4.1-nano | 118 | $0.10 | $0.40 | 1,047,576 |
| 12 | Google: Gemini 2.0 Flash Lite | google/gemini-2.0-flash-lite-001 | 118 | $0.07 | $0.30 | 1,048,576 |
| 13 | Google: Gemini 2.0 Flash | google/gemini-2.0-flash-001 | 118 | $0.10 | $0.40 | 1,048,576 |
| 14 | Google: Gemma 4 31B | google/gemma-4-31b-it | 117 | $0.13 | $0.38 | 262,144 |
| 15 | OpenAI: GPT-5.4 Nano | openai/gpt-5.4-nano | 117 | $0.20 | $1.25 | 400,000 |
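
The per-1M-token prices above translate into a per-message cost with simple arithmetic. The sketch below is illustrative only: the function name `chat_turn_cost` and the token counts (1,200 input, 300 output) are assumptions chosen for the example, and the prices are the ones listed for google/gemma-4-26b-a4b-it.

```python
def chat_turn_cost(input_tokens: int, output_tokens: int,
                   in_price_per_m: float, out_price_per_m: float) -> float:
    """Return the USD cost of one chat turn, given prices per 1M tokens."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000


# Hypothetical turn: 1,200 prompt tokens in, 300 completion tokens out,
# priced at $0.07 / 1M input and $0.34 / 1M output (from the list above).
cost = chat_turn_cost(1200, 300, 0.07, 0.34)
print(f"${cost:.6f} per turn")
```

At these rates, the paid models in the list cost a small fraction of a cent per turn for typical chat messages, so price differences between them only become significant at high message volume.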

How we ranked these

For Real-Time Chat, we weight models on low latency and low cost. A higher score is better. Scores combine OpenRouter's model metadata (context length, modality support, tool calling, structured output, reasoning capability) with public pricing. See the full methodology for details.
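
The exact formula is not given on this page, so the following is only a rough sketch of how a score might combine pricing with metadata flags. Everything in it is an assumption made for illustration: the `Model` fields, the weights `w_cost` and `w_feat`, the `ceiling` price, and the two example models are hypothetical and do not reproduce the scores in the list above.

```python
from dataclasses import dataclass


@dataclass
class Model:
    model_id: str
    input_price: float       # USD per 1M input tokens
    output_price: float      # USD per 1M output tokens
    tool_calling: bool       # metadata flag (hypothetical values below)
    structured_output: bool  # metadata flag (hypothetical values below)


def cost_score(m: Model, ceiling: float = 2.0) -> float:
    """Map the blended price to [0, 1]; cheaper models score closer to 1."""
    blended = (m.input_price + m.output_price) / 2
    return max(0.0, 1.0 - blended / ceiling)


def feature_score(m: Model) -> float:
    """Fraction of the tracked metadata features the model supports."""
    return (m.tool_calling + m.structured_output) / 2


def score(m: Model, w_cost: float = 0.7, w_feat: float = 0.3) -> float:
    """Weighted sum scaled to 0-100; the weights are assumed, not published."""
    return 100 * (w_cost * cost_score(m) + w_feat * feature_score(m))


# Two made-up catalog entries, ranked from highest score to lowest.
cheap = Model("example/cheap-model", 0.10, 0.40, True, True)
pricey = Model("example/pricey-model", 1.00, 3.00, True, True)
ranked = sorted([pricey, cheap], key=score, reverse=True)
print([m.model_id for m in ranked])
```

With a cost-heavy weighting like this one, two models with identical features are separated only by price, which is consistent with low-cost models clustering at the top of the list.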