Best AI model for Real-Time Chat (2026)

Models tuned for sub-second response. Ranked from 346 live models in the OpenRouter catalog, weighted for low latency and low cost.

#	Model	Model ID	Score	Input / 1M tokens	Output / 1M tokens	Context (tokens)
1	Google: Gemma 4 26B A4B (free)	google/gemma-4-26b-a4b-it:free	118	Free	Free	262,144
2	Google: Gemma 4 26B A4B	google/gemma-4-26b-a4b-it	118	$0.07	$0.34	262,144
3	Google: Gemma 4 31B (free)	google/gemma-4-31b-it:free	118	Free	Free	262,144
4	Qwen: Qwen3.5-9B	qwen/qwen3.5-9b	118	$0.10	$0.15	262,144
5	ByteDance Seed: Seed-2.0-Mini	bytedance-seed/seed-2.0-mini	118	$0.10	$0.40	262,144
6	Qwen: Qwen3.5-Flash	qwen/qwen3.5-flash-02-23	118	$0.07	$0.26	1,000,000
7	ByteDance Seed: Seed 1.6 Flash	bytedance-seed/seed-1.6-flash	118	$0.07	$0.30	262,144
8	Google: Gemini 2.5 Flash Lite Preview 09-2025	google/gemini-2.5-flash-lite-preview-09-2025	118	$0.10	$0.40	1,048,576
9	OpenAI: GPT-5 Nano	openai/gpt-5-nano	118	$0.05	$0.40	400,000
10	Google: Gemini 2.5 Flash Lite	google/gemini-2.5-flash-lite	118	$0.10	$0.40	1,048,576
11	OpenAI: GPT-4.1 Nano	openai/gpt-4.1-nano	118	$0.10	$0.40	1,047,576
12	Google: Gemini 2.0 Flash Lite	google/gemini-2.0-flash-lite-001	118	$0.07	$0.30	1,048,576
13	Google: Gemini 2.0 Flash	google/gemini-2.0-flash-001	118	$0.10	$0.40	1,048,576
14	Google: Gemma 4 31B	google/gemma-4-31b-it	117	$0.13	$0.38	262,144
15	OpenAI: GPT-5.4 Nano	openai/gpt-5.4-nano	117	$0.20	$1.25	400,000
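
To make the two price columns concrete, here is a minimal sketch of how a per-request cost could be estimated from them, assuming the prices are US dollars per one million tokens. The function name `request_cost` and the token counts (800 in, 200 out) are hypothetical values chosen for illustration; only the two prices come from the table, from the row for openai/gpt-5-nano.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Estimate the cost in USD of one request from per-million-token prices."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000


# Prices copied from the table row for openai/gpt-5-nano.
# The token counts are made up for illustration.
cost = request_cost(input_tokens=800, output_tokens=200,
                    in_price_per_m=0.05, out_price_per_m=0.40)
print(f"${cost:.6f}")  # prints $0.000120
```

In a chat product the input side usually grows with every turn because the conversation history is resent, so the input price can matter more than a single-turn estimate suggests.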

How we ranked these

For Real-Time Chat, we weight models on low latency and low cost. A higher score is better. Scores combine OpenRouter's model metadata (context length, modality support, tool calling, structured output, reasoning capability) with public pricing.
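
The exact scoring formula is not given here, so the following is only a toy sketch of what combining metadata with pricing might look like. It is not the method used to produce the scores in the table. The weights `PRICE_WEIGHT`, `CONTEXT_BONUS`, and `CONTEXT_THRESHOLD`, and the function `toy_score`, are hypothetical; the prices and context lengths are copied from four table rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str
    in_price: float   # USD per 1M input tokens
    out_price: float  # USD per 1M output tokens
    context: int      # context window in tokens


# Hypothetical weights, chosen only for illustration.
PRICE_WEIGHT = 20.0
CONTEXT_BONUS = 2.0
CONTEXT_THRESHOLD = 1_000_000


def toy_score(m: Model) -> float:
    """Higher is better: penalize price, reward a very large context window."""
    blended_price = (m.in_price + m.out_price) / 2
    bonus = CONTEXT_BONUS if m.context >= CONTEXT_THRESHOLD else 0.0
    return 100.0 - PRICE_WEIGHT * blended_price + bonus


models = [
    Model("openai/gpt-5.4-nano", 0.20, 1.25, 400_000),
    Model("google/gemma-4-26b-a4b-it:free", 0.0, 0.0, 262_144),
    Model("openai/gpt-5-nano", 0.05, 0.40, 400_000),
    Model("google/gemini-2.5-flash-lite", 0.10, 0.40, 1_048_576),
]

# Sort best-first by the toy score.
ranked = sorted(models, key=toy_score, reverse=True)
for m in ranked:
    print(f"{toy_score(m):6.1f}  {m.model_id}")
```

A real scorer would also need terms for the other metadata fields listed above (modality support, tool calling, structured output, reasoning capability); they are left out because their form is not stated.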