Best AI model for Transcription (2026)
Speech-to-text accuracy and speed. Ranked from 343 live models on the OpenRouter catalog, weighted for audio input and low latency.
| # | Model | Model ID | Score | Input / 1M tokens | Output / 1M tokens | Context (tokens) |
|---|---|---|---|---|---|---|
| 1 | Xiaomi: MiMo-V2-Omni | `xiaomi/mimo-v2-omni` | 123 | $0.40 | $2.00 | 262,144 |
| 2 | Google: Gemini 3.1 Flash Lite Preview | `google/gemini-3.1-flash-lite-preview` | 123 | $0.25 | $1.50 | 1,048,576 |
| 3 | Google: Gemini 3 Flash Preview | `google/gemini-3-flash-preview` | 123 | $0.50 | $3.00 | 1,048,576 |
| 4 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | `google/gemini-2.5-flash-lite-preview-09-2025` | 123 | $0.10 | $0.40 | 1,048,576 |
| 5 | Google: Gemini 2.5 Flash Lite | `google/gemini-2.5-flash-lite` | 123 | $0.10 | $0.40 | 1,048,576 |
| 6 | Google: Gemini 2.5 Flash | `google/gemini-2.5-flash` | 123 | $0.30 | $2.50 | 1,048,576 |
| 7 | Google: Gemini 2.0 Flash Lite | `google/gemini-2.0-flash-lite-001` | 123 | $0.07 | $0.30 | 1,048,576 |
| 8 | Google: Gemini 2.0 Flash | `google/gemini-2.0-flash-001` | 123 | $0.10 | $0.40 | 1,000,000 |
| 9 | Google: Gemini 3.1 Pro Preview Custom Tools | `google/gemini-3.1-pro-preview-customtools` | 115 | $2.00 | $12.00 | 1,048,576 |
| 10 | Google: Gemini 3.1 Pro Preview | `google/gemini-3.1-pro-preview` | 115 | $2.00 | $12.00 | 1,048,576 |
| 11 | Google: Gemini 2.5 Pro | `google/gemini-2.5-pro` | 115 | $1.25 | $10.00 | 1,048,576 |
| 12 | Google: Gemini 2.5 Pro Preview 06-05 | `google/gemini-2.5-pro-preview` | 115 | $1.25 | $10.00 | 1,048,576 |
| 13 | Google: Gemini 2.5 Pro Preview 05-06 | `google/gemini-2.5-pro-preview-05-06` | 115 | $1.25 | $10.00 | 1,048,576 |
| 14 | OpenAI: GPT Audio Mini | `openai/gpt-audio-mini` | 112 | $0.60 | $2.40 | 128,000 |
| 15 | Mistral: Voxtral Small 24B 2507 | `mistralai/voxtral-small-24b-2507` | 101 | $0.10 | $0.30 | 32,000 |
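To compare rows by what a job would actually cost, the per-million-token prices in the table can be turned into a rough estimate. The sketch below is illustrative only. The function name `estimate_cost` and the token counts are assumptions, and it assumes the listed input price applies to audio tokens, which may not hold for every model or provider. How many tokens a minute of audio consumes also varies by model and is not stated on this page.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the table above.
PRICES = {
    "google/gemini-2.5-flash-lite": (Decimal("0.10"), Decimal("0.40")),
    "openai/gpt-audio-mini": (Decimal("0.60"), Decimal("2.40")),
    "mistralai/voxtral-small-24b-2507": (Decimal("0.10"), Decimal("0.30")),
}

ONE_MILLION = Decimal(1_000_000)


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one job (hypothetical helper)."""
    price_in, price_out = PRICES[model_id]
    return (price_in * input_tokens + price_out * output_tokens) / ONE_MILLION


# Assumed workload: 500,000 input tokens of audio, 20,000 output tokens of text.
for model_id in PRICES:
    print(model_id, estimate_cost(model_id, 500_000, 20_000))
```

`Decimal` is used instead of `float` so that small per-token prices add up without rounding drift.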
How we ranked these
For transcription, we weight models on audio input and low latency. Higher scores are better. Scores combine OpenRouter's model metadata (context length, modality support, tool calling, structured output, reasoning capability) with public pricing.
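The exact formula behind the scores is not given here, so the following is only a toy sketch of how metadata and pricing could be combined into a single number. Every name, threshold, and point value in it (`ModelInfo`, `toy_score`, the 50/10/20 points) is hypothetical and will not reproduce the scores in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    model_id: str
    input_modalities: tuple[str, ...]
    context_length: int        # tokens
    price_in_per_1m: float     # USD per 1M input tokens


# Hypothetical point values, chosen only for illustration.
AUDIO_POINTS = 50
LONG_CONTEXT_POINTS = 10
LOW_PRICE_POINTS = 20


def toy_score(m: ModelInfo) -> int:
    """Add points for each property a transcription ranking might reward."""
    points = 0
    if "audio" in m.input_modalities:
        points += AUDIO_POINTS
    if m.context_length >= 1_000_000:
        points += LONG_CONTEXT_POINTS
    if m.price_in_per_1m <= 0.50:
        points += LOW_PRICE_POINTS
    return points


def rank(models: list[ModelInfo]) -> list[ModelInfo]:
    """Sort models from highest to lowest toy score."""
    return sorted(models, key=toy_score, reverse=True)


# Made-up entries, not real catalog rows.
candidates = [
    ModelInfo("example/text-only", ("text",), 128_000, 0.60),
    ModelInfo("example/audio-long", ("text", "audio"), 1_048_576, 0.10),
]
for m in rank(candidates):
    print(m.model_id, toy_score(m))
```

A point-based scheme like this produces many ties, which is consistent with several rows sharing a score, though the real weighting may work differently.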