Google Gemini Pro Latest
Google Gemini Pro Latest is a multimodal model from Google that accepts text, images, audio, video, and files as input and produces text output. It supports tool use and reasoning, offers a context window of 1,048,576 tokens, and can produce up to 65,536 tokens per response. Structured output is listed as supported in the specifications. At $2.00 per million input tokens and $12.00 per million output tokens, it sits in a mid-to-upper pricing tier. The main caveat for comparison shoppers is that the model currently has no independent benchmark coverage, so its quality relative to competitors has not been verified by third-party testing. It is worth shortlisting for teams that specifically need long-context multimodal processing across diverse file types. Buyers who require validated performance data before committing budget should treat it as unproven until benchmark results become available.
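As a rough illustration of the pricing arithmetic above, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost_usd` and the constant names are hypothetical. It assumes flat per-token rates with no tiered pricing, caching discounts, or other fees, and it assumes any reasoning tokens are already counted in the output total. The listing confirms none of these.

```python
# Figures taken from the listing above; billing details are assumptions.
INPUT_USD_PER_MILLION = 2.00      # $ per 1,000,000 input tokens
OUTPUT_USD_PER_MILLION = 12.00    # $ per 1,000,000 output tokens
CONTEXT_WINDOW_TOKENS = 1_048_576
MAX_OUTPUT_TOKENS = 65_536


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming flat per-token rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("input exceeds the listed context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed per-response maximum")

    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # A long-document request: 200,000 tokens in, 4,000 tokens out.
    print(f"${estimate_cost_usd(200_000, 4_000):.4f}")
```

Under these assumptions, a request that fills the whole context window and uses the maximum output length costs a little under $3, so per-request cost is dominated by input volume for long-context workloads.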
- Model ID: ~google/gemini-pro-latest
- Vendor: Google
- Tokenizer
- Router
- Input Modalities: audio, file, image, text, video
- Output Modalities: text
- Max Output: 65,536 tokens
- Tool Calling: ✓ supported
- Structured Output: ✓ supported
- Reasoning Mode: ✓ supported
- Vision: ✓ accepts images
- Audio: ✓ accepts audio
- Moderated: no
Category rankings
Where Google Gemini Pro Latest places across the 4 categories it ranks in.
| Rank | Category | Type | Score |
|---|---|---|---|
| #4 of 19 | TTS Replacement | Voice | 115 |
| #10 of 19 | Transcription | Voice | 115 |
| #10 of 19 | Audio Summarization | Voice | 139 |
| #22 of 25 | Video Summarization | Video | 139 |