~google

Google Gemini Pro Latest

Google Gemini Pro Latest is a multimodal model from Google that accepts text, images, audio, video, and files as input and produces text output. It supports tool use, structured output, and reasoning, offers a context window of 1,048,576 tokens, and can produce up to 65,536 tokens per response. At $2.00 per million input tokens and $12.00 per million output tokens, it sits in a mid-to-upper pricing tier.

The main caveat for comparison shoppers is that the model currently has no independent benchmark coverage, so its quality relative to competitors has not been verified by third-party testing. It is worth shortlisting for teams that specifically need long-context multimodal processing across diverse file types. Buyers who require validated performance data before committing budget should treat it as unproven until benchmark results become available.
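As a rough illustration of what the listed prices mean per request, the sketch below converts token counts into a dollar estimate. It assumes flat per-token billing at the two rates shown on this page; tiered pricing, caching discounts, or separately billed reasoning tokens are not described here and are not modeled. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Prices as listed on this page, in USD per 1,000,000 tokens.
INPUT_PRICE_PER_M = Decimal("2.00")
OUTPUT_PRICE_PER_M = Decimal("12.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: billing is linear in token count at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A request that fills the listed context window and the listed max output.
    print(estimate_cost(1_048_576, 65_536))
```

Under these assumptions, a request with 1,048,576 input tokens and 65,536 output tokens comes to about $2.88, most of it from the input side.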

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $2.00 per 1M tokens
Output Price: $12.00 per 1M tokens
Context Window: 1,048,576 tokens

Model ID: ~google/gemini-pro-latest
Vendor: ~google
Tokenizer: Router
Input Modalities: audio, file, image, text, video
Output Modalities: text
Max Output: 65,536 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: ✓ accepts audio
Moderated: no
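
The limits in the specification list can be checked before a request is sent. The following is a minimal sketch, not the provider's validation logic: the function name `check_request` is hypothetical, and it assumes conservatively that prompt and output tokens share the 1,048,576-token context window, which this page does not state.

```python
# Limits copied from the specification list on this page.
CONTEXT_WINDOW = 1_048_576
MAX_OUTPUT = 65_536
INPUT_MODALITIES = {"audio", "file", "image", "text", "video"}


def check_request(prompt_tokens: int, max_output_tokens: int, modalities: set) -> list:
    """Return a list of problems with a planned request (empty if none).

    Assumption: prompt and output tokens together must fit in the
    context window. The page lists the limits but not how they combine.
    """
    problems = []

    unsupported = set(modalities) - INPUT_MODALITIES
    if unsupported:
        problems.append(f"unsupported input modalities: {sorted(unsupported)}")

    if max_output_tokens > MAX_OUTPUT:
        problems.append(
            f"requested output {max_output_tokens} exceeds max output {MAX_OUTPUT}"
        )

    if prompt_tokens + max_output_tokens > CONTEXT_WINDOW:
        problems.append(
            f"prompt plus output exceeds context window {CONTEXT_WINDOW}"
        )

    return problems


if __name__ == "__main__":
    for problem in check_request(900_000, 70_000, {"text", "video"}):
        print(problem)
```

Prompt token counts would have to come from the provider's own tokenizer or token-counting facility, which this sketch leaves out.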

Strong choice for

Voice

TTS Replacement

Models that produce natural-sounding speech.

Category rankings

Where Google Gemini Pro Latest places across the 4 categories it ranks in.

Rank	Category	Group	Score
#4	TTS Replacement	Voice (of 19 ranked)	115
#10	Transcription	Voice (of 19 ranked)	115
#10	Audio Summarization	Voice (of 19 ranked)	139
#22	Video Summarization	Video (of 25 ranked)	139

Similar models

Google Gemini Flash Latest (~google)
$1.50 in / $9.00 out · 1,048,576 ctx · 100