Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)
Nano Banana 2 is a multimodal model from Google that accepts text and image inputs and returns text and image outputs. It supports reasoning and offers a 131,072-token context window with up to 32,768 completion tokens, which leaves headroom for longer tasks. It does not support tool calling; structured output is listed as supported. At $0.50 per million input tokens and $3.00 per million output tokens, it sits in the budget-to-mid range for image-capable models, making it worth considering for teams that need vision plus reasoning without paying premium rates. The significant caveat is that it has no independent benchmark coverage, so its performance relative to competitors is unproven. Buyers who need verified quality scores should treat it as an unknown quantity and run their own evaluations before deploying it in production.
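To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using the two rates and the completion cap quoted above. The helper name `estimate_cost_usd` is hypothetical, and the sketch assumes flat per-token billing; any separate per-image charges or provider-specific surcharges are not modeled.

```python
from decimal import Decimal

# Rates quoted in the listing (USD per one million tokens).
INPUT_PRICE_PER_MILLION = Decimal("0.50")
OUTPUT_PRICE_PER_MILLION = Decimal("3.00")

# Completion-token cap quoted in the listing.
MAX_COMPLETION_TOKENS = 32_768


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the token cost of one request in USD.

    Hypothetical helper: assumes flat per-token pricing only.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_COMPLETION_TOKENS:
        raise ValueError("output_tokens exceeds the 32,768-token completion cap")
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # Example request: 10,000 prompt tokens and 2,000 completion tokens.
    print(estimate_cost_usd(10_000, 2_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up exactly when summed over many calls.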
- Model ID: google/gemini-3.1-flash-image-preview
- Vendor: Google
- Tokenizer: Gemini
- Input Modalities: image, text
- Output Modalities: image, text
- Max Output: 32,768 tokens
- Tool Calling: not supported
- Structured Output: ✓ supported
- Reasoning Mode: ✓ supported
- Vision: ✓ accepts images
- Audio: no
- Moderated: no
Category rankings
Where Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) places in the one category in which it is ranked.
| # | Category | Score |
|---|---|---|
| #5 | Image Generation (Vision), of 8 ranked | 99 |