Google: Gemma 3 27B
Gemma 3 27B is a multimodal model from Google that accepts both text and image inputs, supports tool use, and offers a 131K-token context window with up to 16,384 tokens of output per response. It has no built-in reasoning mode. Structured output is listed as supported in the specifications below. The combination of vision input and tool support makes it suitable for agentic workflows that pair image understanding with function calling. At $0.08 per million input tokens and $0.16 per million output tokens, it is a low-cost option for budget-sensitive applications. Its benchmark picture is thin, however: a blended score of 19.9 across only three benchmarks leaves its real-world capability largely unproven relative to more thoroughly evaluated competitors. Buyers who need confirmed broad performance should treat that score cautiously and run their own evaluations before committing.
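To put the listed prices in concrete terms, here is a minimal sketch that estimates the cost of a single text request. Only the two per-million-token rates come from the description above; the function name `estimate_cost_usd` and the example token counts are illustrative assumptions, and the sketch does not model how image inputs are billed.

```python
# Listed rates in USD per one million tokens (taken from the description above).
INPUT_USD_PER_MTOK = 0.08
OUTPUT_USD_PER_MTOK = 0.16


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 100,000 prompt tokens, 16,384 completion tokens.
    print(f"${estimate_cost_usd(100_000, 16_384):.6f}")
```

Because output tokens cost twice as much as input tokens here, long generations dominate the bill sooner than long prompts do.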
- Model ID: google/gemma-3-27b-it
- Vendor: Google
- Tokenizer: Gemini
- Input Modalities: text, image
- Output Modalities: text
- Max Output: 16,384 tokens
- Tool Calling: ✓ supported
- Structured Output: ✓ supported
- Reasoning Mode: not supported
- Vision: ✓ accepts images
- Audio: no
- Moderated: no
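
The context window and output cap listed above together bound how much a single request can generate. The following sketch shows one way to pick a safe output budget before sending a request. It rests on two assumptions not stated in the listing: that "131K" means 131,072 tokens, and that prompt and output tokens share the same context window. The helper name `clamp_max_output` is hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 131_072  # assumption: "131K" read as 131,072 tokens
MAX_OUTPUT_TOKENS = 16_384       # from the listing


def clamp_max_output(prompt_tokens: int, requested_output: int) -> int:
    """Return the largest output budget that fits both listed limits.

    Assumes prompt and output tokens count against the same context window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    # The budget is limited by the request, the output cap, and the space left.
    return min(requested_output, MAX_OUTPUT_TOKENS, remaining)
```

For short prompts the 16,384-token output cap is the binding limit; only prompts within 16,384 tokens of the full window are constrained by the remaining context instead.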