meta-llama

Meta: Llama 4 Scout

Meta: Llama 4 Scout is a multimodal model from Meta that accepts both text and image inputs and supports tool use. Its headline feature is a 10 million token context window, one of the largest available, making it suited for workflows that require holding large documents, codebases, or conversation histories in a single session. It does not support native reasoning mode, and structured output support is unconfirmed. Maximum output is capped at 16,384 tokens per response. At $0.10 per million input tokens and $0.30 per million output tokens, Llama 4 Scout sits at the budget end of the market, which is its clearest advantage. However, its blended benchmark score of 6.1 across only three benchmarks leaves its general capability largely unproven relative to more thoroughly evaluated alternatives. Teams with cost-sensitive, high-volume workloads involving long context or image inputs may find it worth trialing, but those prioritizing demonstrated performance should treat the benchmark picture as thin and weigh that gap carefully.

Quality Score
99/100
price + capability + benchmarks
Input Price
$0.10
per 1M tokens
Output Price
$0.30
per 1M tokens
Context Window
10,000,000
tokens
Model ID
meta-llama/llama-4-scout
Vendor
meta-llama
Tokenizer
Llama4
Input Modalities
text, image
Output Modalities
text
Max Output
16,384 tokens
Tool Calling
✓ supported
Structured Output
✓ supported
Reasoning Mode
not supported
Vision
✓ accepts images
Audio
no
Moderated
no

Category rankings

Where Meta: Llama 4 Scout places across the 3 categories it ranks in. How we rank →

#CategoryScore
#23 Code CompletionCode · of 25 ranked 132
#23 Cheap Bulk InferenceCost · of 25 ranked 137
#25 Self-Hosted / LocalCost · of 25 ranked 117

Similar models