ByteDance: UI-TARS 7B

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it extends the UI-TARS framework with reinforcement...

Quality Score: 75/100 (composite of price, context, capability)
Input Price: $0.10 per 1M tokens
Output Price: $0.20 per 1M tokens
Context Window: 128,000 tokens
Model ID: bytedance/ui-tars-1.5-7b
Vendor: bytedance
Tokenizer: Other
Input Modalities: image, text
Output Modalities: text
Max Output: 2,048 tokens
Tool Calling: not supported
Structured Output: not supported
Reasoning Mode: not supported
Vision: ✓ accepts images
Audio: no
Moderated: no
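
To make the pricing and limits above concrete, here is a minimal Python sketch that estimates the cost of one request from the listed per-token prices. The function name `estimate_cost` and the constants are illustrative and not part of any provider API. The sketch assumes the 128,000-token context window covers input and output combined, and that token counts (including however a provider counts image inputs) are already known; neither point is confirmed by the listing.

```python
from decimal import Decimal

# Values copied from the listing; names are illustrative only.
INPUT_PRICE_PER_M = Decimal("0.10")   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = Decimal("0.20")  # USD per 1M output tokens
CONTEXT_WINDOW = 128_000              # tokens
MAX_OUTPUT = 2_048                    # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of a single request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT:
        raise ValueError("output exceeds the listed max output")
    # Assumption: the context window bounds input + output together.
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the listed context window")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 100,000 input tokens and 2,000 output tokens.
    print(estimate_cost(100_000, 2_000))
```

Under these assumptions, a request with 100,000 input tokens and 2,000 output tokens comes to about $0.0104. `Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding error.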