OpenAI: GPT Audio Mini

GPT Audio Mini is OpenAI's lower-tier audio-capable model. It accepts text and audio as input, returns text and audio as output, and supports tool calling. It has a 128,000-token context window and generates up to 16,384 completion tokens per response. Reasoning modes are not supported. The listing marks structured output as supported, but that support is not confirmed here, so verify it before depending on it. Pricing is $0.60 per million input tokens and $2.40 per million output tokens.

For teams building voice-driven or audio-processing workflows on a tight budget, native audio input and tool support make this model worth considering alongside pricier alternatives. The tradeoff is transparency: no independent benchmark coverage is available to compare its quality against competitors, so treat any shortlisting as tentative until you have run task-specific evaluations of your own. The feature set is potentially useful, but the performance profile is currently unproven.
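To make the listed pricing concrete, here is a minimal cost sketch in Python. It assumes the flat per-token rates quoted above apply to every token in a request; audio tokens may be billed differently in practice, so treat the result as a rough estimate. The function name `estimate_cost` and the workload figures are hypothetical and chosen only for illustration.

```python
# Listed prices from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.60
OUTPUT_PRICE_PER_M = 2.40


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates.

    Assumption: one flat rate per direction, regardless of modality.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_micro_usd = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 requests of 2,000 input and 500 output tokens.
    per_request = estimate_cost(2_000, 500)
    print(f"per request: ${per_request:.4f}")
    print(f"10,000 requests: ${per_request * 10_000:.2f}")
```

Because output tokens cost four times as much as input tokens, response length tends to dominate the bill for short prompts.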

Quality Score: 89/100 (price + capability + benchmarks)
Input Price: $0.60 per 1M tokens
Output Price: $2.40 per 1M tokens
Context Window: 128,000 tokens
Model ID: openai/gpt-audio-mini
Vendor: openai
Tokenizer: GPT
Input Modalities: text, audio
Output Modalities: text, audio
Max Output: 16,384 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: not supported
Vision: text only
Audio: ✓ accepts audio
Moderated: yes
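
The context window and max output figures above can be combined into a simple pre-flight check. The sketch below assumes, as is common but not stated on this page, that prompt and completion tokens share the 128,000-token window. The helper name `max_completion_budget` is hypothetical.

```python
CONTEXT_WINDOW = 128_000  # tokens, from the listing
MAX_OUTPUT = 16_384       # tokens, from the listing


def max_completion_budget(prompt_tokens: int) -> int:
    """Return the largest completion limit that fits alongside the prompt.

    Assumption: prompt and completion tokens share one context window.
    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))
```

For prompts under 111,616 tokens the cap is the 16,384-token output limit; beyond that, the remaining window becomes the binding constraint.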

Category rankings

Where OpenAI: GPT Audio Mini places across the 3 categories it ranks in.

#          Category                     Score
#17 of 19  Transcription (Voice)        112
#17 of 19  Audio Summarization (Voice)  109
#17 of 19  TTS Replacement (Voice)      104

Similar models