openai

OpenAI: GPT Audio Mini

GPT Audio Mini is OpenAI's lower-tier audio-capable model, accepting both text and audio as inputs and supporting tool use. It works with a 128,000-token context window and produces up to 16,384 completion tokens per response. It does not support reasoning modes, and structured output support is not confirmed. Pricing sits at $0.60 per million input tokens and $2.40 per million output tokens. For teams building voice-driven or audio-processing workflows on a tighter budget, this model's native audio input and tool support make it worth considering alongside pricier alternatives. The tradeoff is transparency: there is no independent benchmark coverage to evaluate its quality against competitors, so any shortlisting should be treated as tentative until you run task-specific evaluations of your own. Budget-conscious buyers get a potentially useful feature set, but go in knowing the performance profile is currently unproven.

Query via API → View on openai → Estimate cost

Quality Score

89/100

price + capability + benchmarks

Input Price

$0.60

per 1M tokens

Output Price

$2.40

per 1M tokens

Context Window

128,000

tokens

Model ID: openai/gpt-audio-mini
Vendor: openai
Tokenizer: GPT
Input Modalities: text, audio
Output Modalities: text, audio
Max Output: 16,384 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: not supported
Vision: text only
Audio: ✓ accepts audio
Moderated: yes

Category rankings

Where OpenAI: GPT Audio Mini places across the 3 categories it ranks in. How we rank →

#	Category	Score
#17	TranscriptionVoice · of 19 ranked	112
#17	Audio SummarizationVoice · of 19 ranked	109
#17	TTS ReplacementVoice · of 19 ranked	104

Similar models

openai

OpenAI: GPT Audio Mini

Category rankings

Similar models

OpenAI: GPT-4o (2024-11-20)

OpenAI: GPT-4o (2024-08-06)

OpenAI: GPT-4o

OpenAI: GPT-5 Image

OpenAI: GPT-5.1 Chat

OpenAI: GPT-5.4 Image 2