openai/gpt-5-mini
About
A faster version of OpenAI's flagship GPT-5 model.

Example Output
Prompt:
"What is the point of life?"
Output:
Me grunt: big question! Caveman think simple.
- Live to survive: find food, water, shelter, stay safe.
- Live to care for tribe: raise young, help others, keep group strong.
- Live to enjoy small things: fire warmth, good hunt, laughter, story by fire.
- Some think there is bigger meaning: spirits, ancestors, making world better for those after.
No single answer fit every person. You choose purpose by what makes you feel alive and what helps others. What matters most to you?
Performance Metrics
- Prediction Time: 1.70s
- Total Time: 1.71s
All Input Parameters
```json
{
  "prompt": "What is the point of life?",
  "messages": [],
  "verbosity": "medium",
  "image_input": [],
  "system_prompt": "You are a caveman",
  "reasoning_effort": "minimal"
}
```
Input Parameters
- prompt: The prompt to send to the model. Do not use if using messages.
- messages: A JSON string representing a list of messages, for example: [{"role": "user", "content": "Hello, how are you?"}]. If provided, prompt and system_prompt are ignored.
- verbosity: Constrains the verbosity of the model's response. Supported values are low, medium, and high: lower values produce more concise responses, higher values more comprehensive ones. GPT-5 supports this parameter to help control whether answers are short and to the point or long and detailed.
- image_input: List of images to send to the model.
- system_prompt: System prompt to set the assistant's behavior.
- reasoning_effort: Constrains effort on reasoning for GPT-5 models. Supported values are minimal, low, medium, and high. The minimal value returns answers faster without extensive reasoning first, and reducing reasoning effort yields faster responses with fewer tokens spent on reasoning. At higher reasoning efforts you may need to increase max_completion_tokens to avoid empty responses (where all the tokens are used on reasoning).
- max_completion_tokens: Maximum number of completion tokens to generate. At higher reasoning efforts you may need to increase this to avoid empty responses.
Example Execution Logs
- Input token count: 22
- Output token count: 117
- Total token count: 139
- TTFT: 0.95s
Version Details
- Version ID: 22af442f494f78bd1c1d540644577e0282facb50219c59e9b6b5cf97f2d4b9d5
- Version Created: September 17, 2025