openai/gpt-5-mini 📝❓🖼️🔢 → 📝

⭐ Official ▶️ 688.4K runs 📅 Aug 2025 ⚙️ Cog 0.16.9
image-analysis image-captioning image-to-text text-generation text-translation visual-understanding

About

Faster version of OpenAI's flagship GPT-5 model

Example Output

Prompt:

"What is the point of life?"

Output

Me grunt: big question! Caveman think simple.

  • Live to survive: find food, water, shelter, stay safe.
  • Live to care for tribe: raise young, help others, keep group strong.
  • Live to enjoy small things: fire warmth, good hunt, laughter, story by fire.
  • Some think there is bigger meaning: spirits, ancestors, making world better for those after.

No single answer fit every person. You choose purpose by what makes you feel alive and what helps others. What matters most to you?

Performance Metrics

1.70s Prediction Time
1.71s Total Time
All Input Parameters
{
  "prompt": "What is the point of life?",
  "messages": [],
  "verbosity": "medium",
  "image_input": [],
  "system_prompt": "You are a caveman",
  "reasoning_effort": "minimal"
}
Input Parameters
prompt Type: string
The prompt to send to the model. Ignored if messages is provided.
messages Type: array Default: []
A JSON string representing a list of messages. For example: [{"role": "user", "content": "Hello, how are you?"}]. If provided, prompt and system_prompt are ignored.
verbosity Default: medium
Constrains the verbosity of the model's response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are low, medium, and high. GPT-5 supports this parameter to help control whether answers are short and to the point or long and comprehensive.
image_input Type: array Default: []
List of images to send to the model
system_prompt Type: string
System prompt to set the assistant's behavior
reasoning_effort Default: minimal
Constrains effort on reasoning for GPT-5 models. Currently supported values are minimal, low, medium, and high. The minimal value gets answers back faster without extensive reasoning first. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. For higher reasoning efforts you may need to increase your max_completion_tokens to avoid empty responses (where all the tokens are used on reasoning).
max_completion_tokens Type: integer
Maximum number of completion tokens to generate. For higher reasoning efforts you may need to increase your max_completion_tokens to avoid empty responses (where all the tokens are used on reasoning).
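The parameters above can be assembled into a request payload before sending it to the model. A minimal sketch below mirrors the documented parameter names and the example input shown on this page; the payload values themselves (prompt, system prompt) are taken from the example above, and the comments note the constraints each parameter's description states.

```python
import json

# Input payload using the parameters documented above.
# Values are taken from this page's "All Input Parameters" example.
input_params = {
    "prompt": "What is the point of life?",
    "messages": [],                     # if non-empty, prompt and system_prompt are ignored
    "verbosity": "medium",              # supported values: low, medium, high
    "image_input": [],
    "system_prompt": "You are a caveman",
    "reasoning_effort": "minimal",      # supported values: minimal, low, medium, high
}

# Alternatively, use messages instead of prompt: a list of role/content
# dicts, serialized as JSON per the messages parameter description.
messages = [{"role": "user", "content": "Hello, how are you?"}]
messages_json = json.dumps(messages)
```

Note that because messages takes precedence, a payload should set either prompt/system_prompt or messages, not both.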
Output Schema

Output

Type: array Items Type: string

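Since the output schema is an array of strings, a streamed response arrives as a sequence of text chunks; a common pattern (assumed here, not specific to this model) is to concatenate them to recover the full reply:

```python
# Hypothetical chunks as the model might stream them; the output
# schema above says each item is a string, so joining with "" yields
# the complete response text.
chunks = ["Me grunt: ", "big question! ", "Caveman think simple."]
full_text = "".join(chunks)
```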
Example Execution Logs
Input token count: 22
Output token count: 117
Total token count: 139
TTFT (time to first token): 0.95s
Version Details
Version ID
ea9e381ae5a1370344caf7103b2efd367cc37f30e42b7acc6c3bcb2b140182e1
Version Created
January 10, 2026
Run on Replicate →