openai/gpt-5-mini

Official · 229.5K runs · Aug 2025 · Cog 0.16.7
image-analysis image-captioning text-generation visual-understanding

About

Faster version of OpenAI's flagship GPT-5 model

Example Output

Prompt:

"What is the point of life?"

Output

Me grunt: big question! Caveman think simple.

  • Live to survive: find food, water, shelter, stay safe.
  • Live to care for tribe: raise young, help others, keep group strong.
  • Live to enjoy small things: fire warmth, good hunt, laughter, story by fire.
  • Some think there is bigger meaning: spirits, ancestors, making world better for those after.

No single answer fit every person. You choose purpose by what makes you feel alive and what helps others. What matters most to you?

Performance Metrics

1.70s Prediction Time
1.71s Total Time
All Input Parameters
{
  "prompt": "What is the point of life?",
  "messages": [],
  "verbosity": "medium",
  "image_input": [],
  "system_prompt": "You are a caveman",
  "reasoning_effort": "minimal"
}
Input Parameters
prompt Type: string
The prompt to send to the model. Ignored if messages is provided.
messages Type: array Default: []
A JSON string representing a list of messages. For example: [{"role": "user", "content": "Hello, how are you?"}]. If provided, prompt and system_prompt are ignored.
verbosity Default: medium
Constrains the verbosity of the model's response. Currently supported values are low, medium, and high: lower values produce short, to-the-point answers, while higher values produce longer, more comprehensive ones.
image_input Type: array Default: []
List of images to send to the model
system_prompt Type: string
System prompt to set the assistant's behavior
reasoning_effort Default: minimal
Constrains effort on reasoning for GPT-5 models. Currently supported values are minimal, low, medium, and high. Lower efforts return responses faster and spend fewer tokens on reasoning; minimal skips extensive reasoning before answering. At higher efforts you may need to increase max_completion_tokens to avoid empty responses, where all of the tokens are consumed by reasoning.
max_completion_tokens Type: integer
Maximum number of completion tokens to generate. At higher reasoning efforts, increase this value to avoid empty responses where all tokens are consumed by reasoning (see reasoning_effort above).
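
These parameters map directly onto the Replicate API. Below is a minimal sketch of reproducing the example prediction above, assuming the official replicate Python client (pip install replicate) and a REPLICATE_API_TOKEN set in the environment.

import replicate

# Inputs mirror the "All Input Parameters" example above.
output = replicate.run(
    "openai/gpt-5-mini",
    input={
        "prompt": "What is the point of life?",
        "system_prompt": "You are a caveman",
        "verbosity": "medium",
        "reasoning_effort": "minimal",
    },
)

# The model returns an array of strings (see Output Schema below),
# so join the chunks to recover the full response text.
print("".join(output))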
Output Schema

Output

Type: array Items type: string
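
Because the output is an array of string chunks, concatenate them to get the full response. The sketch below, again assuming the replicate Python client, uses the messages form of the input; per the parameter docs it must be passed as a JSON string, and prompt and system_prompt are then ignored. The standard chat roles shown here are an assumption.

import json
import replicate

messages = [
    {"role": "system", "content": "You are a caveman"},
    {"role": "user", "content": "What is the point of life?"},
]

output = replicate.run(
    "openai/gpt-5-mini",
    # The messages parameter expects a JSON string, so serialize the list.
    input={"messages": json.dumps(messages)},
)
print("".join(output))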

Example Execution Logs
Input token count: 22
Output token count: 117
Total token count: 139
TTFT: 0.95s
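
The TTFT (time to first token) figure in the logs can also be measured client-side by streaming the prediction. A sketch assuming the replicate Python client's streaming interface works with this model:

import time
import replicate

start = time.monotonic()
ttft = None

# Tokens arrive as server-sent events while the prediction runs.
for event in replicate.stream(
    "openai/gpt-5-mini",
    input={
        "prompt": "What is the point of life?",
        "reasoning_effort": "minimal",
    },
):
    if ttft is None:
        ttft = time.monotonic() - start
    print(str(event), end="")

if ttft is not None:
    print(f"\nTTFT: {ttft:.2f}s")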
Version Details
Version ID
22af442f494f78bd1c1d540644577e0282facb50219c59e9b6b5cf97f2d4b9d5
Version Created
September 17, 2025
Run on Replicate →