openai/gpt-5-mini

Official · 229.5K runs · Aug 2025 · Cog 0.16.7
image-analysis image-captioning text-generation visual-understanding

About

Faster version of OpenAI's flagship GPT-5 model

Example Output

Prompt:

"What is the point of life?"

Output

Me grunt: big question! Caveman think simple.

  • Live to survive: find food, water, shelter, stay safe.
  • Live to care for tribe: raise young, help others, keep group strong.
  • Live to enjoy small things: fire warmth, good hunt, laughter, story by fire.
  • Some think there is bigger meaning: spirits, ancestors, making world better for those after.

No single answer fit every person. You choose purpose by what makes you feel alive and what helps others. What matters most to you?

Performance Metrics

1.70s Prediction Time
1.71s Total Time
All Input Parameters
{
  "prompt": "What is the point of life?",
  "messages": [],
  "verbosity": "medium",
  "image_input": [],
  "system_prompt": "You are a caveman",
  "reasoning_effort": "minimal"
}
Input Parameters
prompt Type: string
The prompt to send to the model. Ignored if messages is provided.
messages Type: array Default: []
A JSON string representing a list of messages. For example: [{"role": "user", "content": "Hello, how are you?"}]. If provided, prompt and system_prompt are ignored.
verbosity Default: medium
Constrains the verbosity of the model's response. Currently supported values are low, medium, and high: lower values produce short, to-the-point answers, while higher values produce longer, more comprehensive ones.
image_input Type: array Default: []
List of images to send to the model
system_prompt Type: string
System prompt to set the assistant's behavior
reasoning_effort Default: minimal
Constrains effort on reasoning for GPT-5 models. Currently supported values are minimal, low, medium, and high. Lower efforts return responses faster and spend fewer tokens on reasoning; minimal skips extensive reasoning before answering. At higher efforts you may need to increase max_completion_tokens to avoid empty responses, where all of the tokens are consumed by reasoning.
max_completion_tokens Type: integer
Maximum number of completion tokens to generate. At higher reasoning efforts, increase this value to avoid empty responses where all tokens are consumed by reasoning (see reasoning_effort above).
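
These parameters map directly onto the Replicate API. Below is a minimal sketch of reproducing the example prediction above, assuming the official replicate Python client (pip install replicate) and a REPLICATE_API_TOKEN set in the environment.

import replicate

# Inputs mirror the "All Input Parameters" example above.
output = replicate.run(
    "openai/gpt-5-mini",
    input={
        "prompt": "What is the point of life?",
        "system_prompt": "You are a caveman",
        "verbosity": "medium",
        "reasoning_effort": "minimal",
    },
)

# The model returns an array of strings (see Output Schema below),
# so join the chunks to recover the full response text.
print("".join(output))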
Output Schema

Output

Type: array Items type: string
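
Because the output is an array of string chunks, concatenate them to get the full response. The sketch below, again assuming the replicate Python client, uses the messages form of the input; per the parameter docs it must be passed as a JSON string, and prompt and system_prompt are then ignored. The standard chat roles shown here are an assumption.

import json
import replicate

messages = [
    {"role": "system", "content": "You are a caveman"},
    {"role": "user", "content": "What is the point of life?"},
]

output = replicate.run(
    "openai/gpt-5-mini",
    # The messages parameter expects a JSON string, so serialize the list.
    input={"messages": json.dumps(messages)},
)
print("".join(output))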

Example Execution Logs
Input token count: 22
Output token count: 117
Total token count: 139
TTFT: 0.95s
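
The TTFT (time to first token) figure in the logs can also be measured client-side by streaming the prediction. A sketch assuming the replicate Python client's streaming interface works with this model:

import time
import replicate

start = time.monotonic()
ttft = None

# Tokens arrive as server-sent events while the prediction runs.
for event in replicate.stream(
    "openai/gpt-5-mini",
    input={
        "prompt": "What is the point of life?",
        "reasoning_effort": "minimal",
    },
):
    if ttft is None:
        ttft = time.monotonic() - start
    print(str(event), end="")

if ttft is not None:
    print(f"\nTTFT: {ttft:.2f}s")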
Version Details
Version ID
22af442f494f78bd1c1d540644577e0282facb50219c59e9b6b5cf97f2d4b9d5
Version Created
September 17, 2025
Run on Replicate →