prunaai/gpt-oss-20b-fast 🔢📝 → 📝

▶️ 827 runs 📅 Aug 2025 ⚙️ Cog 0.16.9 🔗 GitHub ⚖️ License
code-generation question-answering text-generation

About

An advanced 20B open-weight reasoning model that you can customize for any use case and run anywhere.

Example Output

Output

There are 3 letter “r”’s in the word strawberry.

Performance Metrics

0.35s Prediction Time
0.36s Total Time
All Input Parameters
{
  "top_p": 0.95,
  "message": "How many rs are in the word 'strawberry'?",
  "max_tokens": 2048,
  "temperature": 0.7
}
Input Parameters
top_p
Type: number | Default: 0.95 | Range: 0 - 1
Nucleus sampling: only consider tokens with cumulative probability up to this value
message
Type: string | Default: "Explain vLLM in one sentence"
The user message to send to the model
max_tokens
Type: integer | Default: 2048 | Range: 1 - 16384
Maximum number of tokens to generate
temperature
Type: number | Default: 0.7 | Range: 0 - 2
Sampling temperature (higher = more creative, lower = more deterministic)
system_prompt
Type: string
Optional system prompt to set the model's behavior
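To build intuition for how `temperature` and `top_p` interact, here is a minimal toy sketch of temperature scaling followed by nucleus (top-p) filtering over a small token probability table. This is an illustration of the general sampling technique, not the model's actual implementation; the token values and the `sample_filter` helper are invented for the example.

```python
import math

def sample_filter(probs, temperature=0.7, top_p=0.95):
    """Toy temperature scaling + nucleus (top-p) filtering."""
    # Temperature rescales log-probabilities: values below 1 sharpen
    # the distribution (more deterministic), values above 1 flatten it.
    scaled = {tok: math.exp(math.log(p) / temperature) for tok, p in probs.items()}
    total = sum(scaled.values())
    scaled = {tok: p / total for tok, p in scaled.items()}

    # Nucleus sampling keeps the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalizes.
    kept, cumulative = {}, 0.0
    for tok, p in sorted(scaled.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Hypothetical next-token distribution for illustration.
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zzz": 0.05}
filtered = sample_filter(probs, temperature=0.7, top_p=0.9)
# The low-probability tail token falls outside the nucleus and is dropped.
```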
Output Schema

Output

Type: array | Items Type: string

Version Details
Version ID
04455067a777d85ee636cd30f6b6547075a62c0f5f20d2f00e58d459a30e80d3
Version Created
February 20, 2026
Run on Replicate →