meta/llama-4-maverick-instruct

Official model · 1.4M runs · Published April 2025 · Cog version 0.13.8-dev+g1050bdd3

Tags: code-generation, question-answering, text-generation, text-translation

About

A 17-billion-active-parameter mixture-of-experts model with 128 experts

Example Output

Prompt:

"Hello, Llama!"

Output

Hello! It's nice to meet you. Is there something I can help you with or would you like to chat?

Performance Metrics

0.59s Prediction Time
0.60s Total Time
All Input Parameters
{
  "top_p": 1,
  "prompt": "Hello, Llama!",
  "max_tokens": 1024,
  "temperature": 0.6,
  "presence_penalty": 0,
  "frequency_penalty": 0
}
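
The request above maps directly onto the Replicate API. A minimal sketch of sending it with the official replicate Python client (assumes the replicate package is installed and REPLICATE_API_TOKEN is set in the environment):

import replicate

output = replicate.run(
    "meta/llama-4-maverick-instruct",
    input={
        "top_p": 1,
        "prompt": "Hello, Llama!",
        "max_tokens": 1024,
        "temperature": 0.6,
        "presence_penalty": 0,
        "frequency_penalty": 0,
    },
)

# The output schema is an array of strings, so join the pieces
# into the final completion.
print("".join(output))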
Input Parameters
top_p (number, default: 1)
Top-p (nucleus) sampling.

prompt (string, default: empty)
Prompt.

max_tokens (integer, default: 1024, range: 2–20480)
The maximum number of tokens the model should generate as output.

temperature (number, default: 0.6)
The value used to modulate the next-token probabilities.

presence_penalty (number, default: 0)
Presence penalty.

frequency_penalty (number, default: 0)
Frequency penalty.
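
How temperature and top_p combine can be sketched in a few lines. The snippet below illustrates standard temperature scaling followed by nucleus sampling; it is not the model server's actual implementation. The presence and frequency penalties would additionally be subtracted from the logits of tokens already present in the output before this step.

import numpy as np

def sample_next_token(logits, temperature=0.6, top_p=1.0):
    # Temperature divides the logits before the softmax: values < 1
    # sharpen the distribution, values > 1 flatten it.
    z = (logits - logits.max()) / temperature
    probs = np.exp(z)
    probs /= probs.sum()

    # Nucleus (top-p) sampling keeps the smallest set of
    # highest-probability tokens whose cumulative mass reaches top_p,
    # then renormalizes and samples from that restricted set.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]
    return np.random.choice(keep, p=probs[keep] / probs[keep].sum())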
Output Schema

Output

Type: array · Items type: string
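
Because the output is an array of strings, completions can also be consumed incrementally. A sketch using the Python client's streaming helper (available in recent replicate releases; the exact minimum version is an assumption):

import replicate

# Each streamed event carries one chunk of the output string array.
for event in replicate.stream(
    "meta/llama-4-maverick-instruct",
    input={"prompt": "Hello, Llama!"},
):
    print(str(event), end="")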

Example Execution Logs
Prompt: Hello, Llama!
Input token count: 5
Output token count: 24
TTFT: 0.39s
Tokens per second: 40.51
Total time: 0.59s
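
The logged throughput is consistent with the other figures; a back-of-the-envelope check (0.59 s is the rounded total time, which is why the result differs slightly from the logged 40.51):

output_tokens = 24
total_time = 0.59                    # seconds, rounded as logged
print(output_tokens / total_time)    # ~40.7 tokens per second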
Version Details
Version ID
25bdfe11f52b557ade65599fad30b1a6f6d87ede91043974bbd320d6a4c1c841
Version Created
April 5, 2025