meta/llama-4-scout-instruct

Official · 3.4M runs · Apr 2025 · Cog 0.16.9
code-generation question-answering text-generation

About

A mixture-of-experts model with 17 billion active parameters and 16 experts

Example Output

Prompt:

"Hello, Llama!"

Output

Hello! It's nice to meet you. I'm Llama, a large language model developed by Meta. How can I assist you today?

Performance Metrics

0.55s Prediction Time
0.56s Total Time
All Input Parameters
{
  "top_p": 1,
  "prompt": "Hello, Llama!",
  "max_tokens": 1024,
  "temperature": 0.6,
  "presence_penalty": 0,
  "frequency_penalty": 0
}
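
The same request can be reproduced with the Replicate Python client. A minimal sketch, assuming the standard replicate package and the model name shown on this page (version pinning omitted):

import replicate

# Inputs mirror the "All Input Parameters" JSON above.
output = replicate.run(
    "meta/llama-4-scout-instruct",
    input={
        "top_p": 1,
        "prompt": "Hello, Llama!",
        "max_tokens": 1024,
        "temperature": 0.6,
        "presence_penalty": 0,
        "frequency_penalty": 0,
    },
)

# The output schema is an array of strings (see below), so join the chunks.
print("".join(output))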
Input Parameters
top_k Type: integer, Default: 50
The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).
top_p Type: number, Default: 0.9
A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751). A sketch of how top-k, top-p, and temperature combine appears after this parameter list.
prompt Type: string, Default: (empty)
Prompt
max_tokens Type: integer, Default: 4096, Range: 0 - 131072
The maximum number of tokens the model should generate as output.
min_tokens Type: integer, Default: 0
The minimum number of tokens the model should generate as output.
temperature Type: number, Default: 0.6
The value used to modulate the next token probabilities.
system_prompt Type: string, Default: You are a helpful assistant.
System prompt to send to the model. This is prepended to the prompt and helps guide system behavior. Ignored for non-chat models.
stop_sequences Type: string, Default: (empty)
A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
prompt_template Type: string, Default: (empty)
A template to format the prompt with. If not provided, the default prompt template will be used.
presence_penalty Type: number, Default: 0
Presence penalty
frequency_penalty Type: number, Default: 0
Frequency penalty
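
To make the sampling parameters concrete, here is a minimal sketch of how temperature, top-k, and nucleus (top-p) filtering combine, assuming a 1-D NumPy array of raw next-token logits. This illustrates the standard technique, not this model's internal implementation.

import numpy as np

def sample_next_token(logits, temperature=0.6, top_k=50, top_p=0.9):
    # Temperature rescales logits before softmax (lower = sharper).
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    # Top-k: keep only the k highest-scoring tokens.
    if top_k > 0:
        kth_best = np.sort(logits)[-min(top_k, logits.size)]
        logits = np.where(logits < kth_best, -np.inf, logits)
    # Softmax over the surviving tokens.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Top-p: keep the smallest set of tokens whose cumulative
    # probability reaches top_p (Holtzman et al., 2019).
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        drop = cum > top_p
        drop[1:] = drop[:-1].copy()  # keep the token that crosses top_p
        drop[0] = False
        probs[order[drop]] = 0.0
        probs /= probs.sum()
    return np.random.choice(probs.size, p=probs)

With this model's defaults (temperature 0.6, top_k 50, top_p 0.9), roughly the 50 most probable tokens are considered, then trimmed to the tokens covering the top 90% of probability mass before sampling.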
Output Schema

Output

Type: array, Items Type: string
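
Because the output arrives as an array of string chunks, it can be consumed incrementally as well as joined at the end. A sketch, assuming the Replicate Python client's streaming helper:

import replicate

# Each event is one chunk of the output stream; str(event) is its text.
for event in replicate.stream(
    "meta/llama-4-scout-instruct",
    input={"prompt": "Hello, Llama!"},
):
    print(str(event), end="")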

Example Execution Logs
Prompt: Hello, Llama!
Input token count: 5
Output token count: 29
TTFT: 0.23s
Tokens per second: 53.68
Total time: 0.54s
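
The logged throughput is consistent with output tokens divided by total time: 29 / 0.54 s ≈ 53.7 tokens per second. A small sketch of that arithmetic; the decode-only variant is an assumption, not a metric the logs report:

# Values from the execution logs above.
output_tokens = 29
ttft = 0.23        # time to first token, seconds
total_time = 0.54  # seconds

# Reported metric: tokens per second over the whole request window.
tokens_per_second = output_tokens / total_time           # ~53.7
# Hypothetical decode-only rate, excluding first-token latency.
decode_rate = (output_tokens - 1) / (total_time - ttft)  # ~90.3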
Version Details
Version ID
e1ce7061df35889e7846dc7ca71e4aa93fad6efcc9fd4ecd6ac5c36b533f3c06
Version Created
November 28, 2025