meta/llama-4-scout-instruct
About
A mixture-of-experts model with 17 billion active parameters routed across 16 experts.
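In a mixture-of-experts layer, a learned router sends each token to a small subset of the experts, which is how a model can hold many experts while activating only about 17 billion parameters per token. The sketch below shows generic top-k routing; the dimensions, variable names, and top_k choice are illustrative assumptions, not Meta's published implementation.

import numpy as np

def moe_route(token_hidden: np.ndarray, router_weights: np.ndarray, top_k: int = 1):
    """Pick the top-k experts for one token; return (expert indices, mixing weights)."""
    logits = token_hidden @ router_weights           # [num_experts] router scores
    top = np.argsort(logits)[-top_k:]                # indices of the k highest-scoring experts
    probs = np.exp(logits[top] - logits[top].max())  # softmax over the selected experts only
    return top, probs / probs.sum()

rng = np.random.default_rng(0)
hidden = rng.standard_normal(64)        # one token's hidden state (toy dimension 64)
router = rng.standard_normal((64, 16))  # router projecting onto 16 experts
experts, weights = moe_route(hidden, router)
print(experts, weights)                 # with top_k=1: the single chosen expert, weight 1.0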

Example Output
Prompt:
"Hello, Llama!"
Output:
Hello! It's nice to meet you. I'm Llama, a large language model developed by Meta. How can I assist you today?
Performance Metrics
- Prediction time: 0.55s
- Total time: 0.56s
All Input Parameters
{ "top_p": 1, "prompt": "Hello, Llama!", "max_tokens": 1024, "temperature": 0.6, "presence_penalty": 0, "frequency_penalty": 0 }
Input Parameters
- top_p: Top-p (nucleus) sampling (see the sampling sketch after this list)
- prompt: The text prompt to send to the model
- max_tokens: The maximum number of tokens the model should generate as output
- temperature: The value used to modulate the next-token probabilities
- presence_penalty: Presence penalty
- frequency_penalty: Frequency penalty
Example Execution Logs
Prompt: Hello, Llama!
Input token count: 5
Output token count: 29
TTFT: 0.23s
Tokens per second: 53.68
Total time: 0.54s
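The throughput figure follows from the counts above: 29 output tokens over a total of roughly 0.54s gives about 53.7 tokens per second (the logged 53.68 reflects the unrounded total time). A sketch of the arithmetic, with illustrative timestamps:

# Deriving the logged metrics from request timestamps (values are illustrative).
request_start = 0.00    # seconds, relative to the start of the request
first_token_at = 0.23   # when the first output token arrived
finished_at = 0.54      # when the last output token arrived
output_tokens = 29

ttft = first_token_at - request_start                              # time to first token: 0.23s
tokens_per_second = output_tokens / (finished_at - request_start)  # ~53.7
print(f"TTFT: {ttft:.2f}s, throughput: {tokens_per_second:.2f} tok/s")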
Version Details
- Version ID: 8137b3975c483beadba5dbb424eb6824248cdc8a6746abbcfafb96886e82965f
- Version created: April 5, 2025