meta/meta-llama-3-8b-instruct 🔢📝 → 📝
About
An 8-billion-parameter language model from Meta, fine-tuned for chat completions.
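
A minimal sketch of running this model, assuming the Replicate Python client (`pip install replicate`) and a `REPLICATE_API_TOKEN` set in the environment; the parameter names match those documented under Input Parameters below:

```python
import replicate

# Run the chat-tuned 8B model with a simple prompt.
output = replicate.run(
    "meta/meta-llama-3-8b-instruct",
    input={
        "prompt": (
            "Johnny has 8 billion parameters. His friend Tommy has 70 billion "
            "parameters. What does this mean when it comes to speed?"
        ),
        "system_prompt": "You are a helpful assistant",
        "temperature": 0.7,
        "top_p": 0.95,
        "max_new_tokens": 512,
    },
)

# The output arrives as text chunks; join them into one string before printing.
print("".join(output))
```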

Example Output
Prompt:
"Johnny has 8 billion parameters. His friend Tommy has 70 billion parameters. What does this mean when it comes to speed?"
Output:
The number of parameters in a neural network can impact its speed, but it's not the only factor.
In general, a larger number of parameters can lead to:
- Increased computational complexity: More parameters mean more calculations are required to process the data.
- Increased memory requirements: Larger models require more memory to store their parameters, which can impact system performance.
However, it's worth noting that the relationship between the number of parameters and speed is not always linear. Other factors, such as:
- Model architecture
- Optimizer choice
- Hyperparameter tuning
can also impact the speed of a neural network.
In the case of Johnny and Tommy, it's difficult to say which one's model will be faster without more information about the models themselves.
Performance Metrics
- Prediction Time: 1.65s
- Total Time: 1.66s
All Input Parameters
{ "top_p": 0.95, "prompt": "Johnny has 8 billion parameters. His friend Tommy has 70 billion parameters. What does this mean when it comes to speed?", "temperature": 0.7, "system_prompt": "You are a helpful assistant", "length_penalty": 1, "max_new_tokens": 512, "stop_sequences": "<|end_of_text|>,<|eot_id|>", "prompt_template": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n", "presence_penalty": 0 }
Input Parameters
- top_k: The number of highest-probability tokens to consider when generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering).
- top_p: A probability threshold for generating the output. If < 1.0, only the smallest set of top tokens with cumulative probability >= top_p is kept (nucleus filtering), as described in Holtzman et al. (http://arxiv.org/abs/1904.09751). See the sketch after this list.
- prompt: The input prompt.
- max_tokens: The maximum number of tokens the model should generate as output.
- min_tokens: The minimum number of tokens the model should generate as output.
- temperature: The value used to modulate the next-token probabilities.
- prompt_template: Prompt template. The string `{prompt}` will be substituted for the input prompt. If you want to generate dialog output, use this template as a starting point and construct the prompt string manually, leaving `prompt_template={prompt}`.
- presence_penalty: Presence penalty; penalizes tokens that have already appeared in the output, encouraging the model to introduce new tokens.
- frequency_penalty: Frequency penalty; penalizes tokens in proportion to how often they have already appeared in the output.
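
To make `temperature`, `top_k`, and `top_p` concrete, here is an illustrative NumPy sketch of the standard filtering recipe applied to a next-token distribution before sampling (a toy example, not this model's actual decoding code):

```python
import numpy as np

def filter_next_token_probs(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Illustrative temperature / top-k / top-p (nucleus) filtering."""
    # Temperature scales the logits; lower values sharpen the distribution.
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    # Top-k: keep only the k highest-probability tokens.
    if 0 < top_k < probs.size:
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)
        probs /= probs.sum()

    # Top-p (nucleus): keep the smallest set of tokens whose cumulative
    # probability reaches top_p (Holtzman et al., 2019).
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        cutoff_index = int(np.searchsorted(cumulative, top_p)) + 1
        keep = order[:cutoff_index]
        nucleus = np.zeros_like(probs)
        nucleus[keep] = probs[keep]
        probs = nucleus / nucleus.sum()

    return probs

# Toy 5-token vocabulary: most of the mass ends up on the first two tokens.
print(filter_next_token_probs(np.array([2.0, 1.0, 0.5, 0.1, -1.0])))
```

With the defaults shown above, lowering `temperature` concentrates probability on the highest-scoring tokens, while `top_k` and `top_p` discard the long tail of unlikely tokens before one is drawn.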
Output Schema
Example Execution Logs
Random seed used: `57440`
Note: Random seed will not impact output if greedy decoding is used.
Formatted prompt: `<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|> Johnny has 8 billion parameters. His friend Tommy has 70 billion parameters. What does this mean when it comes to speed?<|eot_id|><|start_header_id|>assistant<|end_header_id|> `
Version Details
- Version ID: 5a6809ca6288247d06daf6365557e5e429063f32a21146b2a807c682652136b8
- Version Created: April 17, 2024
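
For reproducible results you can pin this exact version by appending the version ID to the model name, again assuming the Replicate Python client:

```python
import replicate

# Pin the run to the version listed above rather than the latest release.
output = replicate.run(
    "meta/meta-llama-3-8b-instruct:"
    "5a6809ca6288247d06daf6365557e5e429063f32a21146b2a807c682652136b8",
    input={"prompt": "Say hello in one sentence."},
)
print("".join(output))
```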