meta/meta-llama-3-8b-instruct

Official · 385.1M runs · Apr 2024 · Cog 0.9.4 · GitHub · License
chat-completion code-generation question-answering text-generation

About

An 8-billion-parameter language model from Meta, fine-tuned for chat completions.

Example Output

Prompt:

"Johnny has 8 billion parameters. His friend Tommy has 70 billion parameters. What does this mean when it comes to speed?"

Output

The number of parameters in a neural network can impact its speed, but it's not the only factor.

In general, a larger number of parameters can lead to:

  1. Increased computational complexity: More parameters mean more calculations are required to process the data.
  2. Increased memory requirements: Larger models require more memory to store their parameters, which can impact system performance.

However, it's worth noting that the relationship between the number of parameters and speed is not always linear. Other factors, such as:

  • Model architecture
  • Optimizer choice
  • Hyperparameter tuning

can also impact the speed of a neural network.

In the case of Johnny and Tommy, it's difficult to say which one's model will be faster without more information about the models themselves.

Performance Metrics

Prediction time: 1.65s
Total time: 1.66s
All Input Parameters
{
  "top_p": 0.95,
  "prompt": "Johnny has 8 billion parameters. His friend Tommy has 70 billion parameters. What does this mean when it comes to speed?",
  "temperature": 0.7,
  "system_prompt": "You are a helpful assistant",
  "length_penalty": 1,
  "max_new_tokens": 512,
  "stop_sequences": "<|end_of_text|>,<|eot_id|>",
  "prompt_template": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
  "presence_penalty": 0
}
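
These inputs can be reproduced programmatically. Below is a minimal sketch using the Replicate Python client (assumptions: `pip install replicate` and a `REPLICATE_API_TOKEN` environment variable), passing the same parameters shown above:

import replicate

# Run the prediction with the same inputs as the example above.
output = replicate.run(
    "meta/meta-llama-3-8b-instruct",
    input={
        "top_p": 0.95,
        "prompt": (
            "Johnny has 8 billion parameters. His friend Tommy has "
            "70 billion parameters. What does this mean when it comes to speed?"
        ),
        "temperature": 0.7,
        "system_prompt": "You are a helpful assistant",
        "max_new_tokens": 512,
        "presence_penalty": 0,
    },
)

# The model returns an array of strings; join them for the full text.
print("".join(output))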
Input Parameters
top_k · Type: integer · Default: 50
The number of highest-probability tokens to consider for generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering).
top_p · Type: number · Default: 0.9
A probability threshold for generating the output. If < 1.0, only the smallest set of tokens whose cumulative probability is >= top_p is kept (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751); see the sketch after this list.
prompt · Type: string · Default: (empty)
The prompt to send to the model.
max_tokens · Type: integer · Default: 512
The maximum number of tokens the model should generate as output.
min_tokens · Type: integer · Default: 0
The minimum number of tokens the model should generate as output.
temperature · Type: number · Default: 0.6
The value used to modulate the next-token probabilities.
prompt_template · Type: string · Default: {prompt}
Prompt template. The string `{prompt}` will be substituted for the input prompt. If you want to generate dialog output, use this template as a starting point and construct the prompt string manually, leaving `prompt_template={prompt}`.
presence_penalty · Type: number · Default: 1.15
Presence penalty.
frequency_penalty · Type: number · Default: 0.2
Frequency penalty.
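
For intuition, here is an illustrative sketch (not the model's actual serving code) of how top-k and top-p filtering restrict the candidate token set before sampling, following the parameter descriptions above and Holtzman et al.:

import numpy as np

def filter_logits(logits, top_k=50, top_p=0.9, temperature=0.6):
    """Illustrative top-k + nucleus (top-p) filtering over a logit vector."""
    logits = np.asarray(logits, dtype=np.float64) / temperature
    # Top-k: keep only the k highest-probability tokens.
    if top_k > 0:
        kth_value = np.sort(logits)[-min(top_k, logits.size)]
        logits = np.where(logits < kth_value, -np.inf, logits)
    # Top-p: keep the smallest set of tokens whose cumulative
    # probability reaches top_p (nucleus filtering).
    if top_p < 1.0:
        order = np.argsort(logits)[::-1]
        probs = np.exp(logits[order] - np.max(logits[order]))
        probs /= probs.sum()
        keep = np.searchsorted(np.cumsum(probs), top_p) + 1
        filtered = np.full_like(logits, -np.inf)
        filtered[order[:keep]] = logits[order[:keep]]
        logits = filtered
    return logits  # sample from softmax(logits) over the surviving tokens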
Output Schema

Output

Type: array · Items type: string
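
Because the output is an array of strings, tokens can also be consumed incrementally as they are generated. A minimal streaming sketch, again assuming the Replicate Python client:

import replicate

# Print each token as it arrives rather than waiting for the full array.
for event in replicate.stream(
    "meta/meta-llama-3-8b-instruct",
    input={
        "prompt": (
            "Johnny has 8 billion parameters. His friend Tommy has "
            "70 billion parameters. What does this mean when it comes to speed?"
        ),
    },
):
    print(str(event), end="")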

Example Execution Logs
Random seed used: `57440`
Note: Random seed will not impact output if greedy decoding is used.
Formatted prompt: `<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

Johnny has 8 billion parameters. His friend Tommy has 70 billion parameters. What does this mean when it comes to speed?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

`
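
The formatted prompt in the logs is produced by substituting the inputs into the prompt_template shown earlier. A minimal sketch of that substitution:

# Reproduce the formatted prompt shown above by filling in the
# prompt_template from the example inputs.
PROMPT_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

formatted_prompt = PROMPT_TEMPLATE.format(
    system_prompt="You are a helpful assistant",
    prompt=(
        "Johnny has 8 billion parameters. His friend Tommy has "
        "70 billion parameters. What does this mean when it comes to speed?"
    ),
)
print(formatted_prompt)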
Version Details
Version ID
5a6809ca6288247d06daf6365557e5e429063f32a21146b2a807c682652136b8
Version Created
April 17, 2024
Run on Replicate →