tomasmcm/llama-3-8b-instruct-gradient-4194k

149 runs ✦ May 2024 ✦ Cog 0.8.6
Tags: ai-assistants, code-generation, data-analysis, language-model, long-context, text-generation

About

Source: gradientai/Llama-3-8B-Instruct-Gradient-4194k
Quant: solidrust/Llama-3-8B-Instruct-Gradient-4194k-AWQ
Extends Llama-3 8B's context length from 8k to 4194k tokens.

Example Output

Prompt:

"<|start_header_id|>system<|end_header_id|>
You are a helpful assistant. Perform the task to the best of your ability.<|eot_id|>
<|start_header_id|>user<|end_header_id|>
You're standing on the surface of the Earth. You walk one mile south, one mile west and one mile north. You end up exactly where you started. Where are you?<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>"
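The prompt above follows a fixed header/end-of-turn layout, so it can be assembled from a system message and a user message. The following is a minimal sketch of that assembly; `build_prompt` is a hypothetical helper name, not part of the model or of any library, and it only reproduces the single-turn layout shown on this page.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the layout used by the example above.

    Hypothetical helper: it mirrors the example prompt's markers and newlines
    and makes no claim about other layouts the model may accept.
    """
    return (
        "<|start_header_id|>system<|end_header_id|>\n"
        f"{system}<|eot_id|>\n"
        "<|start_header_id|>user<|end_header_id|>\n"
        f"{user}<|eot_id|>\n"
        # The prompt ends with an open assistant header so the model writes the reply.
        "<|start_header_id|>assistant<|end_header_id|>\n"
    )


if __name__ == "__main__":
    print(build_prompt(
        "You are a helpful assistant. Perform the task to the best of your ability.",
        "Where are you?",
    ))
```

Keeping the template in one function avoids hand-editing the special tokens for every request.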

Output:

I would be back where I started!

Performance Metrics

Prediction Time: 0.13s
Total Time: 21.69s

All Input Parameters

{
  "stop": "</s>",
  "top_k": -1,
  "top_p": 0.95,
  "prompt": "<|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant. Perform the task to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\nYou're standing on the surface of the Earth. You walk one mile south, one mile west and one mile north. You end up exactly where you started. Where are you?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n",
  "max_tokens": 1024,
  "temperature": 0.8,
  "presence_penalty": 0,
  "frequency_penalty": 0
}
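
The parameters above could be sent with the Replicate Python client roughly as follows. This is a sketch under assumptions: it assumes the third-party `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set; `build_input` and `run_prediction` are hypothetical helper names. The version hash is the Version ID listed on this page.

```python
MODEL = "tomasmcm/llama-3-8b-instruct-gradient-4194k"
VERSION = "18b7a95a92c796e31fb118bcf70557f91dd4cf72c466cdc04ddce394331b09ac"


def build_input(prompt: str, **overrides) -> dict:
    """Return the example's input payload, with optional per-call overrides."""
    payload = {
        "stop": "</s>",
        "top_k": -1,
        "top_p": 0.95,
        "prompt": prompt,
        "max_tokens": 1024,
        "temperature": 0.8,
        "presence_penalty": 0,
        "frequency_penalty": 0,
    }
    unknown = set(overrides) - set(payload)
    if unknown:
        raise ValueError(f"unknown input parameters: {sorted(unknown)}")
    payload.update(overrides)
    return payload


def run_prediction(prompt: str, **overrides) -> str:
    """Send one prediction request (assumes network access and an API token)."""
    import replicate  # third-party client, imported lazily

    output = replicate.run(f"{MODEL}:{VERSION}", input=build_input(prompt, **overrides))
    # The output schema is a string; join defensively in case chunks are returned.
    return output if isinstance(output, str) else "".join(output)
```

Pinning the version hash keeps results tied to the build documented here rather than to whatever version is latest.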

Input Parameters

stop
Type: string
List of strings that stop the generation when they are generated. The returned output will not contain the stop strings.

top_k
Type: integer, Default: -1
Integer that controls the number of top tokens to consider. Set to -1 to consider all tokens.

top_p
Type: number, Default: 0.95, Range: 0.01 - 1
Float that controls the cumulative probability of the top tokens to consider. Must be in (0, 1]. Set to 1 to consider all tokens.

prompt (required)
Type: string
Text prompt to send to the model.

max_tokens
Type: integer, Default: 128
Maximum number of tokens to generate per output sequence.

temperature
Type: number, Default: 0.8, Range: 0.01 - 5
Float that controls the randomness of the sampling. Lower values make the model more deterministic, while higher values make the model more random. Zero means greedy sampling.

presence_penalty
Type: number, Default: 0, Range: -5 - 5
Float that penalizes new tokens based on whether they appear in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.

frequency_penalty
Type: number, Default: 0, Range: -5 - 5
Float that penalizes new tokens based on their frequency in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.
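
The defaults and ranges listed above can be checked on the client before a request is sent. The sketch below is one way to do that; `validate_params`, `DEFAULTS`, and `RANGES` are hypothetical names, and the checks reflect only what this page lists, not the server's actual validation. Note that the listed temperature range starts at 0.01, so a value of 0 is rejected here even though the description mentions zero.

```python
# Defaults and inclusive ranges as listed in the parameter reference above.
DEFAULTS = {
    "top_k": -1,
    "top_p": 0.95,
    "max_tokens": 128,
    "temperature": 0.8,
    "presence_penalty": 0,
    "frequency_penalty": 0,
}
RANGES = {
    "top_p": (0.01, 1.0),
    "temperature": (0.01, 5.0),
    "presence_penalty": (-5.0, 5.0),
    "frequency_penalty": (-5.0, 5.0),
}


def validate_params(params: dict) -> dict:
    """Fill in listed defaults and reject values outside the listed ranges."""
    if not isinstance(params.get("prompt"), str):
        raise ValueError("prompt is required and must be a string")
    merged = {**DEFAULTS, **params}
    for name, (low, high) in RANGES.items():
        if not low <= merged[name] <= high:
            raise ValueError(f"{name}={merged[name]} is outside [{low}, {high}]")
    for name in ("top_k", "max_tokens"):
        value = merged[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{name} must be an integer")
    return merged
```

For example, `validate_params({"prompt": "Hello"})` returns the prompt together with every listed default, including `max_tokens` of 128.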
Output Schema

Output

Type: string

Example Execution Logs
Processed prompts:   0%|          | 0/1 [00:00<?, ?it/s]
Processed prompts: 100%|██████████| 1/1 [00:00<00:00,  8.27it/s]
Processed prompts: 100%|██████████| 1/1 [00:00<00:00,  8.26it/s]
Generated 10 tokens in 0.1239476203918457 seconds.
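
As a small worked calculation, the figures reported on this page give a generation rate of roughly 80 tokens per second and show that almost all of the total time falls outside the prediction itself. The page does not break that remaining time down, so the sketch below labels it only as time outside prediction.

```python
# Figures copied from the metrics and logs on this page.
generated_tokens = 10
generation_seconds = 0.1239476203918457
prediction_seconds = 0.13
total_seconds = 21.69

# Throughput during generation.
tokens_per_second = generated_tokens / generation_seconds

# Time not accounted for by the prediction itself.
outside_prediction_seconds = total_seconds - prediction_seconds

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"time outside prediction: {outside_prediction_seconds:.2f} s")
```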

Version Details

Version ID: 18b7a95a92c796e31fb118bcf70557f91dd4cf72c466cdc04ddce394331b09ac
Version Created: May 16, 2024