nateraw/llama-2-7b-samsum
About
Llama 2 7B fine-tuned (via PEFT adapter weights) for dialogue summarization on the SAMSum dataset.
Example Output
Prompt:
"
[INST] <
Use the Input to provide a summary of a conversation.
<
Input:
Gary: Hey, don't forget about Tom's bday party!
Lara: I won't! What time should I show up?
Gary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.
Lara: You're such a great boyfriend. He will be so happy!
Gary: Yep, I am :)
Lara: So I'll just pick up the cake and get the balloons...
Gary: Thanks, you're so helpful. I've already paid for the cake.
Lara: No problem, see you at 5 pm!
Gary: See you! [/INST]
Summary:
"Output
Lara will pick up the cake and balloons for Tom's bday party and meet Gary at 5 pm.
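The example above can be reproduced with the Replicate Python client. The sketch below is illustrative rather than part of the original model card: it assumes the `replicate` package is installed, that `REPLICATE_API_TOKEN` is set in the environment, and that the model streams its output as chunks of text (the exact output shape may differ).

```python
import replicate

# Llama-2 chat-style prompt: a system instruction plus the dialogue to summarize.
prompt = """[INST] <<SYS>>
Use the Input to provide a summary of a conversation.
<</SYS>>

Input:
Gary: Hey, don't forget about Tom's bday party!
Lara: I won't! What time should I show up?
Gary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.
Lara: You're such a great boyfriend. He will be so happy!
Gary: Yep, I am :)
Lara: So I'll just pick up the cake and get the balloons...
Gary: Thanks, you're so helpful. I've already paid for the cake.
Lara: No problem, see you at 5 pm!
Gary: See you! [/INST]

Summary: """

output = replicate.run(
    "nateraw/llama-2-7b-samsum:7b38898d18f1ce5a1c51d0433e14542cf771cde1cbca4fcb68061a41c6723397",
    input={
        "prompt": prompt,
        "temperature": 0.7,
        "top_p": 0.95,
        "max_new_tokens": 128,
        "repetition_penalty": 1.15,
    },
)

# Text-generation models on Replicate usually return an iterable of string chunks;
# join them to get the full summary.
print("".join(output))
```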
Performance Metrics
- Prediction time: 2.00s
- Total time: 6.92s
All Input Parameters
{ "debug": false, "top_p": 0.95, "prompt": "[INST] <<SYS>>\nUse the Input to provide a summary of a conversation.\n<</SYS>>\n\nInput:\nGary: Hey, don't forget about Tom's bday party!\nLara: I won't! What time should I show up?\nGary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.\nLara: You're such a great boyfriend. He will be so happy!\nGary: Yep, I am :)\nLara: So I'll just pick up the cake and get the balloons...\nGary: Thanks, you're so helpful. I've already paid for the cake.\nLara: No problem, see you at 5 pm!\nGary: See you! [/INST]\n\nSummary: ", "temperature": 0.7, "return_logits": false, "max_new_tokens": 128, "min_new_tokens": -1, "repetition_penalty": 1.15 }
Input Parameters
- seed
- Random seed. Leave blank to randomize the seed
- debug
- Provide debugging output in logs.
- top_p
- When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens
- prompt (required)
- Prompt to send to the model.
- temperature
- Adjusts randomness of outputs; greater than 1 is more random and 0 is deterministic. 0.75 is a good starting value.
- return_logits
- If set, only return logits for the first token. Only useful for testing and debugging.
- max_new_tokens
- Maximum number of tokens to generate. A word is generally 2-3 tokens.
- min_new_tokens
- Minimum number of tokens to generate. To disable, set to -1. A word is generally 2-3 tokens.
- stop_sequences
- A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
- replicate_weights
- Path to fine-tuned weights produced by a Replicate fine-tune job.
- repetition_penalty
- A parameter that controls how repetitive text can be. Lower means more repetitive, while higher means less repetitive. Set to 1.0 to disable.
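To illustrate how these parameters map onto a request, a hypothetical input dictionary combining the values from the example prediction with a stop sequence and a fixed seed might look like the sketch below (the `prompt` variable is assumed to come from the helper shown earlier; the seed and stop values are illustrative only).

```python
input_params = {
    "prompt": prompt,                  # required; built with the Llama-2 template above
    "temperature": 0.7,                # ~0.75 is a good starting value
    "top_p": 0.95,
    "max_new_tokens": 128,
    "min_new_tokens": -1,              # -1 disables the minimum
    "repetition_penalty": 1.15,        # 1.0 disables the penalty
    "stop_sequences": "<end>,<stop>",  # comma-separated stop strings
    "seed": 42,                        # omit to randomize the seed
}
```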
Output Schema
Output
Example Execution Logs
Your formatted prompt is: [INST] <<SYS>> Use the Input to provide a summary of a conversation. <</SYS>> Input: Gary: Hey, don't forget about Tom's bday party! Lara: I won't! What time should I show up? Gary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up. Lara: You're such a great boyfriend. He will be so happy! Gary: Yep, I am :) Lara: So I'll just pick up the cake and get the balloons... Gary: Thanks, you're so helpful. I've already paid for the cake. Lara: No problem, see you at 5 pm! Gary: See you! [/INST] Summary:
previous weights were different, switching to https://replicate.delivery/pbxt/EY7Ew28BebUf6E4e2e1wgm2SnrQiFW1hFcN6vxZ2kIB8kTzHB/training_output.zip
Downloading peft weights
Downloaded training_output.zip as 1 8240 kB chunks in 0.401s with 0 retries
Downloaded peft weights in 0.620
Unzipped peft weights in 0.003
Initialized peft model in 0.060
Overall initialize_peft took 1.238
Exllama: False
INFO 11-28 07:55:55 async_llm_engine.py:371] Received request 0: prompt: "[INST] <<SYS>>\nUse the Input to provide a summary of a conversation.\n<</SYS>>\n\nInput:\nGary: Hey, don't forget about Tom's bday party!\nLara: I won't! What time should I show up?\nGary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.\nLara: You're such a great boyfriend. He will be so happy!\nGary: Yep, I am :)\nLara: So I'll just pick up the cake and get the balloons...\nGary: Thanks, you're so helpful. I've already paid for the cake.\nLara: No problem, see you at 5 pm!\nGary: See you! [/INST]\n\nSummary: ", sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.7, top_p=0.95, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None, skip_special_tokens=True), prompt token ids: None.
INFO 11-28 07:55:55 llm_engine.py:631] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.4%, CPU KV cache usage: 0.0%
INFO 11-28 07:55:55 async_llm_engine.py:111] Finished request 0.
hostname: model-hs-77dde5d6-e494c37392ea209c-gpu-a40-7f448dcbc8-78nv9
Version Details
- Version ID: 7b38898d18f1ce5a1c51d0433e14542cf771cde1cbca4fcb68061a41c6723397
- Version Created: November 28, 2023