nateraw/llama-2-7b-samsum

36 runs · Created November 2023 · Cog version 0.8.6
Tags: dialogue-summarization, document-summarization, text-generation

About

A Llama 2 7B model fine-tuned for dialogue summarization on the SAMSum dataset, served with PEFT adapter weights produced by a Replicate fine-tune job.

Example Output

Prompt:

"

[INST] <>
Use the Input to provide a summary of a conversation.
<
>

Input:
Gary: Hey, don't forget about Tom's bday party!
Lara: I won't! What time should I show up?
Gary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.
Lara: You're such a great boyfriend. He will be so happy!
Gary: Yep, I am :)
Lara: So I'll just pick up the cake and get the balloons...
Gary: Thanks, you're so helpful. I've already paid for the cake.
Lara: No problem, see you at 5 pm!
Gary: See you! [/INST]

Summary:

"

Output

Lara will pick up the cake and balloons for Tom's bday party and meet Gary at 5 pm.
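For reference, here is a minimal sketch of how a prompt in this Llama 2 chat format might be assembled in Python; the build_prompt helper below is hypothetical, not part of this model's code.

# Hypothetical helper that reproduces the [INST] <<SYS>> template shown above.
def build_prompt(system: str, user: str) -> str:
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]\n\nSummary: "

dialogue = (
    "Gary: Hey, don't forget about Tom's bday party!\n"
    "Lara: I won't! What time should I show up?\n"
    # ... remaining turns of the conversation ...
)
prompt = build_prompt(
    "Use the Input to provide a summary of a conversation.",
    "Input:\n" + dialogue,
)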

Performance Metrics

2.00s Prediction Time
6.92s Total Time
All Input Parameters
{
  "debug": false,
  "top_p": 0.95,
  "prompt": "[INST] <<SYS>>\nUse the Input to provide a summary of a conversation.\n<</SYS>>\n\nInput:\nGary: Hey, don't forget about Tom's bday party!\nLara: I won't! What time should I show up?\nGary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.\nLara: You're such a great boyfriend. He will be so happy!\nGary: Yep, I am :)\nLara: So I'll just pick up the cake and get the balloons...\nGary: Thanks, you're so helpful. I've already paid for the cake.\nLara: No problem, see you at 5 pm!\nGary: See you! [/INST]\n\nSummary: ",
  "temperature": 0.7,
  "return_logits": false,
  "max_new_tokens": 128,
  "min_new_tokens": -1,
  "repetition_penalty": 1.15
}
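As a sketch, this prediction could be reproduced with the Replicate Python client, assuming the replicate package is installed and REPLICATE_API_TOKEN is set; the version hash is taken from the Version Details below.

import replicate

# Run the model with the same parameters as the prediction above.
output = replicate.run(
    "nateraw/llama-2-7b-samsum:7b38898d18f1ce5a1c51d0433e14542cf771cde1cbca4fcb68061a41c6723397",
    input={
        "prompt": prompt,  # the formatted [INST] <<SYS>> prompt shown above
        "temperature": 0.7,
        "top_p": 0.95,
        "max_new_tokens": 128,
        "min_new_tokens": -1,
        "repetition_penalty": 1.15,
    },
)
# The output is an array of strings, so join the pieces into one summary.
print("".join(output))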
Input Parameters
seed (integer)
Random seed. Leave blank to randomize the seed.

debug (boolean, default: false)
Provide debugging output in logs.

top_p (number, default: 0.95, range: 0-1)
When decoding text, samples from the top p percentage of most likely tokens; lower this to ignore less likely tokens.

prompt (string, required)
Prompt to send to the model.

temperature (number, default: 0.7, range: 0.01-5)
Adjusts the randomness of outputs: values greater than 1 are more random, values near 0 are nearly deterministic; 0.75 is a good starting value.

return_logits (boolean, default: false)
If set, only return logits for the first token. Only useful for testing.

max_new_tokens (integer, default: 128, range: 1-∞)
Maximum number of tokens to generate. A word is generally 2-3 tokens.

min_new_tokens (integer, default: -1, range: -1-∞)
Minimum number of tokens to generate. Set to -1 to disable. A word is generally 2-3 tokens.

stop_sequences (string)
A comma-separated list of sequences at which to stop generation. For example, '<end>,<stop>' stops generation at the first instance of '<end>' or '<stop>'.

replicate_weights (string)
Path to fine-tuned weights produced by a Replicate fine-tune job.

repetition_penalty (number, default: 1.15, range: 0-∞)
Controls how repetitive the generated text can be: lower values allow more repetition, higher values penalize it. Set to 1.0 to disable.
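To illustrate the optional parameters above, a hypothetical input dict; the values here are illustrative, not tuned recommendations.

model_input = {
    "prompt": prompt,
    "seed": 42,                  # fix the seed for reproducible sampling
    "temperature": 0.7,
    "top_p": 0.95,
    "max_new_tokens": 128,
    "min_new_tokens": -1,        # -1 disables the minimum-length constraint
    "repetition_penalty": 1.15,  # 1.0 disables the repetition penalty
    "stop_sequences": "</s>",    # comma-separated stop strings
}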
Output Schema

Output

Type: array (items: string)
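Because the output is an array of strings generated in chunks, the Replicate Python client can yield the pieces incrementally; a sketch, reusing model_input from above:

# Iterate over the output as chunks arrive rather than waiting for the
# full prediction to finish.
for chunk in replicate.run(
    "nateraw/llama-2-7b-samsum:7b38898d18f1ce5a1c51d0433e14542cf771cde1cbca4fcb68061a41c6723397",
    input=model_input,
):
    print(chunk, end="")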

Example Execution Logs
Your formatted prompt is:
[INST] <<SYS>>
Use the Input to provide a summary of a conversation.
<</SYS>>
Input:
Gary: Hey, don't forget about Tom's bday party!
Lara: I won't! What time should I show up?
Gary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.
Lara: You're such a great boyfriend. He will be so happy!
Gary: Yep, I am :)
Lara: So I'll just pick up the cake and get the balloons...
Gary: Thanks, you're so helpful. I've already paid for the cake.
Lara: No problem, see you at 5 pm!
Gary: See you! [/INST]
Summary:
previous weights were different, switching to https://replicate.delivery/pbxt/EY7Ew28BebUf6E4e2e1wgm2SnrQiFW1hFcN6vxZ2kIB8kTzHB/training_output.zip
Downloading peft weights
Downloaded training_output.zip as 1 8240 kB chunks in 0.401s with 0 retries
Downloaded peft weights in 0.620
Unzipped peft weights in 0.003
Initialized peft model in 0.060
Overall initialize_peft took 1.238
Exllama: False
INFO 11-28 07:55:55 async_llm_engine.py:371] Received request 0: prompt: "[INST] <<SYS>>\nUse the Input to provide a summary of a conversation.\n<</SYS>>\n\nInput:\nGary: Hey, don't forget about Tom's bday party!\nLara: I won't! What time should I show up?\nGary: Around 5 pm. He's supposed to be back home at 5:30, so we'll have just enough time to prep things up.\nLara: You're such a great boyfriend. He will be so happy!\nGary: Yep, I am :)\nLara: So I'll just pick up the cake and get the balloons...\nGary: Thanks, you're so helpful. I've already paid for the cake.\nLara: No problem, see you at 5 pm!\nGary: See you! [/INST]\n\nSummary: ", sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.7, top_p=0.95, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None, skip_special_tokens=True), prompt token ids: None.
INFO 11-28 07:55:55 llm_engine.py:631] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.4%, CPU KV cache usage: 0.0%
INFO 11-28 07:55:55 async_llm_engine.py:111] Finished request 0.
hostname: model-hs-77dde5d6-e494c37392ea209c-gpu-a40-7f448dcbc8-78nv9
Version Details
Version ID
7b38898d18f1ce5a1c51d0433e14542cf771cde1cbca4fcb68061a41c6723397
Version Created
November 28, 2023