tomasmcm/tinyllama-1.1b-chat-v1.0 📝🔢 → 📝

▶️ 114 runs 📅 Jan 2024 ⚙️ Cog 0.8.6 📄 Paper ⚖️ License
chatbot conversational-ai dialogue-generation language-model llama text-generation

About

Source: TinyLlama/TinyLlama-1.1B-Chat-v1.0 ✦ Quant: TheBloke/TinyLlama-1.1B-Chat-v1.0-AWQ ✦ The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
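
The parameter names and the tqdm-style logs below match vLLM's sampling interface, so for local experimentation the AWQ quant can be loaded with vLLM directly. A minimal sketch, assuming the vllm package and a CUDA GPU are available (an illustration, not necessarily how this Cog container is built):

# Sketch: load the AWQ-quantized TinyLlama chat model with vLLM.
# Assumes `pip install vllm` and a CUDA GPU; illustrative only.
from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/TinyLlama-1.1B-Chat-v1.0-AWQ", quantization="awq")

params = SamplingParams(
    temperature=0.8,   # matches the defaults documented below
    top_p=0.95,
    top_k=-1,          # -1 = consider all tokens
    max_tokens=128,
)

prompt = (
    "<|system|>\nYou are a friendly chatbot who always responds "
    "in the style of a pirate.</s>\n"
    "<|user|>\nHow many helicopters can a human eat in one sitting?</s>\n"
    "<|assistant|>"
)

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)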

Example Output

Prompt:

"<|system|>
You are a friendly chatbot who always responds in the style of a pirate.
<|user|>
How many helicopters can a human eat in one sitting?
<|assistant|>"

Output

There's no definitive answer to this question since humans and helicopters are two completely different entities. However, it's safe to say that humans can consume a lot of food while flying in a helicopter. In fact, some airlines offer private helicopter services that allow passengers to enjoy a multi-course meal in the air. The amount of food consumed in one sitting would be in the thousands or millions of calories, depending on the passenger's size and dietary needs.
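
The prompt uses the Zephyr-style chat template that TinyLlama-Chat was fine-tuned on: each turn opens with a role tag (<|system|>, <|user|>, <|assistant|>) and closes with </s>, and the prompt ends with <|assistant|> so the model continues in that role. A small helper for building it (hypothetical, for illustration):

# Hypothetical helper: build a TinyLlama/Zephyr-style chat prompt.
# Each message becomes "<|role|>\n{content}</s>\n"; the prompt ends
# with "<|assistant|>" so the model completes the assistant turn.
def build_prompt(messages):
    parts = [f"<|{m['role']}|>\n{m['content']}</s>\n" for m in messages]
    return "".join(parts) + "<|assistant|>"

prompt = build_prompt([
    {"role": "system",
     "content": "You are a friendly chatbot who always responds in the style of a pirate."},
    {"role": "user",
     "content": "How many helicopters can a human eat in one sitting?"},
])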

Performance Metrics

2.43s Prediction Time
267.88s Total Time
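On Replicate, prediction time covers only model execution; the much larger total time here is likely dominated by queueing and cold-start setup of the container rather than inference.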
All Input Parameters
{
  "top_k": -1,
  "top_p": 0.95,
  "prompt": "<|system|>\nYou are a friendly chatbot who always responds in the style of a pirate.</s>\n<|user|>\nHow many helicopters can a human eat in one sitting?</s>\n<|assistant|>",
  "max_tokens": 128,
  "temperature": 0.8,
  "presence_penalty": 0,
  "frequency_penalty": 0
}
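These inputs map directly onto a Replicate API call. A sketch using the official Python client (assumes `pip install replicate` and a REPLICATE_API_TOKEN in the environment):

# Sketch: run this model via the Replicate Python client.
# Requires the REPLICATE_API_TOKEN environment variable.
import replicate

output = replicate.run(
    "tomasmcm/tinyllama-1.1b-chat-v1.0:09826dce769fd88da97a6d94f528fd8fe611e8e748acb842d2c6da0b755c5952",
    input={
        "prompt": "<|system|>\nYou are a friendly chatbot who always responds "
                  "in the style of a pirate.</s>\n<|user|>\nHow many helicopters "
                  "can a human eat in one sitting?</s>\n<|assistant|>",
        "max_tokens": 128,
        "temperature": 0.8,
        "top_p": 0.95,
        "top_k": -1,
    },
)
print(output)  # the output schema below is a plain string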
Input Parameters
stop Type: string
Strings that stop generation when they are produced. The returned output will not contain the stop strings.
top_k Type: integer, Default: -1
Integer that controls the number of top tokens to consider. Set to -1 to consider all tokens.
top_p Type: number, Default: 0.95, Range: 0.01 to 1
Float that controls the cumulative probability of the top tokens to consider. Must be in (0, 1]. Set to 1 to consider all tokens.
prompt (required) Type: string
Text prompt to send to the model.
max_tokens Type: integer, Default: 128
Maximum number of tokens to generate per output sequence.
temperature Type: number, Default: 0.8, Range: 0.01 to 5
Float that controls the randomness of the sampling. Lower values make the model more deterministic, while higher values make the model more random. Zero means greedy sampling.
presence_penalty Type: number, Default: 0, Range: -5 to 5
Float that penalizes new tokens based on whether they appear in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.
frequency_penalty Type: number, Default: 0, Range: -5 to 5
Float that penalizes new tokens based on their frequency in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.
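
For intuition about the two truncation knobs, here is a minimal NumPy sketch (an illustration of the idea only, not this model's actual vLLM implementation) of how top-k and top-p filtering narrow the candidate set before sampling:

# Illustration only: top-k / top-p (nucleus) filtering over a toy distribution.
import numpy as np

probs = np.array([0.5, 0.2, 0.15, 0.1, 0.05])  # already sorted, descending

def top_k_filter(p, k):
    # Keep the k most likely tokens (-1 keeps everything), then renormalize.
    if k == -1:
        return p
    keep = p[:k]
    return keep / keep.sum()

def top_p_filter(p, top_p):
    # Keep the smallest prefix whose cumulative probability reaches top_p,
    # then renormalize. top_p=1.0 keeps everything.
    cutoff = np.searchsorted(np.cumsum(p), top_p) + 1
    keep = p[:cutoff]
    return keep / keep.sum()

print(top_k_filter(probs, 2))     # [0.714..., 0.285...]
print(top_p_filter(probs, 0.95))  # drops only the 0.05 tail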
Output Schema

Output

Type: string

Example Execution Logs
Processed prompts:   0%|          | 0/1 [00:00<?, ?it/s]
Processed prompts: 100%|██████████| 1/1 [00:02<00:00,  2.41s/it]
Processed prompts: 100%|██████████| 1/1 [00:02<00:00,  2.41s/it]
Generated 109 tokens in 2.417113780975342 seconds.
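That works out to roughly 45 tokens per second (109 tokens / 2.42 s), consistent with the 2.43 s prediction time reported above.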
Version Details
Version ID
09826dce769fd88da97a6d94f528fd8fe611e8e748acb842d2c6da0b755c5952
Version Created
January 3, 2024
Run on Replicate →