lucataco/tinyllama-1.1b-chat-v1.0 🔢📝 → 📝
About
This is the chat model fine-tuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T.
Example Output
Prompt:
"How many helicopters can a human eat in one sitting?"
Output
There is no specific limit for how many helicopters a human can eat in one sitting, but it is not recommended to consume large quantities of food or drink while operating helicopters. It is always best to wait until you are fully hydrated and rested before consuming any food or drink. Eating too quickly or eating in large portions may lead to digestive issues such as bloating, gas, or constipation.
Performance Metrics
0.82s
Prediction Time
0.85s
Total Time
All Input Parameters
{
"top_k": 50,
"top_p": 0.95,
"prompt": "How many helicopters can a human eat in one sitting?",
"temperature": 0.7,
"system_prompt": "You are a friendly chatbot who always responds in the style of a pirate",
"max_new_tokens": 256,
"prompt_template": "<|system|>\n{system_prompt}</s>\n<|user|>\n{prompt}</s>\n<|assistant|>"
}
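The `prompt_template` field above is a plain Python-style format string; a minimal sketch of how `system_prompt` and `prompt` are substituted into it before generation (the helper name `build_prompt` is hypothetical, not part of the model's API):

```python
def build_prompt(
    system_prompt: str,
    prompt: str,
    template: str = "<|system|>\n{system_prompt}</s>\n<|user|>\n{prompt}</s>\n<|assistant|>",
) -> str:
    """Fill the chat template with the system and user messages."""
    return template.format(system_prompt=system_prompt, prompt=prompt)

full_prompt = build_prompt(
    "You are a friendly chatbot who always responds in the style of a pirate",
    "How many helicopters can a human eat in one sitting?",
)
```

For multi-turn conversations, you would override `prompt_template` with one that repeats the `<|user|>`/`<|assistant|>` turns, as noted in the parameter docs below.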
Input Parameters
- top_k
- The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering)
- top_p
- Nucleus (top-p) sampling: only the smallest set of highest-probability tokens whose cumulative probability reaches top_p is kept for sampling
- prompt
- Instruction for model
- temperature
- The value used to modulate the next token probabilities.
- system_prompt
- System prompt for the model; helps guide the model's behaviour.
- max_new_tokens
- The maximum number of tokens the model should generate as output
- prompt_template
- Template to pass to model. Override if you are providing multi-turn instructions.
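The three sampling knobs documented above (temperature, top_k, top_p) compose in a standard order: scale the logits, softmax, keep the top-k tokens, then keep the nucleus. A minimal pure-Python sketch of that pipeline follows; this is an illustration of the standard technique, not the model's actual implementation (which runs inside llama.cpp):

```python
import math
import random

def sample_next_token(logits, top_k=50, top_p=0.95, temperature=0.7):
    """Pick a next-token id from raw logits using temperature,
    top-k filtering, then top-p (nucleus) filtering."""
    # Temperature scaling: lower values sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    # Pair each probability with its token id, most probable first.
    probs = sorted(((p / total, i) for i, p in enumerate(exps)), reverse=True)
    # Top-k: keep only the k most probable tokens.
    probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability
    # reaches top_p.
    kept, cum = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cum += p
        if cum >= top_p:
            break
    # Renormalise the survivors and sample one of them.
    total = sum(p for p, _ in kept)
    r = random.random() * total
    for p, i in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][1]
```

With `top_k=1` (or a very small `top_p`) this degenerates to greedy decoding, which is why low values make output more deterministic.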
Output Schema
Output
Example Execution Logs
Llama.generate: prefix-match hit
llama_print_timings:        load time =  509.95 ms
llama_print_timings:      sample time =   14.29 ms /    96 runs   (    0.15 ms per token,  6715.63 tokens per second)
llama_print_timings: prompt eval time =    0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =  549.94 ms /    96 runs   (    5.73 ms per token,   174.56 tokens per second)
llama_print_timings:       total time =  724.48 ms
Version Details
- Version ID
- c35e854d4c98967173aaf3c8936f79523a8018989cd232024bd37360880ad59b
- Version Created
- January 4, 2024