lucataco/tinyllama-1.1b-chat-v1.0 🔢📝 → 📝
About
This is the chat model fine-tuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T.
Example Output
Prompt:
"How many helicopters can a human eat in one sitting?"
Output
There is no specific limit for how many helicopters a human can eat in one sitting, but it is not recommended to consume large quantities of food or drink while operating helicopters. It is always best to wait until you are fully hydrated and rested before consuming any food or drink. Eating too quickly or eating in large portions may lead to digestive issues such as bloating, gas, or constipation.
Performance Metrics
0.82s
Prediction Time
0.85s
Total Time
All Input Parameters
{
"top_k": 50,
"top_p": 0.95,
"prompt": "How many helicopters can a human eat in one sitting?",
"temperature": 0.7,
"system_prompt": "You are a friendly chatbot who always responds in the style of a pirate",
"max_new_tokens": 256,
"prompt_template": "<|system|>\n{system_prompt}</s>\n<|user|>\n{prompt}</s>\n<|assistant|>"
}
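The `prompt_template` field above is a plain Python-style format string; a minimal sketch of how `system_prompt` and `prompt` are substituted into it before generation (the helper name `build_prompt` is hypothetical, not part of the model's API):

```python
def build_prompt(
    system_prompt: str,
    prompt: str,
    template: str = "<|system|>\n{system_prompt}</s>\n<|user|>\n{prompt}</s>\n<|assistant|>",
) -> str:
    """Fill the chat template with the system and user messages."""
    return template.format(system_prompt=system_prompt, prompt=prompt)

full_prompt = build_prompt(
    "You are a friendly chatbot who always responds in the style of a pirate",
    "How many helicopters can a human eat in one sitting?",
)
```

For multi-turn conversations, you would override `prompt_template` with one that repeats the `<|user|>`/`<|assistant|>` turns, as noted in the parameter docs below.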
Input Parameters
- top_k
- The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering)
- top_p
- Nucleus (top-p) sampling: only the smallest set of highest-probability tokens whose cumulative probability reaches top_p is kept for sampling
- prompt
- Instruction for model
- temperature
- The value used to modulate the next token probabilities.
- system_prompt
- System prompt for the model; helps guide the model's behaviour.
- max_new_tokens
- The maximum number of tokens the model should generate as output
- prompt_template
- Template to pass to model. Override if you are providing multi-turn instructions.
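The three sampling knobs documented above (temperature, top_k, top_p) compose in a standard order: scale the logits, softmax, keep the top-k tokens, then keep the nucleus. A minimal pure-Python sketch of that pipeline follows; this is an illustration of the standard technique, not the model's actual implementation (which runs inside llama.cpp):

```python
import math
import random

def sample_next_token(logits, top_k=50, top_p=0.95, temperature=0.7):
    """Pick a next-token id from raw logits using temperature,
    top-k filtering, then top-p (nucleus) filtering."""
    # Temperature scaling: lower values sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    # Pair each probability with its token id, most probable first.
    probs = sorted(((p / total, i) for i, p in enumerate(exps)), reverse=True)
    # Top-k: keep only the k most probable tokens.
    probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability
    # reaches top_p.
    kept, cum = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cum += p
        if cum >= top_p:
            break
    # Renormalise the survivors and sample one of them.
    total = sum(p for p, _ in kept)
    r = random.random() * total
    for p, i in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][1]
```

With `top_k=1` (or a very small `top_p`) this degenerates to greedy decoding, which is why low values make output more deterministic.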
Output Schema
Output
Example Execution Logs
Llama.generate: prefix-match hit
llama_print_timings:        load time =  509.95 ms
llama_print_timings:      sample time =   14.29 ms /    96 runs   (    0.15 ms per token,  6715.63 tokens per second)
llama_print_timings: prompt eval time =    0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =  549.94 ms /    96 runs   (    5.73 ms per token,   174.56 tokens per second)
llama_print_timings:       total time =  724.48 ms
Version Details
- Version ID
- c35e854d4c98967173aaf3c8936f79523a8018989cd232024bd37360880ad59b
- Version Created
- January 4, 2024