johnnyoshika/llama2-combine-numbers
About
A Llama 2 fine-tune that answers simple addition prompts, as in the example below.
Example Output
Prompt:
"What is 10+4?"
Output:
10 + 4 = 14
Performance Metrics
- Prediction Time: 0.27s
- Total Time: 2.46s
All Input Parameters
{ "debug": false, "top_p": 0.95, "prompt": "What is 10+4?", "temperature": 0.7, "return_logits": false, "max_new_tokens": 128, "min_new_tokens": -1, "repetition_penalty": 1.15 }
Input Parameters
- seed: Random seed. Leave blank to randomize the seed.
- debug: Provide debugging output in logs.
- top_p: When decoding text, samples from the top p percentage of most likely tokens; lower it to ignore less likely tokens (see the nucleus-sampling sketch after this list).
- prompt (required): Prompt to send to the model.
- temperature: Adjusts the randomness of outputs; values greater than 1 are more random, 0 is deterministic, and 0.75 is a good starting value.
- return_logits: If set, only return logits for the first token. Only useful for testing and debugging.
- max_new_tokens: Maximum number of tokens to generate. A word is generally 2-3 tokens.
- min_new_tokens: Minimum number of tokens to generate. To disable, set to -1. A word is generally 2-3 tokens.
- stop_sequences: A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
- replicate_weights: Path to fine-tuned weights produced by a Replicate fine-tune job.
- repetition_penalty: A parameter that controls how repetitive text can be. Lower means more repetitive, while higher means less repetitive. Set to 1.0 to disable.
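As context for the top_p setting, here is a minimal sketch of nucleus (top-p) sampling. It is illustrative only, not the model's actual implementation:

```python
import numpy as np

def top_p_sample(logits: np.ndarray, top_p: float = 0.95,
                 rng: np.random.Generator | None = None) -> int:
    """Sample a token id from the smallest set of tokens whose
    cumulative probability reaches top_p (nucleus sampling)."""
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())      # numerically stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]            # most likely tokens first
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    nucleus = order[:cutoff]                   # tokens kept by top-p
    kept = probs[nucleus] / probs[nucleus].sum()  # renormalize over the nucleus
    return int(rng.choice(nucleus, p=kept))
```

Lowering top_p shrinks the nucleus, which is why the parameter description says a lower value ignores less likely tokens.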
Output Schema
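Llama 2 predictors on Replicate generally emit output as a stream of string tokens that the client concatenates. A sketch of token-by-token consumption, assuming a recent `replicate` client that provides `replicate.stream`:

```python
import replicate

# Stream tokens as they are generated instead of waiting for the
# full completion. Each event is a chunk of the output string.
for event in replicate.stream(
    "johnnyoshika/llama2-combine-numbers:"
    "3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
    input={"prompt": "What is 10+4?"},
):
    print(str(event), end="", flush=True)
print()
```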
Example Execution Logs
Your formatted prompt is: What is 10+4?
correct lora is already loaded
Overall initialize_peft took 0.000
Exllama: False
INFO 05-19 01:05:42 async_llm_engine.py:371] Received request 0: prompt: 'What is 10+4?', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.7, top_p=0.95, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None, skip_special_tokens=True), prompt token ids: None.
INFO 05-19 01:05:42 llm_engine.py:631] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
INFO 05-19 01:05:42 async_llm_engine.py:111] Finished request 0.
hostname: model-hp-77dde5d6c56598691b9008f7d123a18d-74856449d8-4m969
Version Details
- Version ID: 3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a
- Version Created: May 19, 2024
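The version ID above can also be looked up programmatically to pin runs to this exact version. A sketch using the Replicate Python client's model and version lookups:

```python
import replicate

# Fetch the model, then pin the specific version listed above so
# future runs are unaffected by new pushes to the model.
model = replicate.models.get("johnnyoshika/llama2-combine-numbers")
version = model.versions.get(
    "3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a"
)
print(version.id, version.created_at)
```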