lucataco/qwen2-57b-a14b-instruct 🔢📝 → 📝
About
Qwen2-57B-A14B, a 57-billion-parameter mixture-of-experts language model (14B parameters active per token) from Alibaba Cloud, fine-tuned for chat completions.

Example Output
Prompt:
"Give me a short introduction to large language model."
Output:
A large language model (LLM) is a type of artificial intelligence model that is trained on a massive amount of text data to generate human-like text. These models are typically trained using deep learning techniques, and they are able to generate text that is coherent and contextually appropriate, making them useful for a variety of natural language processing tasks. Some common applications of large language models include language translation, text summarization, and question answering. They are also used in chatbots and virtual assistants to enable more natural and realistic conversations with users. Large language models are often referred to as "generative models" because they are able to generate new text based on the patterns they have learned from the training data.
Performance Metrics
- Prediction Time: 2.11s
- Total Time: 253.87s

The large gap between the two reflects time spent queued and booting the model, not generation itself.
All Input Parameters
{ "top_k": 50, "top_p": 0.9, "prompt": "Give me a short introduction to large language model.", "max_tokens": 512, "min_tokens": 0, "temperature": 0.6, "system_prompt": "You are a helpful assistant.", "presence_penalty": 0, "frequency_penalty": 0 }
Input Parameters
- top_k: The number of highest-probability tokens to consider when generating output. If > 0, only the top k most probable tokens are kept (top-k filtering; see the sampling sketch after this list).
- top_p: A probability threshold for generating the output. If < 1.0, only the smallest set of tokens whose cumulative probability is >= top_p is kept (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
- prompt: The text prompt to send to the model.
- max_tokens: The maximum number of tokens the model should generate as output.
- min_tokens: The minimum number of tokens the model should generate as output.
- temperature: The value used to modulate the next-token probabilities; lower values make sampling more deterministic.
- system_prompt: System prompt to send to the model. This is prepended to the prompt and helps guide model behavior. Ignored for non-chat models.
- stop_sequences: A comma-separated list of sequences at which to stop generation. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
- presence_penalty: Penalizes tokens that have already appeared in the output, regardless of how often, nudging the model toward new topics.
- frequency_penalty: Penalizes tokens in proportion to how often they have already appeared in the output, reducing verbatim repetition.
Output Schema
The output is the generated text, returned as a stream of strings that are concatenated into the final completion.
Example Execution Logs
INFO 07-02 15:28:31 async_llm_engine.py:529] Received request 98ac8d73e2454af68ca1cb481bc48be2: prompt: '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nGive me a short introduction to large language model.<|im_end|>\n<|im_start|>assistant\n', sampling_params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.9, top_k=50, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[151645], include_stop_str_in_output=False, ignore_eos=False, max_tokens=512, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: None, lora_request: None.
Generation took 1719933329.48s
Formatted prompt: <|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant
INFO 07-02 15:28:33 async_llm_engine.py:120] Finished request 98ac8d73e2454af68ca1cb481bc48be2.
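The "Formatted prompt" in the log shows how `system_prompt` and `prompt` are combined using the ChatML template. A minimal sketch that reproduces this formatting (illustrative only; the model container does this for you):

```python
def format_chatml(system_prompt: str, prompt: str) -> str:
    # Mirrors the formatted prompt visible in the log above: system and
    # user turns wrapped in <|im_start|>/<|im_end|> markers, ending with
    # an open assistant turn for the model to complete.
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(format_chatml("You are a helpful assistant.",
                    "Give me a short introduction to large language model."))
```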
Version Details
- Version ID: fc67fa3fa20d3d0ee59794df05548b59d285fbb43d944506203a8a2195b75c36
- Version Created: July 2, 2024
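To reproduce results against this exact build, the version ID above can be pinned in the model reference; a sketch using the same client setup as the earlier example:

```python
import replicate

# Pin the exact version listed above so future model updates
# don't silently change behavior.
output = replicate.run(
    "lucataco/qwen2-57b-a14b-instruct:fc67fa3fa20d3d0ee59794df05548b59d285fbb43d944506203a8a2195b75c36",
    input={"prompt": "Give me a short introduction to large language model."},
)
print("".join(output))
```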