moinnadeem/codellama-34b-instruct-vllm

80 runs · Sep 2023 · Cog 0.8.6
code-generation question-answering text-generation

Example Output

Prompt:

"Write a Python function downloads a html page from the internet and extracts the text of the h1 elements"

Output

import requests
from bs4 import BeautifulSoup

def download_h1_text(url):
    # Fetch the page and parse the HTML.
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # Extract the text of every <h1> element on the page.
    h1_elements = soup.find_all('h1')
    text = [element.text for element in h1_elements]
    return text
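
To sanity-check the generated function locally, here is a minimal usage sketch (assuming the requests and beautifulsoup4 packages are installed; example.com is a stand-in URL):

```
headings = download_h1_text("https://example.com")
print(headings)  # e.g. ['Example Domain']
```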

Performance Metrics

4.29s Prediction Time
134.19s Total Time
All Input Parameters
{
  "debug": false,
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
  "temperature": 0.75,
  "system_prompt": "You are a helpful assistant.",
  "max_new_tokens": 509,
  "min_new_tokens": -1
}
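For reference, a minimal sketch of reproducing this prediction with the Replicate Python client (assuming the replicate package is installed and REPLICATE_API_TOKEN is set; the version ID is the one listed under Version Details below):

```
import replicate

# Sketch only: the input dict mirrors the "All Input Parameters" JSON above.
output = replicate.run(
    "moinnadeem/codellama-34b-instruct-vllm:49f07fcebd9e71cc2c451fd5d0667602987aa77f8ed9c6d63b8ab8ef97739fe7",
    input={
        "debug": False,
        "top_k": 50,
        "top_p": 0.9,
        "prompt": "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
        "temperature": 0.75,
        "system_prompt": "You are a helpful assistant.",
        "max_new_tokens": 509,
        "min_new_tokens": -1,
    },
)

# The output schema is an array of strings (streamed chunks), so join before printing.
print("".join(output))
```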
Input Parameters
seed Type: integer
Random seed. Leave blank to randomize the seed.
debug Type: boolean · Default: false
Provide debugging output in logs.
top_k Type: integer · Default: 50 · Range: 0 - ∞
When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens (see the sampling sketch after this list).
top_p Type: number · Default: 0.9 · Range: 0 - 1
When decoding text, samples from the smallest set of most likely tokens whose cumulative probability reaches top p; lower to ignore less likely tokens.
prompt (required) Type: string
Prompt to send to the model.
temperature Type: number · Default: 0.75 · Range: 0.01 - 5
Adjusts randomness of outputs: values above 1 are more random, values near 0 are nearly deterministic; 0.75 is a good starting value.
system_prompt Type: string · Default: "You are a helpful assistant."
System prompt to send to the model. It is prepended to the prompt and helps guide model behavior.
max_new_tokens Type: integer · Default: 128 · Range: 1 - ∞
Maximum number of tokens to generate. A word is generally 2-3 tokens.
min_new_tokens Type: integer · Default: -1 · Range: -1 - ∞
Minimum number of tokens to generate. To disable, set to -1. A word is generally 2-3 tokens.
stop_sequences Type: string
A comma-separated list of sequences at which to stop generation. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
replicate_weights Type: string
Path to fine-tuned weights produced by a Replicate fine-tune job.
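
To make the interaction of top_k, top_p, and temperature concrete, here is a toy sketch of the standard filtering-and-sampling procedure (illustrative only; this is not vLLM's actual sampler, and sample_top_k_top_p and the toy logits are hypothetical):

```
import math
import random

def sample_top_k_top_p(logits, top_k=50, top_p=0.9, temperature=0.75):
    # Temperature scales logits before softmax: lower values sharpen the distribution.
    probs = [math.exp(l / temperature) for l in logits]
    total = sum(probs)
    probs = [p / total for p in probs]
    # top_k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # top_p: within those, keep the smallest prefix whose cumulative probability reaches p.
    kept, cum = [], 0.0
    for i in ranked:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the surviving tokens and sample one token id.
    mass = sum(probs[i] for i in kept)
    return random.choices(kept, weights=[probs[i] / mass for i in kept])[0]

toy_logits = [2.0, 1.0, 0.5, 0.1, -1.0]  # hypothetical scores for a 5-token vocabulary
token_id = sample_top_k_top_p(toy_logits)
```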
Output Schema

Output

Type: array · Items Type: string

Example Execution Logs
Your formatted prompt is:
[INST] <<SYS>>
You are a helpful assistant.
<</SYS>>
Write a Python function downloads a html page from the internet and extracts the text of the h1 elements [/INST]
INFO 09-28 06:31:59 async_llm_engine.py:328] Received request 0: prompt: '[INST] <<SYS>>\nYou are a helpful assistant.\n<</SYS>>\n\nWrite a Python function downloads a html page from the internet and extracts the text of the h1 elements [/INST]', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.75, top_p=0.9, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=509, logprobs=None), prompt token ids: None.
INFO 09-28 06:31:59 llm_engine.py:613] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.3%, CPU KV cache usage: 0.0%
INFO 09-28 06:32:03 async_llm_engine.py:100] Finished request 0.
Generated text:  ```
import requests
from bs4 import BeautifulSoup

def download_h1_text(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    h1_elements = soup.find_all('h1')
    text = [element.text for element in h1_elements]
    return text
```
hostname: model-c97256d5-c3f3ad7eed055505-gpu-a100-7b6bbcfb6c-hkfd7
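
The "formatted prompt" at the top of the logs follows the Llama 2 chat template: the system prompt is wrapped in <<SYS>> tags inside a single [INST] block. A minimal sketch of that assembly (format_prompt is a hypothetical helper; the actual formatting happens inside the model container):

```
def format_prompt(prompt: str, system_prompt: str) -> str:
    # Mirrors the template shown in the logs above.
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"
```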
Version Details
Version ID
49f07fcebd9e71cc2c451fd5d0667602987aa77f8ed9c6d63b8ab8ef97739fe7
Version Created
September 28, 2023