moinnadeem/vllm-engine-llama-7b

678 runs · Sep 2023 · Cog 0.8.6
code-generation text-generation text-to-sql

Example Output

Prompt:

"

You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.

You must output the SQL query that answers the question.

Input:

What is the total number of decile for the redwood school locality?

Context:

CREATE TABLE table_name_34 (decile VARCHAR, name VARCHAR)

Response:

"

Output

SELECT COUNT(decile) FROM table_name_34 WHERE name = "redwood school"
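The prompt above follows a simple instruction template with `### Input:`, `### Context:`, and `### Response:` sections. A minimal sketch of assembling that template in Python (the `build_prompt` helper is hypothetical, not part of the model):

```python
# Hypothetical helper that assembles the text-to-SQL prompt template
# shown in the example above; the section markers mirror the model's
# example prompt exactly.
INSTRUCTIONS = (
    "You are a powerful text-to-SQL model. Your job is to answer questions "
    "about a database. You are given a question and context regarding one "
    "or more tables. \n\n"
    "You must output the SQL query that answers the question."
)

def build_prompt(question: str, context: str) -> str:
    return (
        f"{INSTRUCTIONS}\n\n"
        f"### Input:\n{question}\n\n"
        f"### Context:\n{context}\n\n"
        f"### Response:\n"
    )

prompt = build_prompt(
    "What is the total number of decile for the redwood school locality?",
    "CREATE TABLE table_name_34 (decile VARCHAR, name VARCHAR)",
)
```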

Performance Metrics

3.74s Prediction Time
3.70s Total Time
All Input Parameters
{
  "debug": false,
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables. \n\nYou must output the SQL query that answers the question.\n\n### Input:\nWhat is the total number of decile for the redwood school locality?\n\n### Context:\nCREATE TABLE table_name_34 (decile VARCHAR, name VARCHAR)\n\n### Response:\n",
  "lora_path": "https://pub-df34620a84bb4c0683fae07a260df1ea.r2.dev/sql.zip",
  "temperature": 0.75,
  "max_new_tokens": 128
}
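The parameters above can be passed to the model through the official `replicate` Python client. A sketch, with the API call placed under a main guard since it requires a `REPLICATE_API_TOKEN` and network access; the placeholder prompt string stands in for the full prompt shown above:

```python
# Input parameters matching the run shown above.
model_input = {
    "debug": False,
    "top_k": 50,
    "top_p": 0.9,
    "prompt": "You are a powerful text-to-SQL model. ...",  # full prompt shown above
    "lora_path": "https://pub-df34620a84bb4c0683fae07a260df1ea.r2.dev/sql.zip",
    "temperature": 0.75,
    "max_new_tokens": 128,
}

if __name__ == "__main__":
    import replicate  # pip install replicate; needs REPLICATE_API_TOKEN set

    output = replicate.run(
        "moinnadeem/vllm-engine-llama-7b:"
        "559ff8c30789d100c13f9bd0f831210f6f9c6f1c81dab06ff16dc61dbfa94b03",
        input=model_input,
    )
    # The output schema is an array of strings, so join for the full text.
    print("".join(output))
```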
Input Parameters
debug Type: boolean, Default: false
Provide debugging output in logs.
top_k Type: integer, Default: 50, Range: 0 - ∞
When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens.
top_p Type: number, Default: 0.9, Range: 0 - 1
When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens.
prompt (required) Type: string
Prompt to send to CodeLlama.
lora_path Type: string, Default: https://pub-df34620a84bb4c0683fae07a260df1ea.r2.dev/sql.zip
Path to a .zip of LoRA weights.
temperature Type: number, Default: 0.75, Range: 0.01 - 5
Adjusts the randomness of outputs: values greater than 1 are more random, 0 is deterministic, and 0.75 is a good starting value.
max_new_tokens Type: integer, Default: 128, Range: 1 - ∞
Maximum number of tokens to generate. A word is generally 2-3 tokens.
stop_sequences Type: string
A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
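Note that `stop_sequences` is a single comma-separated string rather than a list. A minimal sketch of how such a value would be split into individual stop strings (the splitting logic shown here is an assumption about server-side parsing, not confirmed behavior):

```python
# `stop_sequences` arrives as one comma-separated string; splitting on
# commas (assumed server-side behavior) yields the individual stop strings.
stop_sequences = "</s>,<end>,<stop>"
stops = stop_sequences.split(",")
```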
Output Schema

Output

Type: array · Items Type: string
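Because the output is an array of string chunks (the model streams its completion), the full text is recovered by concatenation. A minimal sketch with illustrative chunks:

```python
# The model returns its completion as a list of string chunks;
# concatenating them yields the final SQL text.
chunks = ["SELECT ", "COUNT(decile) ", "FROM table_name_34"]  # illustrative only
sql = "".join(chunks)
```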

Example Execution Logs
Weights path: https://pub-df34620a84bb4c0683fae07a260df1ea.r2.dev/sql.zip
Downloading peft weights
using https://pub-df34620a84bb4c0683fae07a260df1ea.r2.dev/sql.zip instead of https://pub-df34620a84bb4c0683fae07a260df1ea.r2.dev/sql.zip
Downloaded sql.zip as 12 519 kB chunks in 1.0834 with 0 retries
Downloaded peft weights in 1.084
Unzipped peft weights in 0.083
Data keys: dict_keys(['adapter_config.json', 'adapter_model.bin'])
Prompt:
You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.
You must output the SQL query that answers the question.
### Input:
What is the total number of decile for the redwood school locality?
### Context:
CREATE TABLE table_name_34 (decile VARCHAR, name VARCHAR)
### Response:
INFO 09-25 18:32:09 async_llm_engine.py:371] Received request 0: prompt: 'You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables. \n\nYou must output the SQL query that answers the question.\n\n### Input:\nWhat is the total number of decile for the redwood school locality?\n\n### Context:\nCREATE TABLE table_name_34 (decile VARCHAR, name VARCHAR)\n\n### Response:\n', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.75, top_p=0.9, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None), prompt token ids: None.
INFO 09-25 18:32:09 llm_engine.py:623] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.4%, CPU KV cache usage: 0.0%
INFO 09-25 18:32:09 async_llm_engine.py:111] Finished request 0.
Generated text: SELECT COUNT(decile) FROM table_name_34 WHERE name = "redwood school"
Generated 22 tokens in 0.416 seconds (52.948 tokens per second)
Version Details
Version ID
559ff8c30789d100c13f9bd0f831210f6f9c6f1c81dab06ff16dc61dbfa94b03
Version Created
September 25, 2023