ibm-granite/granite-20b-code-instruct-8k
About
Join the Granite community, where you can find numerous recipe workbooks to help you get started with a wide variety of use cases for this model: https://github.com/ibm-granite-community
Example Output
Prompt:
"Could you please explain what APR means?"
Output:
APR stands for Annual Percentage Rate. It is a figure that represents the annual cost of borrowing, including fees and interest. It is used to help consumers compare the costs of different loans and credit products. The higher the APR, the more expensive the loan is.
Performance Metrics
- Prediction Time: 1.18s
- Total Time: 1.18s
All Input Parameters
```json
{
  "top_p": 0.9,
  "prompt": "Could you please explain what APR means?",
  "max_tokens": 512,
  "min_tokens": 0,
  "temperature": 0.6,
  "system_prompt": "You are an expert in finance that knows many concepts related to loans and credit.",
  "presence_penalty": 0,
  "frequency_penalty": 0
}
```
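For reference, the request above can be reproduced programmatically. The sketch below assumes the model is hosted on Replicate and uses the `replicate` Python client; the client call and model reference are assumptions based on this page, not part of the model card itself.

```python
import replicate  # assumed client; install with `pip install replicate`

# Minimal sketch of reproducing the request shown above. The parameter names
# mirror the "All Input Parameters" JSON; the values are the ones from the example.
output = replicate.run(
    "ibm-granite/granite-20b-code-instruct-8k",  # assumed model reference
    input={
        "prompt": "Could you please explain what APR means?",
        "system_prompt": "You are an expert in finance that knows many concepts related to loans and credit.",
        "max_tokens": 512,
        "min_tokens": 0,
        "temperature": 0.6,
        "top_p": 0.9,
        "presence_penalty": 0,
        "frequency_penalty": 0,
    },
)
# Language models on Replicate typically stream output as an iterable of strings.
print("".join(output))
```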
Input Parameters
- seed: Random seed. Leave blank to randomize the seed.
- top_k: The number of highest-probability tokens to consider when generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering).
- top_p: A probability threshold for generating the output. If < 1.0, only the top tokens with cumulative probability >= top_p are kept (nucleus filtering), as described in Holtzman et al. (http://arxiv.org/abs/1904.09751). See the sampling sketch after this list.
- prompt: The prompt to send to the model.
- max_tokens: The maximum number of tokens the model should generate as output.
- min_tokens: The minimum number of tokens the model should generate as output.
- temperature: The value used to modulate the next-token probabilities. See the sampling sketch after this list.
- system_prompt: System prompt to send to the model. This is prepended to the prompt and helps guide system behavior. Ignored for non-chat models.
- stop_sequences: A comma-separated list of sequences at which to stop generation. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
- prompt_template: A template used to format the prompt. If not provided, the default prompt template is used.
- presence_penalty: Presence penalty; penalizes tokens that have already appeared in the output.
- frequency_penalty: Frequency penalty; penalizes tokens in proportion to how often they have already appeared in the output.
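The sampling parameters above (top_k, top_p, temperature) all act on the next-token distribution before a token is drawn. The sketch below is an illustrative implementation of that filtering under the standard definitions, not the model's actual code; the function name and the use of NumPy are assumptions.

```python
import numpy as np

def filter_next_token_probs(logits, top_k=0, top_p=1.0, temperature=1.0):
    """Illustrative top-k / top-p (nucleus) filtering over raw logits."""
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k > 0:
        # Keep only the top_k highest-scoring tokens (top-k filtering).
        k = min(top_k, logits.size)
        cutoff = np.sort(logits)[-k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    # Softmax over the (possibly masked) logits.
    probs = np.exp(logits - np.max(logits))
    probs /= probs.sum()
    if top_p < 1.0:
        # Keep the smallest set of tokens whose cumulative probability >= top_p
        # (nucleus filtering, Holtzman et al. 2019).
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cumulative, top_p) + 1]
        mask = np.zeros(probs.shape, dtype=bool)
        mask[keep] = True
        probs = np.where(mask, probs, 0.0)
        probs /= probs.sum()
    return probs

# Example: the request above used temperature=0.6 and top_p=0.9.
probs = filter_next_token_probs([2.0, 1.0, 0.5, -1.0], top_p=0.9, temperature=0.6)
```

With top_p = 0.9 and temperature = 0.6, as in the example request, the distribution is first sharpened (temperature < 1) and then truncated to the smallest set of tokens covering 90% of the probability mass.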
Output Schema
Example Execution Logs
INFO 10-17 16:34:20 async_llm_engine.py:173] Added request f21156ed57234d8585b9b09777df8582.
INFO 10-17 16:34:21 metrics.py:396] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%.
Generation took 1729182595.16s
Formatted prompt: System: You are an expert in finance that knows many concepts related to loans and credit.
Question: Could you please explain what APR means?
Answer:
Random seed used: `99346`
Note: Random seed will not impact output if greedy decoding is used.
INFO 10-17 16:34:22 async_llm_engine.py:140] Finished request f21156ed57234d8585b9b09777df8582.
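The "Formatted prompt" line in the log shows how the system prompt and the user prompt were combined before generation. The sketch below reconstructs that "System / Question / Answer" layout as seen in the log; the model's actual default prompt template may differ, and the function name is illustrative.

```python
def format_prompt(prompt, system_prompt=None):
    """Illustrative reconstruction of the formatting seen in the log above."""
    parts = []
    if system_prompt:
        parts.append(f"System: {system_prompt}")
    parts.append(f"Question: {prompt}")
    parts.append("Answer:")
    return "\n".join(parts)

print(format_prompt(
    "Could you please explain what APR means?",
    "You are an expert in finance that knows many concepts related to loans and credit.",
))
```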
Version Details
- Version ID: 409a0c68b74df416c7ae2a3f1552101123356f5a2c6e46d681629b62904c605b
- Version Created: August 20, 2024
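To reproduce results against this exact build, a prediction can be pinned to the version ID above. This sketch again assumes Replicate hosting and its "owner/model:version" reference convention.

```python
import replicate

# Pin the request to the specific version listed in Version Details.
output = replicate.run(
    "ibm-granite/granite-20b-code-instruct-8k:"
    "409a0c68b74df416c7ae2a3f1552101123356f5a2c6e46d681629b62904c605b",
    input={"prompt": "Could you please explain what APR means?"},
)
print("".join(output))
```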