ibm-granite/granite-8b-code-instruct-128k 🔢📝 → 📝

⭐ Official ▶️ 550.5K runs 📅 Aug 2024 ⚙️ Cog 0.10.0-alpha11 🔗 GitHub 📄 Paper ⚖️ License
code-generation long-context text-generation

About

Join the Granite community, where you can find recipe workbooks to help you get started with a wide variety of use cases for this model: https://github.com/ibm-granite-community

Example Output

Prompt:

"Now, please explain APR meaning with a simple example, so that even a 5 years old kid can understand you."

Output

APR stands for Annual Percentage Rate. It's a way to show how much interest you'll pay on a loan over a year. Let's say you take out a loan of $100 at a 5% APR. That means you'll pay $5 in interest every year. So, after one year, you'll have paid $105. After two years, you'll have paid $110.25, and so on. The more you pay over the life of the loan, the more interest you'll pay in the end.

Performance Metrics

1.07s Prediction Time
1.07s Total Time
All Input Parameters
{
  "top_p": 0.9,
  "prompt": "Now, please explain APR meaning with a simple example, so that even a 5 years old kid can understand you.",
  "max_tokens": 512,
  "min_tokens": 0,
  "temperature": 0.6,
  "system_prompt": "You are an expert in finance that knows many concepts related to loans and credit.",
  "presence_penalty": 0,
  "frequency_penalty": 0
}
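The parameters above can be assembled and sent from Python. The following is a minimal sketch, assuming the official `replicate` client package is installed and a `REPLICATE_API_TOKEN` environment variable is set; the helper names `build_input` and `run_model` are hypothetical and exist only for this illustration. Depending on the client version, the model reference may need a version ID appended.

```python
MODEL_REF = "ibm-granite/granite-8b-code-instruct-128k"


def build_input(prompt, system_prompt="You are a helpful assistant.", **overrides):
    """Return an input dict that starts from the defaults listed on this page."""
    payload = {
        "prompt": prompt,
        "system_prompt": system_prompt,
        "top_p": 0.9,
        "max_tokens": 512,
        "min_tokens": 0,
        "temperature": 0.6,
        "presence_penalty": 0,
        "frequency_penalty": 0,
    }
    payload.update(overrides)  # caller-supplied values win over the defaults
    return payload


def run_model(payload):
    """Send the payload to the hosted model and return the generated text.

    Assumption: the `replicate` package is installed and REPLICATE_API_TOKEN is set.
    """
    import replicate  # imported lazily so build_input works without the package

    chunks = replicate.run(MODEL_REF, input=payload)
    return "".join(chunks)  # the output schema is an array of strings


payload = build_input(
    "Now, please explain APR meaning with a simple example, "
    "so that even a 5 years old kid can understand you.",
    system_prompt="You are an expert in finance that knows many concepts "
    "related to loans and credit.",
)
```

Calling `run_model(payload)` performs the network request; `build_input` on its own has no side effects and can be used to inspect the payload first.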
Input Parameters
seed Type: integer
Random seed. Leave blank to randomize the seed.
top_k Type: integer, Default: 50
The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).
top_p Type: number, Default: 0.9
A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
prompt Type: string, Default: (empty)
Prompt
max_tokens Type: integer, Default: 512
The maximum number of tokens the model should generate as output.
min_tokens Type: integer, Default: 0
The minimum number of tokens the model should generate as output.
temperature Type: number, Default: 0.6
The value used to modulate the next token probabilities.
system_prompt Type: string, Default: You are a helpful assistant.
System prompt to send to the model. This is prepended to the prompt and helps guide system behavior. Ignored for non-chat models.
stop_sequences Type: string
A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
prompt_template Type: string
A template to format the prompt with. If not provided, the default prompt template will be used.
presence_penalty Type: number, Default: 0
Presence penalty
frequency_penalty Type: number, Default: 0
Frequency penalty
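To make the interaction of temperature, top_k, and top_p more concrete, here is a minimal pure-Python sketch of the filtering described above. It is an illustration under stated assumptions, not the code the model server runs: the function name `filter_probs` is hypothetical, and the order of operations (temperature, then top-k, then top-p on the renormalized remainder) is one common convention.

```python
import math


def filter_probs(logits, temperature=0.6, top_k=50, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p filtering.

    Assumes temperature > 0. Illustrative only; real inference engines work on tensors.
    """
    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens (disabled when top_k <= 0).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose cumulative share reaches top_p.
    if top_p < 1.0:
        mass_after_top_k = sum(probs[i] for i in ranked)
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i] / mass_after_top_k
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize the surviving tokens so their probabilities sum to 1.
    mass = sum(probs[i] for i in ranked)
    return {i: probs[i] / mass for i in ranked}


candidates = filter_probs([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=3, top_p=0.9)
```

In this toy call, top_k=3 drops the least likely of the four tokens, and top_p=0.9 then keeps only the two most likely ones, because together they already cover at least 90% of the remaining probability mass.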
Output Schema

Output

Type: array, Items Type: string
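Because the output is an array of strings rather than a single string, a caller typically concatenates the pieces to recover the full text. A minimal sketch follows; the chunk boundaries shown are made up for illustration, and `join_output` is a hypothetical helper name.

```python
def join_output(chunks):
    """Concatenate the string items of the output array into one text."""
    return "".join(chunks)


# Made-up chunk boundaries; the real split points depend on how the server streams tokens.
chunks = ["APR stands ", "for Annual ", "Percentage Rate."]
text = join_output(chunks)
```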

Example Execution Logs
INFO 10-17 16:37:11 async_llm_engine.py:173] Added request f1f5aa36bcc743ddaa7ded097c291c8b.
INFO 10-17 16:37:11 metrics.py:396] Avg prompt throughput: 0.5 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%.
Generation took 1729182640.09s
Formatted prompt: System:
You are an expert in finance that knows many concepts related to loans and credit.

Question:
Now, please explain APR meaning with a simple example, so that even a 5 years old kid can understand you.

Answer:
Random seed used: `56907`
Note: Random seed will not impact output if greedy decoding is used.
INFO 10-17 16:37:12 async_llm_engine.py:140] Finished request f1f5aa36bcc743ddaa7ded097c291c8b.
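The "Formatted prompt" lines in the log above suggest how the system prompt and user prompt are combined before generation. The sketch below reconstructs that layout; it is inferred from the log only, so the placeholder names `{system_prompt}` and `{prompt}`, the constant `PROMPT_TEMPLATE`, and the exact trailing whitespace after "Answer:" are assumptions rather than the model's documented default template.

```python
# Assumed template, reconstructed from the "Formatted prompt" lines in the log.
PROMPT_TEMPLATE = "System:\n{system_prompt}\n\nQuestion:\n{prompt}\n\nAnswer:\n"


def format_prompt(prompt, system_prompt="You are a helpful assistant.",
                  template=PROMPT_TEMPLATE):
    """Fill the template with the system prompt and the user prompt."""
    return template.format(system_prompt=system_prompt, prompt=prompt)


formatted = format_prompt(
    "Now, please explain APR meaning with a simple example, "
    "so that even a 5 years old kid can understand you.",
    system_prompt="You are an expert in finance that knows many concepts "
    "related to loans and credit.",
)
```

A custom string passed through the prompt_template input would presumably replace this layout, which can be useful when a different instruction format is wanted.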
Version Details
Version ID
797c070dc871d8fca417d7d188cf050778d7ce21a0318d26711a54207e9ee698
Version Created
August 22, 2024