ibm-granite/granite-3.0-8b-instruct 🔢📝 → 📝

⭐ Official ▶️ 181.4K runs 📅 Oct 2024 ⚙️ Cog 0.10.0-alpha21 🔗 GitHub 📄 Paper ⚖️ License
code-generation text-generation text-translation

About

Granite-3.0-8B-Instruct is a lightweight, open-source 8B-parameter model designed to excel at instruction-following tasks such as summarization, problem solving, text translation, reasoning, code tasks, function calling, and more.
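A minimal sketch of calling the model through the Replicate Python client. `replicate.run` is the client's standard entry point; the `build_input` helper is purely illustrative, and its defaults mirror the parameter table below. The actual API call requires a `REPLICATE_API_TOKEN` and is left commented out.

```python
MODEL = "ibm-granite/granite-3.0-8b-instruct"

def build_input(prompt, system_prompt="You are a helpful assistant.",
                max_tokens=512, temperature=0.6, top_k=50, top_p=0.9):
    """Assemble an input payload using the defaults documented on this page."""
    return {
        "prompt": prompt,
        "system_prompt": system_prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p,
    }

payload = build_input(
    "Could you please explain what APR means?",
    system_prompt="You are an expert in finance that knows many "
                  "concepts related to loans and credit.",
)

# Requires REPLICATE_API_TOKEN in the environment; uncomment to run:
# import replicate
# output = replicate.run(MODEL, input=payload)  # returns a list of strings
# print("".join(output))
```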

Example Output

Prompt:

"Could you please explain what APR means?"

Output

APR stands for Annual Percentage Rate. It is the annual interest rate charged for borrowing, expressed as a single percentage number that represents the actual yearly cost of funds over the term of a loan. It includes any fees or additional costs associated with the transaction.

Performance Metrics

3.01s Prediction Time
46.19s Total Time
All Input Parameters
{
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "Could you please explain what APR means?",
  "max_tokens": 512,
  "min_tokens": 0,
  "temperature": 0.6,
  "system_prompt": "You are an expert in finance that knows many concepts related to loans and credit.",
  "presence_penalty": 0,
  "frequency_penalty": 0
}
Input Parameters
top_k Type: integer Default: 50
The number of highest-probability tokens to consider for generating the output. If > 0, only keep the top k tokens with the highest probability (top-k filtering).
top_p Type: number Default: 0.9
A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
prompt Type: string Default: (empty)
Prompt
max_tokens Type: integer Default: 512
The maximum number of tokens the model should generate as output.
min_tokens Type: integer Default: 0
The minimum number of tokens the model should generate as output.
temperature Type: number Default: 0.6
The value used to modulate the next-token probabilities.
system_prompt Type: string Default: You are a helpful assistant.
System prompt to send to the model. This is prepended to the prompt and helps guide system behavior. Ignored for non-chat models.
stop_sequences Type: string
A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
presence_penalty Type: number Default: 0
Presence penalty
frequency_penalty Type: number Default: 0
Frequency penalty
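The top_k and top_p parameters above can be sketched as a simple filtering step over a next-token distribution. This is a self-contained illustration of the technique, not the model's actual sampling code:

```python
def top_k_top_p_filter(probs, top_k=50, top_p=0.9):
    """Apply top-k then nucleus (top-p) filtering to a distribution.

    probs: list of (token, probability) pairs, assumed normalized.
    Returns the kept pairs with probabilities renormalized.
    """
    # Sort tokens by probability, highest first.
    ranked = sorted(probs, key=lambda kv: kv[1], reverse=True)

    # Top-k: keep only the k most probable tokens (if top_k > 0).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability
    # reaches the top_p threshold.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    total = sum(p for _, p in kept)
    return [(t, p / total) for t, p in kept]
```

With top_p=0.9, a distribution like {a: 0.5, b: 0.3, c: 0.15, d: 0.05} keeps a, b, and c (cumulative 0.95) and drops d; sampling then proceeds from the renormalized set.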
Output Schema

Output

Type: array, Items type: string
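Because the output is an array of strings (streamed text chunks), joining the items recovers the full completion. The chunk values below are illustrative:

```python
# Hypothetical chunks as they might arrive from a streamed prediction.
chunks = ["APR stands for ", "Annual Percentage Rate."]

# Concatenate the array items to reconstruct the complete output text.
text = "".join(chunks)
```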

Example Execution Logs
INFO:     ::1:39096 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Generation took 3.01s
Formatted prompt: Could you please explain what APR means?
Version Details
Version ID
8d8fb55950fb8eb2817fc078b7b05a0bd3ecc612d6332d8009fb0c007839192e
Version Created
October 16, 2024
Run on Replicate →