ibm-granite/granite-4.0-h-small
About
Granite-4.0-H-Small is a 32B-parameter long-context instruct model fine-tuned from Granite-4.0-H-Small-Base using a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets.

Example Output
"What is Pi?"
Output
Pi (π) is a mathematical constant that represents the ratio of a circle's circumference to its diameter. It is an irrational number, which means it cannot be expressed as a simple fraction and its decimal representation goes on forever without repeating.
The value of pi is approximately 3.14159, but it has been calculated to trillions of digits beyond the decimal point. For most practical purposes, using pi to a few decimal places (such as 3.14159 or 3.14) is sufficient.
Pi is used in various mathematical and scientific calculations, including:
1. Geometry: Calculating the circumference, area, and volume of circles, spheres, and cylinders.
2. Trigonometry: Defining the sine, cosine, and tangent functions.
3. Calculus: In integrals and series expansions.
4. Physics: Describing wave motion, oscillations, and circular motion.
5. Engineering: Designing structures and systems involving circular or spherical components.
The symbol π was first used to represent this constant by Welsh mathematician William Jones in 1706, and it was later popularized by Swiss mathematician Leonhard Euler.
All Input Parameters
{
  "stop": [],
  "tools": [],
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "What is Pi?",
  "stream": false,
  "messages": [],
  "documents": [],
  "min_tokens": 0,
  "temperature": 0.6,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "chat_template_kwargs": {},
  "add_generation_prompt": true
}
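A minimal sketch of how a request payload like the one above could be assembled in Python. The helper names `DEFAULT_INPUT`, `build_input`, and `run_model` are hypothetical and not part of this page. The defaults are copied from the example request. The `run_model` helper assumes the `replicate` Python client is installed and that the `REPLICATE_API_TOKEN` environment variable is set; the shape of the returned value is not documented on this page, so it is returned unchanged.

```python
import copy

# Defaults copied from the "All Input Parameters" example on this page.
DEFAULT_INPUT = {
    "stop": [],
    "tools": [],
    "top_k": 50,
    "top_p": 0.9,
    "stream": False,
    "messages": [],
    "documents": [],
    "min_tokens": 0,
    "temperature": 0.6,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "chat_template_kwargs": {},
    "add_generation_prompt": True,
}


def build_input(prompt, **overrides):
    """Return a fresh input dict for `prompt`, with optional overrides.

    Hypothetical helper: it only merges dictionaries and does not
    validate parameter names or values.
    """
    payload = copy.deepcopy(DEFAULT_INPUT)  # avoid sharing the mutable lists
    payload["prompt"] = prompt
    payload.update(overrides)
    return payload


def run_model(payload):
    """Send the payload to the hosted model (assumes the `replicate` client)."""
    import replicate  # imported lazily so build_input works without it

    return replicate.run("ibm-granite/granite-4.0-h-small", input=payload)


if __name__ == "__main__":
    # Build and print a payload without calling the API.
    print(build_input("What is Pi?", max_completion_tokens=256))
```

Keeping payload construction separate from the network call makes it possible to inspect or log the exact parameters before sending a request.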
Input Parameters
- seed: Random seed. Leave unspecified to randomize the seed.
- stop: A list of sequences to stop generation at. For example, ["<end>","<stop>"] will stop generation at the first instance of "<end>" or "<stop>".
- tools: Tools for the request. Passed to the chat template.
- top_k: The number of highest-probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).
- top_p: A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
- prompt: Completion API user prompt.
- stream: Request a streaming response. Defaults to False.
- messages: Chat completion API messages.
- documents: Documents for the request. Passed to the chat template.
- max_tokens: Deprecated in favor of the max_completion_tokens field.
- min_tokens: The minimum number of tokens the model should generate as output.
- temperature: The value used to modulate the next-token probabilities.
- tool_choice: Tool choice for the request. If the choice is a specific function, this should be specified as a JSON string.
- chat_template: A template to format the prompt with. If not specified, the chat template provided by the model will be used.
- system_prompt: Completion API system prompt. The chat template provides a good default.
- response_format: An object specifying the format that the model must output.
- presence_penalty: Presence penalty. Positive values penalize tokens that have already appeared in the generated text, regardless of how often they appeared.
- frequency_penalty: Frequency penalty. Positive values penalize tokens in proportion to how often they have already appeared in the generated text.
- chat_template_kwargs: Additional arguments to be passed to the chat template.
- add_generation_prompt: Add generation prompt. Passed to the chat template. Defaults to True.
- max_completion_tokens: An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
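The interaction of temperature, top_k, and top_p described above can be illustrated with a small, self-contained sketch. This is not the serving code behind this model; `filter_next_token_probs` is a hypothetical function that assumes temperature scaling is applied first, then top-k filtering, then nucleus filtering over the renormalized top-k probabilities.

```python
import math


def filter_next_token_probs(logits, temperature=0.6, top_k=50, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Simplified illustration only; assumes temperature > 0.
    """
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k filtering: keep only the k most probable tokens (if top_k > 0).
    if top_k > 0:
        order = order[:top_k]
    kept_mass = sum(probs[i] for i in order)
    renormed = {i: probs[i] / kept_mass for i in order}

    # Nucleus filtering: keep the smallest prefix whose cumulative
    # probability reaches top_p (if top_p < 1.0).
    if top_p < 1.0:
        nucleus, cumulative = [], 0.0
        for i in order:
            nucleus.append(i)
            cumulative += renormed[i]
            if cumulative >= top_p:
                break
        order = nucleus

    # Renormalize over the surviving tokens so they sum to 1.
    final_mass = sum(renormed[i] for i in order)
    return {i: renormed[i] / final_mass for i in order}


if __name__ == "__main__":
    print(filter_next_token_probs([2.0, 1.0, 0.0, -1.0], temperature=1.0,
                                  top_k=3, top_p=0.8))
```

With the four example logits, top_k=3 drops the least likely token and top_p=0.8 then keeps only the two most likely ones, which are renormalized before sampling.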
Output Schema
Output
Example Execution Logs
INFO 09-25 18:36:18 [logger.py:40] Received request cmpl-1-0: prompt: None, params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.9, top_k=50, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=512, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None), prompt_token_ids: None, prompt_embeds shape: None, lora_request: None.
DEBUG 09-25 18:36:22 [predict.py:558] finish_reason=stop response_text=Pi (π) is a mathematical constant that represents the ratio of a circle's circumference to its diameter. It is an irrational number, which means it cannot be expressed as a simple fraction and its decimal representation goes on forever without repeating. The value of pi is approximately 3.14159, but it has been calculated to trillions of digits beyond the decimal point. For most practical purposes, using pi to a few decimal places (such as 3.14159 or 3.14) is sufficient. Pi is used in various mathematical and scientific calculations, including: 1. Geometry: Calculating the circumference, area, and volume of circles, spheres, and cylinders. 2. Trigonometry: Defining the sine, cosine, and tangent functions. 3. Calculus: In integrals and series expansions. 4. Physics: Describing wave motion, oscillations, and circular motion. 5. Engineering: Designing structures and systems involving circular or spherical components. The symbol π was first used to represent this constant by Welsh mathematician William Jones in 1706, and it was later popularized by Swiss mathematician Leonhard Euler.
INFO 09-25 18:36:22 [predict.py:569] Generation took 3.89s
DEBUG 09-25 18:36:22 [predict.py:572] prompt_tokens=12, completion_tokens=232, total_tokens=244
/srv/r8/monobase/cog/latest/cogletlatest-python3.12/lib/python3.12/site-packages/coglet/scope.py:35: ExperimentalFeatureWarning: current_scope is an experimental internal function. It may change or be removed without warning.
  warnings.warn(
INFO 09-25 18:36:22 [predict.py:585] predict() complete
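The timing and token counts in these logs can be turned into a rough throughput figure. The sketch below is an illustration only: `summarize_log` is a hypothetical helper, and it assumes the log lines keep the "Generation took ...s" and "prompt_tokens=..., completion_tokens=..., total_tokens=..." wording shown above. The reported time may include prompt processing, so the result is an approximate rate rather than a pure decoding speed.

```python
import re


def summarize_log(log_text):
    """Extract generation time and token counts from the log text.

    Returns a dict, or None if either expected log entry is missing.
    """
    took = re.search(r"Generation took ([\d.]+)s", log_text)
    usage = re.search(
        r"prompt_tokens=(\d+), completion_tokens=(\d+), total_tokens=(\d+)",
        log_text,
    )
    if took is None or usage is None:
        return None

    seconds = float(took.group(1))
    prompt_tokens, completion_tokens, total_tokens = map(int, usage.groups())
    return {
        "seconds": seconds,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": total_tokens,
        # Approximate output rate over the whole generation call.
        "completion_tokens_per_second": completion_tokens / seconds,
    }


if __name__ == "__main__":
    sample = (
        "INFO 09-25 18:36:22 [predict.py:569] Generation took 3.89s\n"
        "DEBUG 09-25 18:36:22 [predict.py:572] prompt_tokens=12, "
        "completion_tokens=232, total_tokens=244\n"
    )
    print(summarize_log(sample))
```

For the run logged above, 232 completion tokens in 3.89 s works out to roughly 60 tokens per second.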
Version Details
- Version ID: 79cd422742e4dbd543569aae06db8d5c55bc932153501c2f1bd6213e78cd427d
- Version Created: October 7, 2025