paragekbote/gemma3-torchao-quant-sparse πŸ”’πŸ“πŸ–ΌοΈ β†’ πŸ“

▢️ 32 runs πŸ“… Sep 2025 βš™οΈ Cog 0.16.7
image-analysis image-to-text text-generation

About

An optimized gemma-3-4b setup with INT8 weight-only quantization (via torchao), torch.compile, and weight sparsity for efficient inference.
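
The core of such a setup can be sketched in a few lines with torchao's quantization API and torch.compile. This is a minimal illustration, not the repo's exact code; the checkpoint ID and model class are assumptions:

import torch
from transformers import AutoModelForCausalLM
from torchao.quantization import quantize_, int8_weight_only

# Load the base model (the actual predictor presumably uses a multimodal
# Gemma 3 class, since the model accepts an image_url input).
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-4b-it",  # assumed checkpoint
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

# INT8 weight-only quantization: Linear weights are stored as int8 while
# activations stay in bf16, trading a little accuracy for memory bandwidth.
quantize_(model, int8_weight_only())

# Compile the forward pass; the first call pays a one-time warm-up cost.
model = torch.compile(model, mode="max-autotune")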

Example Output

Prompt:

"Describe the image in the photo. What type of breads are in the image and from what region are they?"

Output

Okay, let's break down the image and the likely types of bread and their potential origins.

Image Description:

The photo showcases a beautiful, almost gallery-like, display of a wide variety of freshly baked breads and pastries. It's clearly in a bakery or a high-end food display area, likely in a hotel, restaurant, or a specialty shop. The focus is on the presentation and the quality of the goods.

Here's a breakdown of the types of breads and the possibilities of their origins, based on the image:

Types of Breads (Based on the Visuals):

  • Sourdough (Dark, Round Loaves): There are at least two dark, rustic-looking sourdough loaves. These are almost certainly from a European (particularly French or Scandinavian) tradition, known for their tangy flavor and dense texture.
  • Baguettes (Long, Thin Loaves): Several long, slender baguettes are present. These are a classic French bread.
  • Croissants (Triangle-Shaped Pastries): A large quantity of golden-brown croissants. These are, of course, a classic French pastry.
  • Pain au Chocolat (Dark, Layered Pastries): These look like a cross between a croissant and a chocolate-filled pastry, and are a common French/European treat.
  • Brioche (Round, Golden Loaves): These appear to be round, golden-brown loaves, likely made with a rich, buttery dough. Brioche is traditionally from France.
  • Pão de Queijo (Cheese Rolls): A few of these are visible, and they're a specialty of Brazil.
  • Rolls (Various Shapes and Sizes): There are a number of rolls, some with a glaze or a sweet topping.
  • Pastries with Fruit (Red and Pink): A display of what appear to be small, fruit-filled pastries.

Possible Regional Origins (Based on the Bread Variety):

  • French Influence (Strong): The high presence of baguettes, croissants, and pain au chocolat strongly suggests a French influence. This is very common in many high-quality bakeries.
  • Scandinavian/European Sourdough: The dark, rustic sourdoughs likely point to a Scandinavian or European tradition, where these types of bread are

Performance Metrics

28.34s Prediction Time
131.82s Total Time
All Input Parameters
{
  "seed": 42,
  "top_p": 0.9,
  "prompt": "Describe the image in the photo. What type of breads are in the image and from what region are they?",
  "image_url": "https://images.pexels.com/photos/29380151/pexels-photo-29380151.jpeg",
  "temperature": 0.7,
  "use_sparsity": "true",
  "sparsity_type": "layer_norm",
  "max_new_tokens": 500,
  "sparsity_ratio": 0.3,
  "use_quantization": "true"
}
Input Parameters
seed | Type: integer | Default: 42
Seed for reproducibility
top_p | Type: number | Default: 0.9 | Range: 0 - 1
Top-p nucleus sampling
prompt (required) | Type: string
Input text prompt
image_url | Type: string
Optional image URL
temperature | Type: number | Default: 0.7 | Range: 0 - 2
Sampling temperature
use_sparsity | Type: string | Default: false
Enable sparsity optimization
sparsity_type | Type: string | Default: magnitude
Type of sparsity: magnitude, gradual, layer_norm
max_new_tokens | Type: integer | Default: 128 | Range: 1 - 2500
Maximum number of new tokens
sparsity_ratio | Type: number | Default: 0.3 | Range: 0 - 0.8
Fraction of weights to prune in each targeted layer
use_quantization | Type: string | Default: true
Enable INT8 quantization using torchao
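
The run above corresponds to an invocation like the following with the Replicate Python client; this is a minimal sketch, assuming the replicate package is installed and REPLICATE_API_TOKEN is set. Note that use_sparsity and use_quantization are string-typed in this schema:

import replicate

output = replicate.run(
    "paragekbote/gemma3-torchao-quant-sparse",
    input={
        "seed": 42,
        "top_p": 0.9,
        "prompt": "Describe the image in the photo. What type of breads are in the image and from what region are they?",
        "image_url": "https://images.pexels.com/photos/29380151/pexels-photo-29380151.jpeg",
        "temperature": 0.7,
        "use_sparsity": "true",  # string, per the schema above
        "sparsity_type": "layer_norm",
        "max_new_tokens": 500,
        "sparsity_ratio": 0.3,
        "use_quantization": "true",  # string, per the schema above
    },
)
print(output)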
Output Schema

Output

Type: string

Example Execution Logs
Applying layer_norm sparsity with ratio 0.3

Analyzing weight norms (fast method)...
Analyzed 285 layers
  Layer 'model.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj' (norm: 0.000027): pruned 396874/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj' (norm: 0.000012): pruned 396627/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj' (norm: 0.000022): pruned 397385/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj' (norm: 0.000021): pruned 397761/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj' (norm: 0.000014): pruned 398029/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj' (norm: 0.000019): pruned 397758/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj' (norm: 0.000021): pruned 397268/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj' (norm: 0.000016): pruned 397612/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj' (norm: 0.000021): pruned 397838/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj' (norm: 0.000020): pruned 397704/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj' (norm: 0.000018): pruned 397917/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj' (norm: 0.000020): pruned 397138/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj' (norm: 0.000020): pruned 397412/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj' (norm: 0.000018): pruned 398095/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj' (norm: 0.000020): pruned 397723/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj' (norm: 0.000020): pruned 396857/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj' (norm: 0.000018): pruned 397651/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj' (norm: 0.000020): pruned 396869/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj' (norm: 0.000019): pruned 396330/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj' (norm: 0.000018): pruned 396762/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj' (norm: 0.000019): pruned 396019/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj' (norm: 0.000019): pruned 397176/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj' (norm: 0.000018): pruned 397401/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj' (norm: 0.000019): pruned 396086/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj' (norm: 0.000019): pruned 397019/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj' (norm: 0.000019): pruned 398051/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj' (norm: 0.000020): pruned 395838/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj' (norm: 0.000020): pruned 395942/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj' (norm: 0.000018): pruned 396987/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj' (norm: 0.000020): pruned 397658/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj' (norm: 0.000020): pruned 396190/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj' (norm: 0.000018): pruned 397278/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj' (norm: 0.000020): pruned 397786/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj' (norm: 0.000019): pruned 395439/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj' (norm: 0.000018): pruned 396925/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj' (norm: 0.000020): pruned 397273/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj' (norm: 0.000019): pruned 398077/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj' (norm: 0.000018): pruned 397021/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj' (norm: 0.000020): pruned 396861/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj' (norm: 0.000020): pruned 397093/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj' (norm: 0.000018): pruned 397462/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj' (norm: 0.000020): pruned 397713/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj' (norm: 0.000020): pruned 397829/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj' (norm: 0.000018): pruned 397377/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj' (norm: 0.000019): pruned 396923/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj' (norm: 0.000020): pruned 395790/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj' (norm: 0.000018): pruned 398112/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj' (norm: 0.000020): pruned 398098/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj' (norm: 0.000019): pruned 397681/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj' (norm: 0.000018): pruned 398104/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj' (norm: 0.000019): pruned 396414/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj' (norm: 0.000019): pruned 397537/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj' (norm: 0.000019): pruned 396238/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj' (norm: 0.000019): pruned 395986/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj' (norm: 0.000019): pruned 396103/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj' (norm: 0.000019): pruned 395889/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj' (norm: 0.000019): pruned 397102/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj' (norm: 0.000019): pruned 396990/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj' (norm: 0.000019): pruned 395831/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj' (norm: 0.000019): pruned 396243/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj' (norm: 0.000019): pruned 395576/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj' (norm: 0.000019): pruned 396755/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj' (norm: 0.000019): pruned 396884/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj' (norm: 0.000019): pruned 396582/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj' (norm: 0.000019): pruned 395829/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj' (norm: 0.000019): pruned 397988/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj' (norm: 0.000019): pruned 396898/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj' (norm: 0.000020): pruned 395671/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj' (norm: 0.000019): pruned 395743/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj' (norm: 0.000019): pruned 396370/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj' (norm: 0.000020): pruned 395492/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj' (norm: 0.000019): pruned 396073/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.24.self_attn.k_proj' (norm: 0.000019): pruned 395655/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.24.self_attn.v_proj' (norm: 0.000021): pruned 397597/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.24.self_attn.q_proj' (norm: 0.000019): pruned 397599/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.25.self_attn.k_proj' (norm: 0.000020): pruned 396702/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.25.self_attn.v_proj' (norm: 0.000021): pruned 397998/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.25.self_attn.q_proj' (norm: 0.000019): pruned 397291/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.26.self_attn.k_proj' (norm: 0.000022): pruned 396027/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.26.self_attn.v_proj' (norm: 0.000021): pruned 397592/1327104 weights
  Layer 'model.vision_tower.vision_model.encoder.layers.26.self_attn.q_proj' (norm: 0.000019): pruned 396151/1327104 weights
  Layer 'model.language_model.layers.0.self_attn.q_proj' (norm: 0.000008): pruned 1566745/5242880 weights
  Layer 'model.language_model.layers.0.self_attn.k_proj' (norm: 0.000014): pruned 784421/2621440 weights
  Layer 'model.language_model.layers.0.self_attn.v_proj' (norm: 0.000014): pruned 782653/2621440 weights
  Layer 'model.language_model.layers.0.mlp.gate_proj' (norm: 0.000004): pruned 7858057/26214400 weights
  Layer 'model.language_model.layers.0.mlp.up_proj' (norm: 0.000004): pruned 7863841/26214400 weights
  Layer 'model.language_model.layers.0.mlp.down_proj' (norm: 0.000001): pruned 7848683/26214400 weights
  Layer 'model.language_model.layers.1.self_attn.q_proj' (norm: 0.000005): pruned 1563891/5242880 weights
  Layer 'model.language_model.layers.1.self_attn.k_proj' (norm: 0.000020): pruned 784431/2621440 weights
  Layer 'model.language_model.layers.1.self_attn.v_proj' (norm: 0.000020): pruned 781005/2621440 weights
  Layer 'model.language_model.layers.1.mlp.gate_proj' (norm: 0.000004): pruned 7849222/26214400 weights
  Layer 'model.language_model.layers.1.mlp.up_proj' (norm: 0.000004): pruned 7851394/26214400 weights
  Layer 'model.language_model.layers.1.mlp.down_proj' (norm: 0.000001): pruned 7812359/26214400 weights
  Layer 'model.language_model.layers.2.self_attn.q_proj' (norm: 0.000006): pruned 1564813/5242880 weights
  Layer 'model.language_model.layers.2.self_attn.k_proj' (norm: 0.000018): pruned 783526/2621440 weights
  Layer 'model.language_model.layers.2.self_attn.v_proj' (norm: 0.000018): pruned 783793/2621440 weights
  Layer 'model.language_model.layers.2.mlp.gate_proj' (norm: 0.000004): pruned 7846299/26214400 weights
  Layer 'model.language_model.layers.2.mlp.up_proj' (norm: 0.000004): pruned 7864104/26214400 weights
  Layer 'model.language_model.layers.2.mlp.down_proj' (norm: 0.000001): pruned 7819838/26214400 weights
  Layer 'model.language_model.layers.3.self_attn.q_proj' (norm: 0.000005): pruned 1565343/5242880 weights
  Layer 'model.language_model.layers.3.self_attn.k_proj' (norm: 0.000020): pruned 783362/2621440 weights
  Layer 'model.language_model.layers.3.self_attn.v_proj' (norm: 0.000020): pruned 785068/2621440 weights
  Layer 'model.language_model.layers.3.mlp.gate_proj' (norm: 0.000004): pruned 7853632/26214400 weights
  Layer 'model.language_model.layers.3.mlp.up_proj' (norm: 0.000004): pruned 7837870/26214400 weights
  Layer 'model.language_model.layers.3.mlp.down_proj' (norm: 0.000002): pruned 7843756/26214400 weights
  Layer 'model.language_model.layers.4.self_attn.q_proj' (norm: 0.000006): pruned 1567364/5242880 weights
  Layer 'model.language_model.layers.4.self_attn.k_proj' (norm: 0.000016): pruned 785838/2621440 weights
  Layer 'model.language_model.layers.4.self_attn.v_proj' (norm: 0.000016): pruned 782560/2621440 weights
  Layer 'model.language_model.layers.4.mlp.gate_proj' (norm: 0.000004): pruned 7851182/26214400 weights
  Layer 'model.language_model.layers.4.mlp.up_proj' (norm: 0.000004): pruned 7838791/26214400 weights
  Layer 'model.language_model.layers.4.mlp.down_proj' (norm: 0.000001): pruned 7808281/26214400 weights
  Layer 'model.language_model.layers.5.self_attn.q_proj' (norm: 0.000004): pruned 1567668/5242880 weights
  Layer 'model.language_model.layers.5.self_attn.k_proj' (norm: 0.000024): pruned 784539/2621440 weights
  Layer 'model.language_model.layers.5.self_attn.v_proj' (norm: 0.000024): pruned 784890/2621440 weights
  Layer 'model.language_model.layers.5.mlp.gate_proj' (norm: 0.000004): pruned 7861828/26214400 weights
  Layer 'model.language_model.layers.5.mlp.up_proj' (norm: 0.000004): pruned 7852033/26214400 weights
  Layer 'model.language_model.layers.5.mlp.down_proj' (norm: 0.000001): pruned 7838535/26214400 weights
  Layer 'model.language_model.layers.6.self_attn.q_proj' (norm: 0.000005): pruned 1569561/5242880 weights
  Layer 'model.language_model.layers.6.self_attn.k_proj' (norm: 0.000020): pruned 783838/2621440 weights
  Layer 'model.language_model.layers.6.self_attn.v_proj' (norm: 0.000020): pruned 783498/2621440 weights
  Layer 'model.language_model.layers.6.mlp.gate_proj' (norm: 0.000004): pruned 7862780/26214400 weights
  Layer 'model.language_model.layers.6.mlp.up_proj' (norm: 0.000004): pruned 7846276/26214400 weights
  Layer 'model.language_model.layers.6.mlp.down_proj' (norm: 0.000002): pruned 7845041/26214400 weights
  Layer 'model.language_model.layers.7.self_attn.q_proj' (norm: 0.000008): pruned 1563312/5242880 weights
  Layer 'model.language_model.layers.7.self_attn.k_proj' (norm: 0.000013): pruned 784205/2621440 weights
  Layer 'model.language_model.layers.7.self_attn.v_proj' (norm: 0.000013): pruned 783809/2621440 weights
  Layer 'model.language_model.layers.7.mlp.gate_proj' (norm: 0.000004): pruned 7834968/26214400 weights
  Layer 'model.language_model.layers.7.mlp.up_proj' (norm: 0.000004): pruned 7859615/26214400 weights
  Layer 'model.language_model.layers.7.mlp.down_proj' (norm: 0.000002): pruned 7836583/26214400 weights
  Layer 'model.language_model.layers.8.self_attn.q_proj' (norm: 0.000008): pruned 1568054/5242880 weights
  Layer 'model.language_model.layers.8.self_attn.k_proj' (norm: 0.000013): pruned 785089/2621440 weights
  Layer 'model.language_model.layers.8.self_attn.v_proj' (norm: 0.000013): pruned 781716/2621440 weights
  Layer 'model.language_model.layers.8.mlp.gate_proj' (norm: 0.000004): pruned 7839262/26214400 weights
  Layer 'model.language_model.layers.8.mlp.up_proj' (norm: 0.000004): pruned 7863891/26214400 weights
  Layer 'model.language_model.layers.8.mlp.down_proj' (norm: 0.000002): pruned 7847451/26214400 weights
  Layer 'model.language_model.layers.9.self_attn.q_proj' (norm: 0.000005): pruned 1562532/5242880 weights
  Layer 'model.language_model.layers.9.self_attn.k_proj' (norm: 0.000020): pruned 783495/2621440 weights
  Layer 'model.language_model.layers.9.self_attn.v_proj' (norm: 0.000020): pruned 784156/2621440 weights
  Layer 'model.language_model.layers.9.mlp.gate_proj' (norm: 0.000004): pruned 7839204/26214400 weights
  Layer 'model.language_model.layers.9.mlp.up_proj' (norm: 0.000004): pruned 7850233/26214400 weights
  Layer 'model.language_model.layers.9.mlp.down_proj' (norm: 0.000002): pruned 7834925/26214400 weights
  Layer 'model.language_model.layers.10.self_attn.q_proj' (norm: 0.000008): pruned 1566576/5242880 weights
  Layer 'model.language_model.layers.10.self_attn.k_proj' (norm: 0.000014): pruned 784979/2621440 weights
  Layer 'model.language_model.layers.10.self_attn.v_proj' (norm: 0.000014): pruned 785066/2621440 weights
  Layer 'model.language_model.layers.10.mlp.gate_proj' (norm: 0.000004): pruned 7842217/26214400 weights
  Layer 'model.language_model.layers.10.mlp.up_proj' (norm: 0.000004): pruned 7848674/26214400 weights
  Layer 'model.language_model.layers.10.mlp.down_proj' (norm: 0.000001): pruned 7858524/26214400 weights
  Layer 'model.language_model.layers.11.self_attn.q_proj' (norm: 0.000005): pruned 1570390/5242880 weights
  Layer 'model.language_model.layers.11.self_attn.k_proj' (norm: 0.000021): pruned 784730/2621440 weights
  Layer 'model.language_model.layers.11.self_attn.v_proj' (norm: 0.000021): pruned 784219/2621440 weights
  Layer 'model.language_model.layers.11.mlp.gate_proj' (norm: 0.000004): pruned 7863042/26214400 weights
  Layer 'model.language_model.layers.11.mlp.up_proj' (norm: 0.000004): pruned 7860783/26214400 weights
  Layer 'model.language_model.layers.11.mlp.down_proj' (norm: 0.000001): pruned 7843536/26214400 weights
  Layer 'model.language_model.layers.12.self_attn.q_proj' (norm: 0.000006): pruned 1566366/5242880 weights
  Layer 'model.language_model.layers.12.self_attn.k_proj' (norm: 0.000018): pruned 780555/2621440 weights
  Layer 'model.language_model.layers.12.self_attn.v_proj' (norm: 0.000018): pruned 783562/2621440 weights
  Layer 'model.language_model.layers.12.mlp.gate_proj' (norm: 0.000004): pruned 7855918/26214400 weights
  Layer 'model.language_model.layers.12.mlp.up_proj' (norm: 0.000004): pruned 7839485/26214400 weights
  Layer 'model.language_model.layers.12.mlp.down_proj' (norm: 0.000002): pruned 7863318/26214400 weights
  Layer 'model.language_model.layers.13.self_attn.q_proj' (norm: 0.000005): pruned 1566527/5242880 weights
  Layer 'model.language_model.layers.13.self_attn.k_proj' (norm: 0.000023): pruned 783081/2621440 weights
  Layer 'model.language_model.layers.13.self_attn.v_proj' (norm: 0.000023): pruned 784072/2621440 weights
  Layer 'model.language_model.layers.13.mlp.gate_proj' (norm: 0.000004): pruned 7861441/26214400 weights
  Layer 'model.language_model.layers.13.mlp.up_proj' (norm: 0.000004): pruned 7858031/26214400 weights
  Layer 'model.language_model.layers.13.mlp.down_proj' (norm: 0.000002): pruned 7837261/26214400 weights
  Layer 'model.language_model.layers.14.self_attn.q_proj' (norm: 0.000005): pruned 1568322/5242880 weights
  Layer 'model.language_model.layers.14.self_attn.k_proj' (norm: 0.000020): pruned 786229/2621440 weights
  Layer 'model.language_model.layers.14.self_attn.v_proj' (norm: 0.000020): pruned 781660/2621440 weights
  Layer 'model.language_model.layers.14.mlp.gate_proj' (norm: 0.000004): pruned 7832633/26214400 weights
  Layer 'model.language_model.layers.14.mlp.up_proj' (norm: 0.000004): pruned 7849372/26214400 weights
  Layer 'model.language_model.layers.14.mlp.down_proj' (norm: 0.000002): pruned 7855301/26214400 weights
  Layer 'model.language_model.layers.15.self_attn.q_proj' (norm: 0.000005): pruned 1572146/5242880 weights
  Layer 'model.language_model.layers.15.self_attn.k_proj' (norm: 0.000023): pruned 786033/2621440 weights
  Layer 'model.language_model.layers.15.self_attn.v_proj' (norm: 0.000023): pruned 780821/2621440 weights
  Layer 'model.language_model.layers.15.mlp.gate_proj' (norm: 0.000004): pruned 7835666/26214400 weights
  Layer 'model.language_model.layers.15.mlp.up_proj' (norm: 0.000004): pruned 7843451/26214400 weights
  Layer 'model.language_model.layers.15.mlp.down_proj' (norm: 0.000001): pruned 7834040/26214400 weights
  Layer 'model.language_model.layers.16.self_attn.q_proj' (norm: 0.000006): pruned 1569784/5242880 weights
  Layer 'model.language_model.layers.16.self_attn.k_proj' (norm: 0.000017): pruned 786382/2621440 weights
  Layer 'model.language_model.layers.16.self_attn.v_proj' (norm: 0.000017): pruned 783502/2621440 weights
  Layer 'model.language_model.layers.16.mlp.gate_proj' (norm: 0.000004): pruned 7859682/26214400 weights
  Layer 'model.language_model.layers.16.mlp.up_proj' (norm: 0.000004): pruned 7861245/26214400 weights
  Layer 'model.language_model.layers.16.mlp.down_proj' (norm: 0.000002): pruned 7838314/26214400 weights
  Layer 'model.language_model.layers.17.self_attn.q_proj' (norm: 0.000008): pruned 1565697/5242880 weights
  Layer 'model.language_model.layers.17.self_attn.k_proj' (norm: 0.000013): pruned 783322/2621440 weights
  Layer 'model.language_model.layers.17.self_attn.v_proj' (norm: 0.000013): pruned 786170/2621440 weights
  Layer 'model.language_model.layers.17.mlp.gate_proj' (norm: 0.000004): pruned 7843671/26214400 weights
  Layer 'model.language_model.layers.17.mlp.up_proj' (norm: 0.000004): pruned 7845283/26214400 weights
  Layer 'model.language_model.layers.17.mlp.down_proj' (norm: 0.000002): pruned 7822550/26214400 weights
  Layer 'model.language_model.layers.18.self_attn.q_proj' (norm: 0.000007): pruned 1570967/5242880 weights
  Layer 'model.language_model.layers.18.self_attn.k_proj' (norm: 0.000016): pruned 780510/2621440 weights
  Layer 'model.language_model.layers.18.self_attn.v_proj' (norm: 0.000016): pruned 783913/2621440 weights
  Layer 'model.language_model.layers.18.mlp.gate_proj' (norm: 0.000004): pruned 7835222/26214400 weights
  Layer 'model.language_model.layers.18.mlp.up_proj' (norm: 0.000004): pruned 7860035/26214400 weights
  Layer 'model.language_model.layers.18.mlp.down_proj' (norm: 0.000001): pruned 7849580/26214400 weights
  Layer 'model.language_model.layers.19.self_attn.q_proj' (norm: 0.000008): pruned 1571370/5242880 weights
  Layer 'model.language_model.layers.19.self_attn.k_proj' (norm: 0.000014): pruned 785753/2621440 weights
  Layer 'model.language_model.layers.19.self_attn.v_proj' (norm: 0.000014): pruned 784999/2621440 weights
  Layer 'model.language_model.layers.19.mlp.gate_proj' (norm: 0.000004): pruned 7861778/26214400 weights
  Layer 'model.language_model.layers.19.mlp.up_proj' (norm: 0.000004): pruned 7840528/26214400 weights
  Layer 'model.language_model.layers.19.mlp.down_proj' (norm: 0.000001): pruned 7817092/26214400 weights
  Layer 'model.language_model.layers.20.self_attn.q_proj' (norm: 0.000005): pruned 1569267/5242880 weights
  Layer 'model.language_model.layers.20.self_attn.k_proj' (norm: 0.000021): pruned 784454/2621440 weights
  Layer 'model.language_model.layers.20.self_attn.v_proj' (norm: 0.000021): pruned 782157/2621440 weights
  Layer 'model.language_model.layers.20.mlp.gate_proj' (norm: 0.000004): pruned 7860178/26214400 weights
  Layer 'model.language_model.layers.20.mlp.up_proj' (norm: 0.000004): pruned 7845098/26214400 weights
  Layer 'model.language_model.layers.20.mlp.down_proj' (norm: 0.000001): pruned 7811593/26214400 weights
  Layer 'model.language_model.layers.21.self_attn.q_proj' (norm: 0.000006): pruned 1563936/5242880 weights
  Layer 'model.language_model.layers.21.self_attn.k_proj' (norm: 0.000018): pruned 783835/2621440 weights
  Layer 'model.language_model.layers.21.self_attn.v_proj' (norm: 0.000018): pruned 785412/2621440 weights
  Layer 'model.language_model.layers.21.mlp.gate_proj' (norm: 0.000004): pruned 7863094/26214400 weights
  Layer 'model.language_model.layers.21.mlp.up_proj' (norm: 0.000004): pruned 7851024/26214400 weights
  Layer 'model.language_model.layers.21.mlp.down_proj' (norm: 0.000001): pruned 7815706/26214400 weights
  Layer 'model.language_model.layers.22.self_attn.q_proj' (norm: 0.000006): pruned 1565858/5242880 weights
  Layer 'model.language_model.layers.22.self_attn.k_proj' (norm: 0.000017): pruned 783527/2621440 weights
  Layer 'model.language_model.layers.22.self_attn.v_proj' (norm: 0.000017): pruned 786348/2621440 weights
  Layer 'model.language_model.layers.22.mlp.gate_proj' (norm: 0.000004): pruned 7834125/26214400 weights
  Layer 'model.language_model.layers.22.mlp.up_proj' (norm: 0.000004): pruned 7860370/26214400 weights
  Layer 'model.language_model.layers.22.mlp.down_proj' (norm: 0.000002): pruned 7839690/26214400 weights
  Layer 'model.language_model.layers.23.self_attn.q_proj' (norm: 0.000005): pruned 1571452/5242880 weights
  Layer 'model.language_model.layers.23.self_attn.k_proj' (norm: 0.000021): pruned 786103/2621440 weights
  Layer 'model.language_model.layers.23.self_attn.v_proj' (norm: 0.000021): pruned 780970/2621440 weights
  Layer 'model.language_model.layers.23.mlp.gate_proj' (norm: 0.000004): pruned 7858701/26214400 weights
  Layer 'model.language_model.layers.23.mlp.up_proj' (norm: 0.000004): pruned 7858547/26214400 weights
  Layer 'model.language_model.layers.23.mlp.down_proj' (norm: 0.000002): pruned 7861083/26214400 weights
  Layer 'model.language_model.layers.24.self_attn.q_proj' (norm: 0.000007): pruned 1567627/5242880 weights
  Layer 'model.language_model.layers.24.self_attn.k_proj' (norm: 0.000016): pruned 779746/2621440 weights
  Layer 'model.language_model.layers.24.self_attn.v_proj' (norm: 0.000016): pruned 784256/2621440 weights
  Layer 'model.language_model.layers.24.mlp.gate_proj' (norm: 0.000004): pruned 7857603/26214400 weights
  Layer 'model.language_model.layers.24.mlp.up_proj' (norm: 0.000004): pruned 7842073/26214400 weights
  Layer 'model.language_model.layers.24.mlp.down_proj' (norm: 0.000001): pruned 7844167/26214400 weights
  Layer 'model.language_model.layers.25.self_attn.q_proj' (norm: 0.000005): pruned 1567276/5242880 weights
  Layer 'model.language_model.layers.25.self_attn.k_proj' (norm: 0.000021): pruned 786042/2621440 weights
  Layer 'model.language_model.layers.25.self_attn.v_proj' (norm: 0.000021): pruned 785525/2621440 weights
  Layer 'model.language_model.layers.25.mlp.gate_proj' (norm: 0.000004): pruned 7857844/26214400 weights
  Layer 'model.language_model.layers.25.mlp.up_proj' (norm: 0.000004): pruned 7857254/26214400 weights
  Layer 'model.language_model.layers.25.mlp.down_proj' (norm: 0.000001): pruned 7861730/26214400 weights
  Layer 'model.language_model.layers.26.self_attn.q_proj' (norm: 0.000009): pruned 1565223/5242880 weights
  Layer 'model.language_model.layers.26.self_attn.k_proj' (norm: 0.000012): pruned 783258/2621440 weights
  Layer 'model.language_model.layers.26.self_attn.v_proj' (norm: 0.000012): pruned 785882/2621440 weights
  Layer 'model.language_model.layers.26.mlp.gate_proj' (norm: 0.000004): pruned 7853648/26214400 weights
  Layer 'model.language_model.layers.26.mlp.up_proj' (norm: 0.000004): pruned 7848714/26214400 weights
  Layer 'model.language_model.layers.26.mlp.down_proj' (norm: 0.000001): pruned 7840625/26214400 weights
  Layer 'model.language_model.layers.27.self_attn.q_proj' (norm: 0.000008): pruned 1568609/5242880 weights
  Layer 'model.language_model.layers.27.self_attn.k_proj' (norm: 0.000012): pruned 785765/2621440 weights
  Layer 'model.language_model.layers.27.self_attn.v_proj' (norm: 0.000012): pruned 785853/2621440 weights
  Layer 'model.language_model.layers.27.mlp.gate_proj' (norm: 0.000004): pruned 7842429/26214400 weights
  Layer 'model.language_model.layers.27.mlp.up_proj' (norm: 0.000004): pruned 7860882/26214400 weights
  Layer 'model.language_model.layers.27.mlp.down_proj' (norm: 0.000002): pruned 7858802/26214400 weights
  Layer 'model.language_model.layers.28.self_attn.q_proj' (norm: 0.000008): pruned 1569993/5242880 weights
  Layer 'model.language_model.layers.28.self_attn.k_proj' (norm: 0.000013): pruned 785665/2621440 weights
  Layer 'model.language_model.layers.28.self_attn.v_proj' (norm: 0.000013): pruned 786211/2621440 weights
  Layer 'model.language_model.layers.28.mlp.gate_proj' (norm: 0.000004): pruned 7835703/26214400 weights
  Layer 'model.language_model.layers.28.mlp.up_proj' (norm: 0.000004): pruned 7854233/26214400 weights
  Layer 'model.language_model.layers.28.mlp.down_proj' (norm: 0.000001): pruned 7852107/26214400 weights
  Layer 'model.language_model.layers.29.self_attn.q_proj' (norm: 0.000007): pruned 1571016/5242880 weights
  Layer 'model.language_model.layers.29.self_attn.k_proj' (norm: 0.000014): pruned 785119/2621440 weights
  Layer 'model.language_model.layers.29.self_attn.v_proj' (norm: 0.000014): pruned 783906/2621440 weights
  Layer 'model.language_model.layers.29.mlp.gate_proj' (norm: 0.000004): pruned 7857426/26214400 weights
  Layer 'model.language_model.layers.29.mlp.up_proj' (norm: 0.000004): pruned 7836939/26214400 weights
  Layer 'model.language_model.layers.29.mlp.down_proj' (norm: 0.000002): pruned 7844927/26214400 weights
  Layer 'model.language_model.layers.30.self_attn.q_proj' (norm: 0.000005): pruned 1571528/5242880 weights
  Layer 'model.language_model.layers.30.self_attn.k_proj' (norm: 0.000022): pruned 784318/2621440 weights
  Layer 'model.language_model.layers.30.self_attn.v_proj' (norm: 0.000022): pruned 785626/2621440 weights
  Layer 'model.language_model.layers.30.mlp.gate_proj' (norm: 0.000004): pruned 7859493/26214400 weights
  Layer 'model.language_model.layers.30.mlp.up_proj' (norm: 0.000004): pruned 7846403/26214400 weights
  Layer 'model.language_model.layers.30.mlp.down_proj' (norm: 0.000001): pruned 7824233/26214400 weights
  Layer 'model.language_model.layers.31.self_attn.q_proj' (norm: 0.000006): pruned 1566888/5242880 weights
  Layer 'model.language_model.layers.31.self_attn.k_proj' (norm: 0.000019): pruned 785110/2621440 weights
  Layer 'model.language_model.layers.31.self_attn.v_proj' (norm: 0.000019): pruned 786341/2621440 weights
  Layer 'model.language_model.layers.31.mlp.gate_proj' (norm: 0.000004): pruned 7850113/26214400 weights
  Layer 'model.language_model.layers.31.mlp.up_proj' (norm: 0.000004): pruned 7858072/26214400 weights
  Layer 'model.language_model.layers.31.mlp.down_proj' (norm: 0.000002): pruned 7841156/26214400 weights
  Layer 'model.language_model.layers.32.self_attn.q_proj' (norm: 0.000008): pruned 1571296/5242880 weights
  Layer 'model.language_model.layers.32.self_attn.k_proj' (norm: 0.000013): pruned 784218/2621440 weights
  Layer 'model.language_model.layers.32.self_attn.v_proj' (norm: 0.000013): pruned 785463/2621440 weights
  Layer 'model.language_model.layers.32.mlp.gate_proj' (norm: 0.000004): pruned 7849098/26214400 weights
  Layer 'model.language_model.layers.32.mlp.up_proj' (norm: 0.000004): pruned 7863482/26214400 weights
  Layer 'model.language_model.layers.32.mlp.down_proj' (norm: 0.000001): pruned 7845867/26214400 weights
  Layer 'model.language_model.layers.33.self_attn.q_proj' (norm: 0.000008): pruned 1566560/5242880 weights
  Layer 'model.language_model.layers.33.self_attn.k_proj' (norm: 0.000013): pruned 784563/2621440 weights
  Layer 'model.language_model.layers.33.self_attn.v_proj' (norm: 0.000013): pruned 785335/2621440 weights
  Layer 'model.language_model.layers.33.mlp.gate_proj' (norm: 0.000004): pruned 7843060/26214400 weights
  Layer 'model.language_model.layers.33.mlp.up_proj' (norm: 0.000004): pruned 7857227/26214400 weights
  Layer 'model.language_model.layers.33.mlp.down_proj' (norm: 0.000002): pruned 7862674/26214400 weights
  Total pruned: 939228840/3137880064 weights (29.93%)
Overall model sparsity: 21.84%
Successfully applied layer_norm sparsity: 21.84%
Quantization applied successfully
Image loaded successfully from https://images.pexels.com/photos/29380151/pexels-photo-29380151.jpeg
VRAM used: 5.70 GB | Time: 23.14s
Saved output persistently to outputs/output_42_0.txt
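
Two sparsity figures appear above because pruning targets only the attention and MLP projections: 939,228,840 of 3,137,880,064 targeted weights were zeroed (29.93%, close to the requested ratio of 0.3), which works out to 21.84% of the full model's parameters (roughly 4.3B, as implied by those two numbers). Below is a rough reconstruction of the per-layer magnitude pruning the logs suggest, as a sketch only; the repo's actual layer_norm criterion may differ:

import torch
import torch.nn as nn

def apply_sparsity(model: nn.Module, ratio: float = 0.3) -> None:
    # Zero out the lowest-magnitude `ratio` fraction of weights in each
    # attention/MLP projection, mirroring the per-layer log lines above.
    total_pruned, total_weights = 0, 0
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear) and ("self_attn" in name or "mlp" in name):
            w = module.weight.data
            k = max(1, int(w.numel() * ratio))
            # The k-th smallest |w| serves as the per-layer threshold.
            threshold = w.abs().flatten().kthvalue(k).values
            mask = w.abs() > threshold
            w.mul_(mask)
            pruned = w.numel() - int(mask.sum())
            total_pruned += pruned
            total_weights += w.numel()
            print(f"  Layer '{name}': pruned {pruned}/{w.numel()} weights")
    print(f"  Total pruned: {total_pruned}/{total_weights} weights "
          f"({100 * total_pruned / total_weights:.2f}%)")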
Version Details
Version ID
44626bdc478fcfe56ee3d8a5a846b72f1e25abac25f740b2b615c1fcb2b63cb2
Version Created
September 29, 2025
Run on Replicate →