paragekbote/gemma3-torchao-quant-sparse
About
An optimized gemma-3-4b setup with INT8 weight-only quantization, torch.compile, and weight sparsity for efficient inference.
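A minimal sketch of the recipe described above, assuming a Hugging Face Gemma 3 checkpoint; the model id, dtype, and compile mode here are illustrative, not necessarily this repo's exact code:

```python
import torch
from transformers import Gemma3ForConditionalGeneration
from torchao.quantization import quantize_, int8_weight_only

# Load the base multimodal checkpoint (illustrative model id).
model = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-4b-it", torch_dtype=torch.bfloat16, device_map="cuda"
)

# INT8 weight-only quantization via torchao: weights are stored in int8 and
# dequantized on the fly, cutting weight memory roughly in half relative to
# bf16 without touching activations.
quantize_(model, int8_weight_only())

# torch.compile removes Python overhead and fuses kernels on the hot path.
model = torch.compile(model, mode="max-autotune")
```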

Example Output
"Describe the image in the photo. What type of breads are in the image and from what region are they?"
Output
Okay, let's break down the image and the likely types of bread and their potential origins.
Image Description:
The photo showcases a beautiful, almost gallery-like, display of a wide variety of freshly baked breads and pastries. It's clearly in a bakery or a high-end food display area, likely in a hotel, restaurant, or a specialty shop. The focus is on the presentation and the quality of the goods.
Here's a breakdown of the types of breads and the possibilities of their origins, based on the image:
Types of Breads (Based on the Visuals):
- Sourdough (Dark, Round Loaves): There are at least two dark, rustic-looking sourdough loaves. These are almost certainly from a European (particularly French or Scandinavian) tradition, known for their tangy flavor and dense texture.
- Baguettes (Long, Thin Loaves): Several long, slender baguettes are present. These are a classic French bread.
- Croissants (Triangle-Shaped Pastries): A large quantity of golden-brown croissants. These are, of course, a classic French pastry.
- Pain au Chocolat (Dark, Layered Pastries): These look like a cross between a croissant and a chocolate-filled pastry, and are a common French/European treat.
- Brioche (Round, Golden Loaves): These appear to be round, golden-brown loaves, likely made with a rich, buttery dough. Brioche is traditionally from France.
- Pão de Queijo (Cheese Rolls): A few of these are visible, and they're a specialty of Brazil.
- Rolls (Various Shapes and Sizes): There are a number of rolls, some with a glaze or a sweet topping.
- Pastries with Fruit (Red and Pink): A display of what appear to be small, fruit-filled pastries.
Possible Regional Origins (Based on the Bread Variety):
- French Influence (Strong): The high presence of baguettes, croissants, and pain au chocolat strongly suggests a French influence. This is very common in many high-quality bakeries.
- Scandinavian/European Sourdough: The dark, rustic sourdoughs likely point to a Scandinavian or European tradition, where these types of bread are
Performance Metrics
All Input Parameters
{ "seed": 42, "top_p": 0.9, "prompt": "Describe the image in the photo. What type of breads are in the image and from what region are they?", "image_url": "https://images.pexels.com/photos/29380151/pexels-photo-29380151.jpeg", "temperature": 0.7, "use_sparsity": "true", "sparsity_type": "layer_norm", "max_new_tokens": 500, "sparsity_ratio": 0.3, "use_quantization": "true" }
Input Parameters
- seed
- Seed for reproducibility
- top_p
- Top-p nucleus sampling
- prompt (required)
- Input text prompt
- image_url
- Optional image URL
- temperature
- Sampling temperature
- use_sparsity
- Enable sparsity optimization
- sparsity_type
- Type of sparsity: magnitude, gradual, layer_norm
- max_new_tokens
- Maximum number of new tokens
- sparsity_ratio
- Fraction of weights to prune (e.g., 0.3)
- use_quantization
- Enable INT8 quantization using torchao
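For orientation, here is how these parameters typically map onto Hugging Face generation for a Gemma 3 image-plus-text prompt. This continues from the quantized `model` in the sketch above; the processor id and chat-template call follow the standard transformers multimodal API and are assumptions, not this repo's internals:

```python
import torch
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("google/gemma-3-4b-it")

# prompt + image_url become one multimodal chat message.
messages = [{"role": "user", "content": [
    {"type": "image", "url": "https://images.pexels.com/photos/29380151/pexels-photo-29380151.jpeg"},
    {"type": "text", "text": "Describe the image in the photo. What type of breads are in the image and from what region are they?"},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

torch.manual_seed(42)                  # seed: reproducible sampling
output_ids = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,                   # sampling temperature
    top_p=0.9,                         # top-p nucleus sampling
    max_new_tokens=500,                # cap on newly generated tokens
)
new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))
```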
Output Schema
Example Execution Logs
Applying layer_norm sparsity with ratio 0.3
Analyzing weight norms (fast method)...
Analyzed 285 layers
Layer 'model.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj' (norm: 0.000027): pruned 396874/1327104 weights
Layer 'model.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj' (norm: 0.000012): pruned 396627/1327104 weights
Layer 'model.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj' (norm: 0.000022): pruned 397385/1327104 weights
[... analogous k/v/q_proj entries for vision_tower encoder layers 1-26 ...]
Layer 'model.language_model.layers.0.self_attn.q_proj' (norm: 0.000008): pruned 1566745/5242880 weights
Layer 'model.language_model.layers.0.self_attn.k_proj' (norm: 0.000014): pruned 784421/2621440 weights
Layer 'model.language_model.layers.0.self_attn.v_proj' (norm: 0.000014): pruned 782653/2621440 weights
Layer 'model.language_model.layers.0.mlp.gate_proj' (norm: 0.000004): pruned 7858057/26214400 weights
Layer 'model.language_model.layers.0.mlp.up_proj' (norm: 0.000004): pruned 7863841/26214400 weights
Layer 'model.language_model.layers.0.mlp.down_proj' (norm: 0.000001): pruned 7848683/26214400 weights
[... analogous attention and MLP entries for language_model layers 1-33 ...]
Total pruned: 939228840/3137880064 weights (29.93%)
Overall model sparsity: 21.84%
Successfully applied layer_norm sparsity: 21.84%
Quantization applied successfully
Image loaded successfully from https://images.pexels.com/photos/29380151/pexels-photo-29380151.jpeg
VRAM used: 5.70 GB | Time: 23.14s
Saved output persistently to outputs/output_42_0.txt
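The pruning pass in these logs can be approximated as below. This is a sketch only: the function name, the mean-|w| "fast" norm, and the per-layer magnitude threshold are assumptions inferred from the log format, not this repo's implementation. Note the gap between 29.93% pruned and 21.84% overall sparsity: only the attention and MLP projection matrices are pruned, while embeddings and the remaining parameters stay dense.

```python
import torch
import torch.nn as nn

def apply_layer_norm_sparsity(model: nn.Module, ratio: float = 0.3) -> float:
    """Zero the lowest-magnitude `ratio` of weights in every Linear layer."""
    total_pruned, total_weights = 0, 0
    for name, module in model.named_modules():
        if not isinstance(module, nn.Linear):
            continue
        w = module.weight.data
        norm = w.abs().mean().item()   # "fast method": mean |w| per layer
        k = int(w.numel() * ratio)     # how many weights to prune
        if k == 0:
            continue
        threshold = w.abs().flatten().kthvalue(k).values
        mask = w.abs() > threshold     # keep only weights above the threshold
        w.mul_(mask)                   # zero the pruned weights in place
        pruned = int((~mask).sum())
        total_pruned += pruned
        total_weights += w.numel()
        print(f"Layer '{name}' (norm: {norm:.6f}): pruned {pruned}/{w.numel()} weights")
    sparsity = 100.0 * total_pruned / total_weights
    print(f"Total pruned: {total_pruned}/{total_weights} weights ({sparsity:.2f}%)")
    return sparsity
```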
Version Details
- Version ID
44626bdc478fcfe56ee3d8a5a846b72f1e25abac25f740b2b615c1fcb2b63cb2
- Version Created
September 29, 2025