paragekbote/smollm3-3b-smashed
About
SmolLM3-3B compressed with Pruna for fast, memory-efficient inference.

Example Output
Prompt:
"What are the adavantages of Hugging Face for model hosting?"
Output
Performance Metrics
- Prediction Time: 58.00s
- Total Time: 143.99s
All Input Parameters
{
  "mode": "no_think",
  "seed": 18,
  "prompt": "What are the adavantages of Hugging Face for model hosting?",
  "max_new_tokens": 1420
}
Input Parameters
- mode — Reasoning mode: 'think' or 'no_think'
- seed — Seed for reproducibility
- prompt (required) — Prompt for text generation
- max_new_tokens — Maximum number of new tokens to generate
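The parameters above map directly onto the request payload. As a minimal sketch (assuming the official Replicate Python client; the model identifier and exact output shape are taken from this page, not verified):

```python
# Build the same input payload shown in "All Input Parameters".
input_payload = {
    "mode": "no_think",      # reasoning mode: 'think' or 'no_think'
    "seed": 18,              # fixed seed for reproducibility
    "prompt": "What are the advantages of Hugging Face for model hosting?",
    "max_new_tokens": 1420,  # cap on generated tokens
}

# Uncomment to call the hosted model (requires a REPLICATE_API_TOKEN
# in the environment and the `replicate` package installed):
# import replicate
# output = replicate.run("paragekbote/smollm3-3b-smashed", input=input_payload)
# print("".join(output))
```

Pinning `seed` makes repeated runs with the same payload reproducible, which is useful when comparing `think` against `no_think` outputs.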
Output Schema
Output
Example Execution Logs
INFO:coglet:prediction started: id=kn7yvy19gxrmc0csdmn97mdvdm
WARNING - Unhandled kwargs in generate method: {'pad_token_id': 128012, 'eos_token_id': 128012}
INFO - Cache size changed from 1x400 to 1x2000. Re-initializing StaticCache.
/root/.pyenv/versions/3.10.18/lib/python3.10/site-packages/torch/_inductor/compile_fx.py:236: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
  warnings.warn(
/root/.pyenv/versions/3.10.18/lib/python3.10/site-packages/torch/_inductor/lowering.py:7007: UserWarning: Online softmax is disabled on the fly since Inductor decides to split the reduction. Cut an issue to PyTorch if this is an important use case and you want to speed it up with online softmax.
  warnings.warn(

What are the adavantages of Hugging Face for model hosting? However, I believe there might be a typo in your question. It seems you meant to ask about the advantages of Hugging Face for **model training or model evaluation**, rather than model hosting. Assuming you meant model training or model evaluation, here are the advantages of Hugging Face:

1. **Pre-Trained Models**: Hugging Face offers a wide range of pre-trained models for various NLP tasks, such as language modeling, sentiment analysis, named entity recognition, and more. These models can be fine-tuned for specific tasks, reducing the need for extensive data collection and training.
2. **Efficiency**: Using pre-trained models can significantly reduce the training time and computational resources required for model training. Fine-tuning a pre-trained model can be much faster than training a model from scratch, especially for complex tasks.
3. **State-of-the-Art Performance**: Pre-trained models are often state-of-the-art, meaning they have been trained on large datasets and have achieved impressive performance on various benchmarks. Fine-tuning these models can leverage their advanced capabilities.
4. **Easy Integration**: Hugging Face provides a simple and seamless integration with other tools and frameworks, making it easy to incorporate pre-trained models into your projects.
5. **Flexibility**: You can fine-tune pre-trained models on any dataset, which allows for greater flexibility in adapting the models to specific tasks and domains.
6. **Community Support**: The Hugging Face team actively develops and maintains a large collection of pre-trained models, and the community is actively engaged in contributing and improving the models. This ensures that you are using the most up-to-date and effective models.
7. **Documentation and Resources**: Hugging Face provides extensive documentation and resources, including tutorials, examples, and community forums, which can help you get started with using pre-trained models.

If you meant model hosting, the advantages might be related to services like hosting pre-trained models on the cloud or using model hosting platforms that support model deployment. However, this is less common and typically involves different considerations, such as scalability, security, and cost. Please clarify your question to ensure I provide the most accurate and relevant information.

INFO:coglet:prediction completed: id=kn7yvy19gxrmc0csdmn97mdvdm
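The first UserWarning in the logs notes that TensorFloat-32 matmuls are available but not enabled. Following the suggestion PyTorch itself prints, a one-line fix (a sketch; whether it helps depends on the GPU being Ampere or newer) would be:

```python
import torch

# Enable TF32 tensor cores for float32 matrix multiplication, as the
# Inductor warning suggests. This trades a small amount of float32
# precision for noticeably faster matmuls on supporting GPUs.
torch.set_float32_matmul_precision("high")
```

Setting this before model compilation silences the warning; on hardware without TF32 support the call is harmless.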
Version Details
- Version ID: 232b6f87dac025cb54803cfbc52135ab8366c21bbe8737e11cd1aee4bf3a2423
- Version Created: August 20, 2025