paragekbote/smollm3-3b-smashed
About
SmolLM3-3B compressed with Pruna for fast, memory-efficient inference.

Example Output
Prompt:
"What are the adavantages of Hugging Face for model hosting?"
Output
Performance Metrics
- Prediction Time: 58.00s
- Total Time: 143.99s
All Input Parameters
{
  "mode": "no_think",
  "seed": 18,
  "prompt": "What are the adavantages of Hugging Face for model hosting?",
  "max_new_tokens": 1420
}
Input Parameters
- mode — Reasoning mode: 'think' or 'no_think'
- seed — Seed for reproducibility
- prompt (required) — Prompt for text generation
- max_new_tokens — Maximum number of new tokens to generate
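The parameters above map directly onto the request payload. As a minimal sketch (assuming the official Replicate Python client; the model identifier and exact output shape are taken from this page, not verified):

```python
# Build the same input payload shown in "All Input Parameters".
input_payload = {
    "mode": "no_think",      # reasoning mode: 'think' or 'no_think'
    "seed": 18,              # fixed seed for reproducibility
    "prompt": "What are the advantages of Hugging Face for model hosting?",
    "max_new_tokens": 1420,  # cap on generated tokens
}

# Uncomment to call the hosted model (requires a REPLICATE_API_TOKEN
# in the environment and the `replicate` package installed):
# import replicate
# output = replicate.run("paragekbote/smollm3-3b-smashed", input=input_payload)
# print("".join(output))
```

Pinning `seed` makes repeated runs with the same payload reproducible, which is useful when comparing `think` against `no_think` outputs.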
Output Schema
Output
Example Execution Logs
INFO:coglet:prediction started: id=kn7yvy19gxrmc0csdmn97mdvdm
WARNING - Unhandled kwargs in generate method: {'pad_token_id': 128012, 'eos_token_id': 128012}
INFO - Cache size changed from 1x400 to 1x2000. Re-initializing StaticCache.
/root/.pyenv/versions/3.10.18/lib/python3.10/site-packages/torch/_inductor/compile_fx.py:236: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
  warnings.warn(
/root/.pyenv/versions/3.10.18/lib/python3.10/site-packages/torch/_inductor/lowering.py:7007: UserWarning: Online softmax is disabled on the fly since Inductor decides to split the reduction. Cut an issue to PyTorch if this is an important use case and you want to speed it up with online softmax.
  warnings.warn(

What are the adavantages of Hugging Face for model hosting? However, I believe there might be a typo in your question. It seems you meant to ask about the advantages of Hugging Face for **model training or model evaluation**, rather than model hosting. Assuming you meant model training or model evaluation, here are the advantages of Hugging Face:

1. **Pre-Trained Models**: Hugging Face offers a wide range of pre-trained models for various NLP tasks, such as language modeling, sentiment analysis, named entity recognition, and more. These models can be fine-tuned for specific tasks, reducing the need for extensive data collection and training.
2. **Efficiency**: Using pre-trained models can significantly reduce the training time and computational resources required for model training. Fine-tuning a pre-trained model can be much faster than training a model from scratch, especially for complex tasks.
3. **State-of-the-Art Performance**: Pre-trained models are often state-of-the-art, meaning they have been trained on large datasets and have achieved impressive performance on various benchmarks. Fine-tuning these models can leverage their advanced capabilities.
4. **Easy Integration**: Hugging Face provides a simple and seamless integration with other tools and frameworks, making it easy to incorporate pre-trained models into your projects.
5. **Flexibility**: You can fine-tune pre-trained models on any dataset, which allows for greater flexibility in adapting the models to specific tasks and domains.
6. **Community Support**: The Hugging Face team actively develops and maintains a large collection of pre-trained models, and the community is actively engaged in contributing and improving the models. This ensures that you are using the most up-to-date and effective models.
7. **Documentation and Resources**: Hugging Face provides extensive documentation and resources, including tutorials, examples, and community forums, which can help you get started with using pre-trained models.

If you meant model hosting, the advantages might be related to services like hosting pre-trained models on the cloud or using model hosting platforms that support model deployment. However, this is less common and typically involves different considerations, such as scalability, security, and cost. Please clarify your question to ensure I provide the most accurate and relevant information.

INFO:coglet:prediction completed: id=kn7yvy19gxrmc0csdmn97mdvdm
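The first UserWarning in the logs notes that TensorFloat-32 matmuls are available but not enabled. Following the suggestion PyTorch itself prints, a one-line fix (a sketch; whether it helps depends on the GPU being Ampere or newer) would be:

```python
import torch

# Enable TF32 tensor cores for float32 matrix multiplication, as the
# Inductor warning suggests. This trades a small amount of float32
# precision for noticeably faster matmuls on supporting GPUs.
torch.set_float32_matmul_precision("high")
```

Setting this before model compilation silences the warning; on hardware without TF32 support the call is harmless.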
Version Details
- Version ID: 232b6f87dac025cb54803cfbc52135ab8366c21bbe8737e11cd1aee4bf3a2423
- Version Created: August 20, 2025