adirik/mamba-2.8b-chat 🔢📝 → 📝

▶️ 814 runs 📅 Feb 2024 ⚙️ Cog 0.8.3 🔗 GitHub 📄 Paper ⚖️ License
code-generation question-answering text-generation

About

Mamba 2.8B, a state space language model fine-tuned for chat

Example Output

Output

I do not have access to specific information about large language models. However, here are some general tips on deployment best practices:

  1. Use a cloud-based deployment platform: cloud-based deployment platforms like azure, aws, and google cloud provide a scalable and reliable environment for deploying large language models.

  2. Use a managed service: a managed service provides a pre-built and pre-configured environment for deploying large language models. This can save time and resources for deploying and maintaining the model.

  3. Use a reliable and secure network: ensure that the network is secure and reliable to ensure that the model is not compromised.

  4. Use a monitoring solution: monitor the model to ensure that it is functioning properly and that there are no issues with the deployment.

  5. Use a data protection solution: ensure that the model is protected from unauthorized access and data breaches.

  6. Use a data science platform: use a data science platform to manage the deployment and management of the model. This can help with scaling and automating the deployment process.

  7. Use a data science tool: use a data science tool to automate the deployment and management of the model. This can help with reducing the time and effort required for deployment.

overall, deploying a large language model can be a complex process, but with the right tools and best practices, it can be manageable.

Performance Metrics

17.38s Prediction Time
154.27s Total Time
All Input Parameters
{
  "top_k": 1,
  "top_p": 0.7,
  "message": "Do you know anything about large language models? Could you give me some tips on deployment best practices?",
  "temperature": 0.9,
  "message_history": "[]",
  "repetition_penalty": 1
}
Input Parameters
seed Type: integer
The seed for the random number generator
top_k Type: integer · Default: 1
When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens.
top_p Type: number · Default: 0.7 · Range: 0.01 - 1
When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens.
message (required) Type: string
The message to generate a response for
temperature Type: number · Default: 0.9 · Range: 0.1 - 5
Adjusts the randomness of outputs: values greater than 1 are more random, values near 0 are close to deterministic, and 0.75 is a good starting value.
message_history Type: string · Default: []
The message history to condition the generated response on
repetition_penalty Type: number · Default: 1 · Range: 0.01 - 10
Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, less than 1 encourage it.
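The sampling parameters above interact at each decoding step: temperature rescales the logits, then top-k and top-p (nucleus) filtering restrict which tokens can be sampled. A minimal sketch of that interaction in plain Python (illustrative only, not the model's actual implementation; `repetition_penalty` is omitted for brevity):

```python
import math

def filter_distribution(logits, top_k=1, top_p=0.7, temperature=0.9):
    """Return the renormalised token distribution after applying
    temperature, top-k, and top-p filtering (a simplified sketch)."""
    # Temperature scales logits before softmax: lower -> sharper.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # top_k: keep only the k most likely tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:top_k])
    # top_p: smallest set of tokens whose cumulative probability
    # reaches top_p (nucleus sampling).
    cum, nucleus = 0.0, set()
    for i in order:
        nucleus.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    keep &= nucleus
    # Renormalise over the surviving tokens.
    z = sum(probs[i] for i in keep)
    return {i: probs[i] / z for i in keep}

# With this model's defaults (top_k=1), only the single most likely
# token survives, so decoding is effectively greedy.
dist = filter_distribution([2.0, 1.0, 0.1])
```

This also explains why the example prediction reads deterministically despite `temperature: 0.9`: a top_k of 1 collapses the distribution to the argmax token at every step.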
Output Schema

Output

Type: string

Version Details
Version ID
54995daa413e1d85f27126266b8414fbc71fc879368fff2dc7cbfea60b87de31
Version Created
February 16, 2024