kcaverly/deepseek-coder-33b-instruct-gguf 📝🔢 → 📝

▶️ 3.3K runs 📅 Dec 2023 ⚙️ Cog 0.8.6 🔗 GitHub 📄 Paper ⚖️ License
code-completion code-generation text-generation

About

A quantized 33B-parameter language model from DeepSeek for state-of-the-art, repository-level code completion.

Example Output

Prompt:

"please create a rust enum called prediction status, with three variants starting, in progress and completed. Please only include valid rust code, do not include any commentary or explanations."

Output

enum PredictionStatus {
    Starting,
    InProgress,
    Completed,
}

Performance Metrics

1.51s Prediction Time
1.59s Total Time
All Input Parameters
{
  "prompt": "please create a rust enum called prediction status, with three variants starting, in progress and completed. Please only include valid rust code, do not include any commentary or explanations.",
  "temperature": 0.8,
  "system_prompt": "You are an AI programming assistant, utilizing the Deepseek Code model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.",
  "max_new_tokens": -1,
  "repeat_penalty": 1.1,
  "prompt_template": "{system_prompt}/n### Instruction: {prompt}/n### Response: "
}
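The prompt_template value above contains two placeholders, {system_prompt} and {prompt}. As a minimal sketch, assuming the template is filled with ordinary Python string formatting (the model's actual implementation is not shown on this page), the final text sent to the model could be assembled as follows. The name render_prompt is hypothetical.

```python
# Hypothetical helper: fills the two placeholders in a prompt template.
# Assumption: the template is expanded with str.format-style substitution.

# Copied verbatim from the page. Note that "/n" here is a literal slash
# followed by "n", not a newline escape.
DEFAULT_TEMPLATE = "{system_prompt}/n### Instruction: {prompt}/n### Response: "


def render_prompt(prompt: str, system_prompt: str, template: str = DEFAULT_TEMPLATE) -> str:
    """Return the template with {system_prompt} and {prompt} substituted."""
    return template.format(system_prompt=system_prompt, prompt=prompt)


if __name__ == "__main__":
    text = render_prompt(
        prompt="Write a Rust enum with two variants.",
        system_prompt="You are an AI programming assistant.",
    )
    print(text)
```

For multi-turn use, a caller would pass a different template string that still contains both placeholders.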
Input Parameters
prompt (required) Type: string
Instruction for model
temperature Type: number Default: 0.8
Controls the randomness of token sampling. A higher value flattens the probability distribution over next tokens and leads to more varied, creative responses; a lower value concentrates probability on the most likely tokens and leads to safer, more conservative and predictable responses. Adjust it to balance novelty against accuracy and coherence for the task at hand.
system_prompt Type: string Default: You are an AI programming assistant, utilizing the Deepseek Code model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer.
System prompt for the model; helps guide model behaviour.
max_new_tokens Type: integer Default: -1
Maximum new tokens to generate.
repeat_penalty Type: number Default: 1.1
Discourages the model from repeating itself by lowering the likelihood of tokens that have already appeared in the text. A value of 1.0 applies no penalty, and values above 1.0 discourage repetition more strongly. A higher penalty tends to produce more varied output, whereas a lower penalty suits scenarios where consistency and predictability are preferred.
prompt_template Type: string Default: {system_prompt}/n### Instruction: {prompt}/n### Response:
Template to pass to model. Override if you are providing multi-turn instructions.
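To make the temperature and repeat_penalty entries above concrete, here is a minimal sketch of how such parameters are commonly applied to a model's raw scores (logits) before a token is sampled. It follows the convention used by llama.cpp-style samplers, where the penalty divides positive logits and multiplies negative ones. That this deployment applies exactly this rule is an assumption, and both function names are hypothetical.

```python
import math


def apply_repeat_penalty(logits, seen_token_ids, penalty=1.1):
    """Lower the score of every token that already appeared in the context."""
    out = list(logits)
    for tok in set(seen_token_ids):
        # Dividing a positive logit and multiplying a negative one both
        # make the token less likely when penalty > 1.0.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def softmax_with_temperature(logits, temperature=0.8):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, -1.0]  # toy scores for a three-token vocabulary
    penalized = apply_repeat_penalty(logits, seen_token_ids=[0, 2], penalty=1.1)
    for t in (0.2, 0.8, 1.5):
        probs = softmax_with_temperature(penalized, temperature=t)
        print(t, [round(p, 3) for p in probs])
```

Running the script shows the top token's probability shrinking as the temperature rises, which is the "more varied" behaviour the parameter description refers to.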
Output Schema

Output

Type: array Items Type: string
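Because the output is an array of strings, a client typically concatenates the items to recover the full response. The sketch below assumes the official replicate Python client is installed and that a REPLICATE_API_TOKEN environment variable is set. The helper names build_input, join_output and run_model are hypothetical, and the model reference is assembled from the model name and version ID given on this page.

```python
MODEL_REF = (
    "kcaverly/deepseek-coder-33b-instruct-gguf:"
    "ea964345066a8868e43aca432f314822660b72e29cab6b4b904b779014fe58fd"
)


def build_input(prompt, temperature=0.8, max_new_tokens=-1, repeat_penalty=1.1):
    """Assemble the input payload; omitted fields fall back to the model's defaults."""
    if not prompt:
        raise ValueError("prompt is required")
    return {
        "prompt": prompt,
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
        "repeat_penalty": repeat_penalty,
    }


def join_output(chunks):
    """Concatenate the array of string chunks into one response."""
    return "".join(chunks)


def run_model(prompt):
    """Call the hosted model. Requires network access and an API token."""
    import replicate  # imported lazily so the helpers above work without it

    chunks = replicate.run(MODEL_REF, input=build_input(prompt))
    return join_output(chunks)
```

Keeping payload construction and output joining separate from the network call makes both easy to check locally without an API token.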

Example Execution Logs
Llama.generate: prefix-match hit
llama_print_timings:        load time =     279.99 ms
llama_print_timings:      sample time =       4.17 ms /    29 runs   (    0.14 ms per token,  6954.44 tokens per second)
llama_print_timings: prompt eval time =     219.64 ms /    15 tokens (   14.64 ms per token,    68.29 tokens per second)
llama_print_timings:        eval time =    1226.00 ms /    28 runs   (   43.79 ms per token,    22.84 tokens per second)
llama_print_timings:       total time =    1501.62 ms
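The per-token and tokens-per-second figures in these logs follow directly from the elapsed time and the run count on each line. As a quick check of the eval line, using only numbers taken from the log above (the function name timing_stats is hypothetical):

```python
def timing_stats(elapsed_ms, count):
    """Return (ms per token, tokens per second) for one llama_print_timings line."""
    ms_per_token = elapsed_ms / count
    tokens_per_second = count / (elapsed_ms / 1000.0)
    return ms_per_token, tokens_per_second


if __name__ == "__main__":
    # eval time = 1226.00 ms / 28 runs, taken from the log above
    ms_per_token, tps = timing_stats(1226.00, 28)
    # prints "43.79 ms per token, 22.84 tokens per second"
    print(f"{ms_per_token:.2f} ms per token, {tps:.2f} tokens per second")
```

The same calculation applied to the prompt eval line (219.64 ms over 15 tokens) reproduces its reported 14.64 ms per token and 68.29 tokens per second.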
Version Details
Version ID
ea964345066a8868e43aca432f314822660b72e29cab6b4b904b779014fe58fd
Version Created
December 11, 2023