deepseek-ai/deepseek-coder-v2-lite-instruct 🔢📝 → 📝

▶️ 588 runs 📅 Jul 2024 ⚙️ Cog 0.9.12 🔗 GitHub 📄 Paper ⚖️ License
code-generation math-reasoning text-generation

About

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Example Output

Prompt:

"write a quick sort algorithm in python."

Output

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

print(quick_sort([3,6,8,10,1,2,1]))

Performance Metrics

1.64s Prediction Time
1.67s Total Time
All Input Parameters
{
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "write a quick sort algorithm in python.",
  "max_tokens": 512,
  "min_tokens": 0,
  "temperature": 0.6,
  "system_prompt": "You are an expert software engineer proficient in multiple programming languages.",
  "presence_penalty": 0,
  "frequency_penalty": 0
}
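For reference, a request with these same parameters can be issued through the Replicate Python client roughly as sketched below. The model reference string is the one shown at the top of this page; the rest is a minimal sketch rather than the exact predictor code, and it assumes REPLICATE_API_TOKEN is set in the environment. Because the output schema is an array of strings (see Output Schema), the chunks are joined at the end.

import replicate

# Minimal sketch: reproduce the example request above with the
# Replicate Python client.
output = replicate.run(
    "deepseek-ai/deepseek-coder-v2-lite-instruct",
    input={
        "prompt": "write a quick sort algorithm in python.",
        "system_prompt": "You are an expert software engineer proficient "
                         "in multiple programming languages.",
        "max_tokens": 512,
        "min_tokens": 0,
        "temperature": 0.6,
        "top_k": 50,
        "top_p": 0.9,
        "presence_penalty": 0,
        "frequency_penalty": 0,
    },
)

# The output is an array of strings, so join the chunks into one completion.
print("".join(output))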
Input Parameters
top_k Type: integer Default: 50
The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).
top_p Type: number Default: 0.9
A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751). A sketch of how top_k and top_p interact is shown after this parameter list.
prompt Type: string
Prompt
max_tokens Type: integer Default: 512
The maximum number of tokens the model should generate as output.
min_tokens Type: integer Default: 0
The minimum number of tokens the model should generate as output.
temperature Type: number Default: 0.6
The value used to modulate the next token probabilities.
system_prompt Type: string Default: You are an expert software engineer proficient in multiple programming languages.
System prompt to send to the model. This is prepended to the prompt and helps guide system behavior. Ignored for non-chat models.
stop_sequences Type: string
A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
presence_penalty Type: number Default: 0
Presence penalty. Penalizes tokens that have already appeared in the output, regardless of how often, encouraging the model to introduce new tokens.
frequency_penalty Type: number Default: 0
Frequency penalty. Penalizes tokens in proportion to how often they have already appeared in the output, discouraging repetition.
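To make the sampling parameters above concrete, here is a minimal, illustrative sketch of how temperature, top_k, and top_p act on a logit vector before the next token is sampled. It mirrors the parameter descriptions above; it is not the model's actual sampling code (vLLM applies these filters internally).

import numpy as np

def filter_logits(logits, temperature=0.6, top_k=50, top_p=0.9):
    # Illustrative only: mirrors the parameter descriptions above,
    # not the deployed sampling implementation.
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # top_k: keep only the k highest-scoring tokens.
    if top_k > 0:
        k = min(top_k, logits.size)
        kth_value = np.sort(logits)[-k]
        logits = np.where(logits < kth_value, -np.inf, logits)

    # top_p (nucleus): keep the smallest set of tokens whose
    # cumulative probability reaches top_p.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]
    filtered = np.full_like(logits, -np.inf)
    filtered[keep] = logits[keep]
    return filtered  # sample the next token from softmax(filtered)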
Output Schema

Output

Type: array Items Type: string

Example Execution Logs
INFO 07-08 19:19:23 async_llm_engine.py:584] Received request 3bea0322126645299a822ea89c4e4778: prompt: 'write a quick sort algorithm in python.', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.9, top_k=50, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[100001], include_stop_str_in_output=False, ignore_eos=False, max_tokens=512, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: None, lora_request: None.
INFO 07-08 19:19:23 metrics.py:341] Avg prompt throughput: 1.5 tokens/s, Avg generation throughput: 21.6 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.1%, CPU KV cache usage: 0.0%.
Generation took 1720466049.90s
INFO 07-08 19:19:24 async_llm_engine.py:134] Finished request 3bea0322126645299a822ea89c4e4778.
Formatted prompt: {system_prompt}

User: {prompt}

Assistant:
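The log above shows the chat template the predictor applies before generation: the system prompt, a User turn, and an Assistant turn. A minimal sketch of that formatting step, assuming a plain string substitution of the two fields:

def format_prompt(system_prompt: str, prompt: str) -> str:
    # Matches the template printed in the logs above.
    return f"{system_prompt}\n\nUser: {prompt}\n\nAssistant:"

formatted = format_prompt(
    "You are an expert software engineer proficient in multiple programming languages.",
    "write a quick sort algorithm in python.",
)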
Version Details
Version ID
a182159595ee209e25e673a5ab6f0293111dd6733df973f77efcd1f851123330
Version Created
July 8, 2024
Run on Replicate →