deepseek-ai/deepseek-coder-v2-lite-instruct
About
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Example Output
Prompt:
"write a quick sort algorithm in python."
Output
def quick_sort(arr):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(arr) <= 1:
        return arr
    # Three-way partition around the middle element as pivot.
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    # Recursively sort the partitions and concatenate.
    return quick_sort(left) + middle + quick_sort(right)

print(quick_sort([3, 6, 8, 10, 1, 2, 1]))  # [1, 1, 2, 3, 6, 8, 10]
Performance Metrics
Prediction Time: 1.64s
Total Time: 1.67s
All Input Parameters
{ "top_k": 50, "top_p": 0.9, "prompt": "write a quick sort algorithm in python.", "max_tokens": 512, "min_tokens": 0, "temperature": 0.6, "system_prompt": "You are an expert software engineer proficient in multiple programming languages.", "presence_penalty": 0, "frequency_penalty": 0 }
Input Parameters
- top_k: The number of highest-probability tokens to consider when generating output. If > 0, only the top k tokens are kept (top-k filtering).
- top_p: A cumulative-probability threshold. If < 1.0, only the smallest set of tokens whose cumulative probability is >= top_p is kept (nucleus filtering, described in Holtzman et al., http://arxiv.org/abs/1904.09751).
- prompt: The text prompt sent to the model.
- max_tokens: The maximum number of tokens the model should generate as output.
- min_tokens: The minimum number of tokens the model should generate as output.
- temperature: The value used to modulate the next-token probabilities; lower values make the distribution sharper and the output more deterministic.
- system_prompt: System prompt to send to the model. It is prepended to the prompt and helps guide model behavior. Ignored for non-chat models.
- stop_sequences: A comma-separated list of sequences at which to stop generation. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
- presence_penalty: Penalizes tokens that have already appeared in the output at all, encouraging the model to introduce new tokens (see the sampling sketch after this list).
- frequency_penalty: Penalizes tokens in proportion to how often they have already appeared in the output (see the sampling sketch after this list).
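The sketch below shows, in plain NumPy, one common way these parameters combine during decoding: the penalties adjust the logits, temperature rescales them, and top-k and top-p restrict the candidate set before sampling. It is illustrative only; the serving stack here (vLLM, per the logs below) implements this differently, and the function name and signature are inventions for this example.

# Illustrative NumPy sketch of top-k / top-p / temperature sampling with
# presence and frequency penalties. Hypothetical helper, not the model's
# actual implementation.
import numpy as np

def sample_next_token(logits, generated_ids, temperature=0.6,
                      top_k=50, top_p=0.9,
                      presence_penalty=0.0, frequency_penalty=0.0):
    logits = np.asarray(logits, dtype=np.float64).copy()

    # Presence penalty: flat subtraction for any token already generated;
    # frequency penalty: subtraction scaled by how often it appeared.
    for tok in set(generated_ids):
        logits[tok] -= presence_penalty + frequency_penalty * generated_ids.count(tok)

    # Temperature: rescale logits (lower temperature => sharper distribution).
    logits /= max(temperature, 1e-8)

    # Top-k filtering: drop everything below the k-th highest logit.
    if top_k > 0:
        kth = np.sort(logits)[-top_k]
        logits[logits < kth] = -np.inf

    # Softmax to probabilities.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p (nucleus) filtering: keep the smallest prefix of tokens,
    # sorted by descending probability, whose cumulative mass reaches top_p.
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        keep = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
        mask = np.zeros_like(probs)
        mask[order[:keep]] = probs[order[:keep]]
        probs = mask / mask.sum()

    return int(np.random.choice(len(probs), p=probs))

With the defaults shown above (temperature 0.6, top_k 50, top_p 0.9), decoding is mildly constrained but still stochastic, which suits code generation with some variety.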
Output Schema
Output
Example Execution Logs
INFO 07-08 19:19:23 async_llm_engine.py:584] Received request 3bea0322126645299a822ea89c4e4778: prompt: 'write a quick sort algorithm in python.', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.9, top_k=50, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[100001], include_stop_str_in_output=False, ignore_eos=False, max_tokens=512, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: None, lora_request: None.
INFO 07-08 19:19:23 metrics.py:341] Avg prompt throughput: 1.5 tokens/s, Avg generation throughput: 21.6 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.1%, CPU KV cache usage: 0.0%.
Generation took 1720466049.90s
INFO 07-08 19:19:24 async_llm_engine.py:134] Finished request 3bea0322126645299a822ea89c4e4778.
Formatted prompt: {system_prompt} User: {prompt} Assistant:
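The last log line shows the chat template the endpoint applies before inference. A rough reconstruction follows; the separators between the segments are an assumption, since the log collapses whitespace.

# Hypothetical reconstruction of the template from the log line above.
# The newline separators are an assumption; the log collapses whitespace.
def format_prompt(system_prompt: str, prompt: str) -> str:
    return f"{system_prompt}\n\nUser: {prompt}\n\nAssistant:"

print(format_prompt(
    "You are an expert software engineer proficient in multiple programming languages.",
    "write a quick sort algorithm in python.",
))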
Version Details
- Version ID: a182159595ee209e25e673a5ab6f0293111dd6733df973f77efcd1f851123330
- Version Created: July 8, 2024