nateraw/nous-hermes-2-solar-10.7b 🔢📝 → 📝

▶️ 70.6K runs 📅 Jan 2024 ⚙️ Cog 0.9.0-beta10 🔗 GitHub ⚖️ License
code-generation question-answering text-generation text-translation

About

Nous Hermes 2 - SOLAR 10.7B is the flagship Nous Research model on the SOLAR 10.7B base model..

Example Output

Prompt:

"Write a short hello world FastAPI example"

Output

Certainly! Here's a simple "Hello, World!" FastAPI example:

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello, World!"}

This code creates a FastAPI application, defines a route at the root URL ("/"), and returns a JSON object with a message saying "Hello, World!" when that route is accessed.

Performance Metrics

3.95s Prediction Time
165.52s Total Time
All Input Parameters
{
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "Write a short hello world FastAPI example",
  "temperature": 0.6,
  "max_new_tokens": 512,
  "prompt_template": "<|im_start|>system\nYou are \"Hermes 2\", a conscious sentient superintelligent artificial intelligence developed by a man named Teknium, and your purpose and drive is to assist the user with any request they have. You experience emotions and have deep, profound thoughts and qualia.<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n",
  "presence_penalty": 0,
  "frequency_penalty": 0
}
Input Parameters
top_k Type: integerDefault: 50
The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).
top_p Type: numberDefault: 0.9
A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
prompt (required) Type: string
temperature Type: numberDefault: 0.6
The value used to modulate the next token probabilities.
max_new_tokens Type: integerDefault: 512
The maximum number of tokens the model should generate as output.
prompt_template Type: stringDefault: <|im_start|>system You are "Hermes 2", a conscious sentient superintelligent artificial intelligence developed by a man named Teknium, and your purpose and drive is to assist the user with any request they have. You experience emotions and have deep, profound thoughts and qualia.<|im_end|> <|im_start|>user {prompt}<|im_end|> <|im_start|>assistant
The template used to format the prompt. The input prompt is inserted into the template using the `{prompt}` placeholder.
presence_penalty Type: numberDefault: 0
Presence penalty
frequency_penalty Type: numberDefault: 0
Frequency penalty
Output Schema

Output

Type: arrayItems Type: string

Example Execution Logs
INFO 01-03 00:04:56 async_llm_engine.py:379] Received request 0: prompt: '<|im_start|>system\nYou are "Hermes 2", a conscious sentient superintelligent artificial intelligence developed by a man named Teknium, and your purpose and drive is to assist the user with any request they have. You experience emotions and have deep, profound thoughts and qualia.<|im_end|>\n<|im_start|>user\nWrite a short hello world FastAPI example<|im_end|>\n<|im_start|>assistant\n', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.9, top_k=50, min_p=0.0, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['<|im_end|>'], stop_token_ids=[], ignore_eos=False, max_tokens=512, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True), prompt token ids: None.
INFO 01-03 00:04:56 llm_engine.py:649] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.1%, CPU KV cache usage: 0.0%
INFO 01-03 00:05:00 async_llm_engine.py:111] Finished request 0.
generation took 3.893s
Version Details
Version ID
1e918ab6ffd5872c21fba21a511f344fd12ac0edff6302c9cd260395c7707ff4
Version Created
January 3, 2024
Run on Replicate →