deepseek-ai/janus-pro-7b 🔢🖼️📝 → 📝

▶️ 13.3K runs 📅 Feb 2025 ⚙️ Cog 0.13.6
image-to-text ocr visual-question-answering

About

Janus-Pro is an autoregressive framework for multimodal understanding. This deployment takes an image and a question about it, and returns a text answer.

Example Output

Here is the formula in LaTeX code:

\[
A_n = a_0 \left[ 1 + \frac{3}{4} \sum_{k=1}^{n} \left( \frac{4}{9} \right)^k \right]
\]

Performance Metrics

2.53s Prediction Time
72.75s Total Time
All Input Parameters
{
  "seed": 42,
  "image": "https://replicate.delivery/pbxt/MR7VaVSjxG96N6hB8frEioG1sBaqsbV0Velueqh8yr7H9piP/equation.png",
  "top_p": 0.95,
  "question": "Convert the formula into latex code.",
  "temperature": 0.1
}
Input Parameters
seed Type: integer Default: 42
Random seed for reproducibility
image (required) Type: string
Input image for multimodal understanding
top_p Type: number Default: 0.95 Range: 0 - 1
Top-p sampling value
question (required) Type: string
Question about the image
temperature Type: number Default: 0.1 Range: 0 - 1
Temperature for text generation
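The parameters above map directly onto a request payload. The following is a minimal sketch of how a call might look from Python, assuming the third-party `replicate` client is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `ask_janus` are hypothetical and not part of any library; the defaults and ranges are the ones listed on this page.

```python
# Version ID taken from the "Version Details" section of this page.
MODEL_VERSION = (
    "deepseek-ai/janus-pro-7b:"
    "fbf6eb41957601528aab2b3f6d37a287015d9f486c3ac4ec6e80f04744ac1a32"
)


def build_input(image, question, seed=42, top_p=0.95, temperature=0.1):
    """Build the input payload (hypothetical helper), checking documented ranges."""
    if not image:
        raise ValueError("image is required")
    if not question:
        raise ValueError("question is required")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be in the range 0 - 1")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be in the range 0 - 1")
    return {
        "seed": seed,
        "image": image,
        "top_p": top_p,
        "question": question,
        "temperature": temperature,
    }


def ask_janus(image, question, **options):
    """Run the model and return its answer as a string (hypothetical helper)."""
    # Imported lazily so build_input can be used without the client installed.
    import replicate  # assumption: installed via `pip install replicate`

    output = replicate.run(
        MODEL_VERSION, input=build_input(image, question, **options)
    )
    # The output schema is a string; joining covers the case where a client
    # version hands the text back in chunks (an assumption, not documented here).
    return output if isinstance(output, str) else "".join(output)


# Example call (performs a network request and needs a valid API token):
# print(ask_janus(
#     "https://replicate.delivery/pbxt/MR7VaVSjxG96N6hB8frEioG1sBaqsbV0Velueqh8yr7H9piP/equation.png",
#     "Convert the formula into latex code.",
# ))
```

Validating locally before sending keeps out-of-range values such as `top_p=1.5` from costing a prediction. Keeping `temperature` low, as in the default of 0.1, is a reasonable choice for transcription-style questions like the formula example.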
Output Schema

Output

Type: string
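Because the output is a single string, any structure in the answer has to be recovered by the caller. As a rough sketch, the helper below pulls display-math blocks out of a response, assuming the model wraps formulas in `\[ ... \]` delimiters as in a typical LaTeX answer. The function name `extract_display_math` is hypothetical, and other prompts may produce differently formatted output.

```python
import re

# Matches text between a literal "\[" and the next literal "\]".
# "\left[" and "\right]" inside the formula do not match, because the
# backslash there is not immediately followed by a bracket.
DISPLAY_MATH = re.compile(r"\\\[(.*?)\\\]", re.DOTALL)


def extract_display_math(output: str) -> list[str]:
    """Return the contents of every \\[ ... \\] block in the output string."""
    return [block.strip() for block in DISPLAY_MATH.findall(output)]


sample = (
    "Here is the formula in LaTeX code:\n\n"
    "\\[\n"
    "A_n = a_0 \\left[ 1 + \\frac{3}{4} \\sum_{k=1}^{n} "
    "\\left( \\frac{4}{9} \\right)^k \\right]\n"
    "\\]\n"
)
formulas = extract_display_math(sample)
```

Here `formulas` holds one entry, the formula body without the surrounding delimiters or prose.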

Version Details
Version ID
fbf6eb41957601528aab2b3f6d37a287015d9f486c3ac4ec6e80f04744ac1a32
Version Created
February 3, 2025