deepseek-ai/janus-pro-1b 🔢🖼️📝 → 📝

▶️ 6.7K runs 📅 Feb 2025 ⚙️ Cog 0.13.7
image-to-text ocr visual-question-answering visual-understanding vqa

About

Janus-Pro is a novel autoregressive framework for multimodal understanding.

Example Output

Here is the formula in LaTeX code:

\[
A_n = a_0 \left[ 1 + \frac{3}{4} \sum_{k=1}^{n} \left( \frac{4}{9} \right)^k \right]
\]

Performance Metrics

Prediction Time: 2.55s
Total Time: 91.12s

All Input Parameters
{
  "seed": 42,
  "image": "https://replicate.delivery/pbxt/MUJhLC1lVS5HVeLXvmbOL1O2ESVNYCGNVoxqJumiRUn0Hl99/equation.png",
  "top_p": 0.95,
  "question": "Convert the formula into latex code.",
  "temperature": 0.1
}

Input Parameters

seed Type: integer, Default: 42
Random seed for reproducibility
image (required) Type: string
Input image for multimodal understanding
top_p Type: number, Default: 0.95, Range: 0 - 1
Top-p sampling value
question (required) Type: string
Question about the image
temperature Type: number, Default: 0.1, Range: 0 - 1
Temperature for text generation
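
The parameter list above can be turned into a small client-side check before a request is sent. The following is a minimal sketch, not part of the model's own code: the helper name build_input and the DEFAULTS table are hypothetical, and the defaults and ranges are simply copied from the list above.

```python
from typing import Any

# Defaults copied from the documented parameter list (assumed to be current).
DEFAULTS = {"seed": 42, "top_p": 0.95, "temperature": 0.1}


def build_input(image: str, question: str, **overrides: Any) -> dict[str, Any]:
    """Return an input payload, rejecting values outside the documented ranges."""
    if not image or not question:
        raise ValueError("'image' and 'question' are required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"image": image, "question": question, **DEFAULTS, **overrides}
    # top_p and temperature are both documented with the range 0 - 1.
    for name in ("top_p", "temperature"):
        if not 0 <= payload[name] <= 1:
            raise ValueError(f"{name} must be between 0 and 1, got {payload[name]}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(payload["seed"], bool) or not isinstance(payload["seed"], int):
        raise ValueError("seed must be an integer")
    return payload
```

Assuming the replicate Python package is installed and an API token is configured, the resulting dictionary could then be passed as the input argument of replicate.run, with the model referenced as deepseek-ai/janus-pro-1b followed by a colon and the version ID listed under Version Details.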
Output Schema

Output

Type: string
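
Because the output is a single string, a caller who wants only the formula has to separate it from the surrounding sentence. The sketch below shows one way to do that, assuming the model wraps display math in \[ ... \] or $$ ... $$ delimiters; that delimiter choice and the helper name extract_display_math are assumptions, not documented behavior.

```python
import re

# Matches display math delimited by \[ ... \] or $$ ... $$ (assumed delimiters).
_DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]|\$\$(.+?)\$\$", re.DOTALL)


def extract_display_math(output: str) -> list[str]:
    """Return the body of each display-math block found in the model output."""
    # findall yields one (group1, group2) tuple per match; the unused group is "".
    return [(a or b).strip() for a, b in _DISPLAY_MATH.findall(output)]
```

If no delimited block is found, the function returns an empty list, so a caller can fall back to using the raw string.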

Version Details
Version ID
eb4c5dffb46fb23d03a3b74e87ea36bfa830a0c3d5875ef2bfea310646ac8fd2
Version Created
February 12, 2025