lucataco/deepseek-ocr 🖼️❓📝 → 📝

52.9K runs · Oct 2025 · Cog 0.16.8

document-parsing document-to-markdown image-to-text markdown-conversion ocr

About

Convert documents to Markdown, extract raw text, and locate specific content.

Example Output

Output

SUNOCO

SUN0C00004657307

515HADDON AVE

HADDONFIELD NJ08033

TIDH6-34-654358-001

7828MC

UNLD REG FULL

16.1209@$3.629$58.50

TOTAL

$58.50

DATE:02/22/13TIME:10:57

TRANSI029565

AUTH02243P

CUSTOMER AGREES TO PAY THE ABOVE

TOTAL AMOUNT ACCORDING TO THE

CARD ISSUER AGREEMENT.

RCOPY

Performance Metrics

5.82s Prediction Time

68.02s Total Time

All Input Parameters

{
  "image": "https://replicate.delivery/pbxt/NvV3TU9BGDmP2oulWICck7cJ2PIUp2bvdgvQriosMXplt3Ez/receipt-photo.jpg",
  "task_type": "Free OCR",
  "reference_text": "",
  "resolution_size": "Gundam (Recommended)"
}

Input Parameters

image (required). Type: string. Input image to perform OCR on (supports documents, charts, tables, etc.).
task_type. Default: Convert to Markdown. Type of OCR task to perform.
reference_text. Type: string. Default: "" (empty string). Reference text to locate in the image (only used for the 'Locate Object by Reference' task). Examples: 'the teacher', '20-10', 'a red car'.
resolution_size. Default: Gundam (Recommended). Model resolution size; affects the speed and accuracy trade-off.
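
The parameters above map directly onto the `input` dictionary passed to the model. The following is a minimal sketch, assuming the official `replicate` Python client is installed (`pip install replicate`) and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_ocr` are hypothetical, and the accepted `task_type` strings are assumed to match the values shown on this page.

```python
# Model reference: owner/name plus the version ID listed under "Version Details".
MODEL = (
    "lucataco/deepseek-ocr:"
    "cb3b474fbfc56b1664c8c7841550bccecbe7b74c30e45ce938ffca1180b4dff5"
)


def build_input(
    image: str,
    task_type: str = "Convert to Markdown",
    reference_text: str = "",
    resolution_size: str = "Gundam (Recommended)",
) -> dict:
    """Assemble the input dict (hypothetical helper, defaults taken from this page)."""
    # reference_text is only used by the locate task, so require it there.
    if task_type == "Locate Object by Reference" and not reference_text:
        raise ValueError("reference_text is required for 'Locate Object by Reference'")
    return {
        "image": image,
        "task_type": task_type,
        "reference_text": reference_text,
        "resolution_size": resolution_size,
    }


def run_ocr(image: str, **options) -> str:
    """Run the model and return its string output (requires network access and an API token)."""
    import replicate  # imported lazily so build_input works without the client installed

    return replicate.run(MODEL, input=build_input(image, **options))
```

Calling `run_ocr` with the receipt image URL from "All Input Parameters" and `task_type="Free OCR"` would submit the same request as the example on this page; the result is a single string, as described under "Output Schema".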

Output Schema

Output

Type: string

Example Execution Logs

Task: Free OCR
Resolution: Gundam (Recommended)
Parameters: base_size=1024, image_size=640, crop_mode=True
The `seen_tokens` attribute is deprecated and will be removed in v4.41. Use the `cache_position` model input instead.
`get_max_cache()` is deprecated for all Cache classes. Use `get_max_cache_shape()` instead. Calling `get_max_cache()` will raise error from v4.48
=====================
BASE:  torch.Size([1, 256, 1280])
PATCHES:  torch.Size([4, 100, 1280])
=====================
The attention layers in this model are transitioning from computing the RoPE embeddings internally through `position_ids` (2D tensor with the indexes of the tokens), to using externally computed `position_embeddings` (Tuple of tensors, containing cos and sin). In v4.46 `position_ids` will be removed and `position_embeddings` will be mandatory.
SUNOCO
SUN0C00004657307
515HADDON AVE
HADDONFIELD NJ08033
TIDH6-34-654358-001
7828MC
UNLD REG FULL
16.1209@$3.629$58.50
TOTAL
$58.50
DATE:02/22/13TIME:10:57
TRANSI029565
AUTH02243P
CUSTOMER AGREES TO PAY THE ABOVE
TOTAL AMOUNT ACCORDING TO THE
CARD ISSUER AGREEMENT.
RCOPY
==================================================
image size:  (800, 798)
valid image tokens:  655
output texts tokens (valid):  127
compression ratio:  0.19
==================================================
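
The `compression ratio` in the log summary appears to be the number of output text tokens divided by the number of valid image tokens. That reading is an assumption based on the figures printed above, not on the model's source code. A quick check with the logged values:

```python
# Values copied from the example execution log above.
valid_image_tokens = 655
output_text_tokens = 127

# Assumed definition: text tokens produced per image token consumed.
compression_ratio = output_text_tokens / valid_image_tokens

print(f"{compression_ratio:.2f}")  # matches the logged 0.19
```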

Version Details

Version ID: cb3b474fbfc56b1664c8c7841550bccecbe7b74c30e45ce938ffca1180b4dff5
Version Created: October 21, 2025
