datalab-to/ocr 🖼️🔢✓📝 → ❓
About
Detect and transcribe text in images with accurate bounding boxes, layout analysis, reading order, and table recognition, in 90 languages.
Example Output
{"text":"One Ring to rule them all
One Ring to find them
One Ring to bring them all
and in the darkness bind them
In the Land of Mordor
where the Shadows lie","pages":null,"page_count":null,"visualizations":["https://replicate.delivery/xezq/94Y888Ag936zHJVPRelwGT1eIKoAviU6w9TOSrF6BiVkmvfqA/visualization.jpg"]}
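A minimal sketch of reading the example output, assuming it is handled as the raw JSON text displayed above. The `text` value contains literal line breaks as shown on this page, which strict JSON parsing rejects, so the sketch passes `strict=False`. The helper name `parse_ocr_output` is hypothetical and not part of the model's API.

```python
import json

# Raw output copied from the example above. The "text" field holds literal
# newlines, exactly as displayed on the page.
RAW_OUTPUT = '''{"text":"One Ring to rule them all
One Ring to find them
One Ring to bring them all
and in the darkness bind them
In the Land of Mordor
where the Shadows lie","pages":null,"page_count":null,"visualizations":["https://replicate.delivery/xezq/94Y888Ag936zHJVPRelwGT1eIKoAviU6w9TOSrF6BiVkmvfqA/visualization.jpg"]}'''


def parse_ocr_output(payload: str) -> dict:
    """Parse the output JSON (hypothetical helper, not part of the model API).

    strict=False lets the decoder accept unescaped control characters,
    such as newlines, inside string values.
    """
    return json.loads(payload, strict=False)


result = parse_ocr_output(RAW_OUTPUT)
text_lines = result["text"].splitlines()   # one entry per transcribed line
visualization_urls = result["visualizations"]

for line in text_lines:
    print(line)
```

If a client library returns the output already decoded into a dictionary, the parsing step is unnecessary and only the field access applies.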
Performance Metrics
5.44s
Prediction Time
5.45s
Total Time
All Input Parameters
{
"file": "https://collections-zoo-output.replicate.dev/predictions/78j9fndvp1rmc0csx5gbhrqgpg/1760557063643-jaj2s8zkj6.jpg",
"visualize": true,
"skip_cache": false,
"return_pages": false
}
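The parameters above could be submitted roughly as follows. This is a sketch under assumptions rather than a confirmed recipe: it assumes Replicate's HTTP API accepts a POST to `/v1/predictions` with a `version` and an `input` object and a bearer token, and it uses the version ID listed under Version Details. The helper `build_prediction_request` and the placeholder token are hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: endpoint and body shape of Replicate's HTTP API.
# Check the current API documentation before relying on this.
API_URL = "https://api.replicate.com/v1/predictions"
# Version ID taken from the Version Details section of this page.
VERSION_ID = "3e6db0d5311d6fdc232eea333c1e26055ba4e542180043f12acb2967e5c77f4a"


def build_prediction_request(input_params: dict, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a prediction request. Hypothetical helper."""
    body = json.dumps({"version": VERSION_ID, "input": input_params}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Input parameters copied from the example above.
example_input = {
    "file": "https://collections-zoo-output.replicate.dev/predictions/78j9fndvp1rmc0csx5gbhrqgpg/1760557063643-jaj2s8zkj6.jpg",
    "visualize": True,
    "skip_cache": False,
    "return_pages": False,
}

request = build_prediction_request(example_input, api_token="YOUR_API_TOKEN")
# Sending is left out on purpose; urllib.request.urlopen(request) would do it.
```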
Input Parameters
- file (required)
- Input file. Must be one of: .pdf, .doc, .docx, .ppt, .pptx, .png, .jpg, .jpeg, .webp
- max_pages
- Maximum number of pages to process. Mutually exclusive with page_range: set one or the other, not both
- visualize
- Draw red polygons on the input image(s) to visualize detected text regions and return the annotated images
- page_range
- Pages to parse, given as a comma-separated list of page numbers and inclusive ranges, such as 0,5-10,20. Example: '0,2-4' processes pages 0, 2, 3, and 4. Mutually exclusive with max_pages: set one or the other, not both
- skip_cache
- Bypass the server-side cache and force re-processing. By default, identical requests are cached to save time and cost. Enable this to get fresh results
- return_pages
- Return detailed page information including text lines, bounding boxes, polygons, and character-level data. When disabled, only text and page_count will be returned
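The rules in the list above (allowed file extensions, the `page_range` syntax, and the mutual exclusion of `max_pages` and `page_range`) can be checked on the client before a request is sent. The following is an illustrative sketch only: `parse_page_range` and `build_input` are hypothetical helpers, and the server's actual validation may differ. Optional parameters are included only when given, so no default values are assumed.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions listed in the description of the "file" parameter.
ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx", ".ppt", ".pptx",
                      ".png", ".jpg", ".jpeg", ".webp"}


def parse_page_range(spec: str) -> list[int]:
    """Expand a spec such as '0,2-4' into page numbers (hypothetical helper)."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if end < start:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))  # ranges are inclusive
        else:
            pages.append(int(part))
    return pages


def build_input(file: str,
                max_pages: Optional[int] = None,
                page_range: Optional[str] = None,
                visualize: Optional[bool] = None,
                skip_cache: Optional[bool] = None,
                return_pages: Optional[bool] = None) -> dict:
    """Assemble the input object, enforcing the documented constraints."""
    if max_pages is not None and page_range is not None:
        raise ValueError("max_pages and page_range are mutually exclusive")
    extension = PurePosixPath(urlparse(file).path).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension!r}")
    if page_range is not None:
        parse_page_range(page_range)  # raises ValueError on malformed specs
    optional = {"max_pages": max_pages, "page_range": page_range,
                "visualize": visualize, "skip_cache": skip_cache,
                "return_pages": return_pages}
    # Only pass parameters the caller actually set.
    return {"file": file, **{k: v for k, v in optional.items() if v is not None}}
```

For example, `parse_page_range("0,2-4")` yields `[0, 2, 3, 4]`, matching the description of `page_range`, while `build_input("scan.pdf", max_pages=3, page_range="0-2")` raises `ValueError`.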
Output Schema
Example Execution Logs
Processing OCR with request ID: dogWFkJYbhvM-6FamjsYDA
OCR processed in 5.2sec
Version Details
- Version ID
- 3e6db0d5311d6fdc232eea333c1e26055ba4e542180043f12acb2967e5c77f4a
- Version Created
- October 20, 2025