cudanexus/nougat 🖼️ → 🖼️

▶️ 242 runs 📅 Jan 2024 ⚙️ Cog 0.8.6 🔗 GitHub 📄 Paper ⚖️ License
ocr pdf-to-text

About

Nougat: Neural Optical Understanding for Academic Documents

Performance Metrics

18.41s Prediction Time
390.32s Total Time

Input Parameters

pdf_file (required) Type: string
The PDF file to process.

Output Schema

Output

Type: string, Format: uri
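
The input and output schema above can be exercised from Python roughly as follows. This is a minimal sketch, not the author's documented usage. It assumes the official `replicate` client package is installed, that `REPLICATE_API_TOKEN` is set in the environment, and that `pdf_file` accepts a publicly reachable URL (the schema only says the type is string). The helper names `build_input` and `run_nougat` are hypothetical.

```python
# Version ID copied from the "Version Details" section of this page.
VERSION_ID = "d0b4e90da423598ff84debc9115bf891dd819843600ad842c0c178e3571f9e76"
MODEL_REF = f"cudanexus/nougat:{VERSION_ID}"


def build_input(pdf_url: str) -> dict:
    """Build the input payload; `pdf_file` is the only documented parameter."""
    if not pdf_url:
        raise ValueError("pdf_file is required")
    return {"pdf_file": pdf_url}


def run_nougat(pdf_url: str):
    """Run the model (hypothetical helper; needs network access and an API token)."""
    import replicate  # imported lazily so the helpers above work without it

    # Per the output schema, the result should point at the generated file by URI.
    # Depending on the client version this may be a URL string or a file-like object.
    return replicate.run(MODEL_REF, input=build_input(pdf_url))
```

Pinning the full version ID keeps results reproducible if the model owner later publishes a new version.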

Example Execution Logs
file_name is - /tmp/tmpxzhzbfnicalculus00marciala_0136.pdf
running---------subprocess
CompletedProcess(args=['nougat', '--out', 'output', 'pdf', '/tmp/tmpxzhzbfnicalculus00marciala_0136.pdf', '--checkpoint', 'nougat', '--markdown', '--no-skipping'], returncode=0, stdout='', stderr='/root/.pyenv/versions/3.8.18/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.)\n  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]\n\n  0%|          | 0/1 [00:00<?, ?it/s]INFO:root:Processing file /tmp/tmpxzhzbfnicalculus00marciala_0136.pdf with 1 pages\n\n100%|██████████| 1/1 [00:06<00:00,  6.30s/it]\n100%|██████████| 1/1 [00:06<00:00,  6.30s/it]\n-> Cannot close object, library is destroyed. This may cause a memory leak!\n')
----------------- /src
/src/output/tmpxzhzbfnicalculus00marciala_0136_formatted.txt
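
The log above shows the wrapper shelling out to the `nougat` command-line tool. The sketch below rebuilds that same argument list for an arbitrary PDF path, copying the arguments verbatim from the logged `CompletedProcess`. The function names are hypothetical. The `_formatted.txt` naming only mirrors the last log line, and how the wrapper produces that file is not shown on this page. The use of `capture_output=True, text=True` is an assumption, consistent with the string-valued `stdout` and `stderr` in the log.

```python
import subprocess
from pathlib import PurePosixPath


def build_nougat_command(pdf_path: str, out_dir: str = "output") -> list:
    """Return the argv list seen in the log, with the PDF path substituted."""
    return [
        "nougat", "--out", out_dir, "pdf", pdf_path,
        "--checkpoint", "nougat", "--markdown", "--no-skipping",
    ]


def formatted_output_path(pdf_path: str, out_dir: str = "/src/output") -> str:
    """Mirror the final log line: <out_dir>/<pdf stem>_formatted.txt."""
    stem = PurePosixPath(pdf_path).stem
    return str(PurePosixPath(out_dir) / f"{stem}_formatted.txt")


def run_nougat_cli(pdf_path: str) -> subprocess.CompletedProcess:
    """Run the command; requires the `nougat` CLI to be installed locally."""
    # Capturing output as text is an assumption based on the logged result.
    return subprocess.run(
        build_nougat_command(pdf_path), capture_output=True, text=True
    )
```

Passing the command as a list rather than a shell string avoids quoting problems with temporary file names.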

Version Details

Version ID: d0b4e90da423598ff84debc9115bf891dd819843600ad842c0c178e3571f9e76
Version Created: January 3, 2024