pbevan1/llama-3.1-8b-ocr-correction 📝 → 📝
About
Llama 3.1 8B, fine-tuned on a synthetic OCR dataset to correct errors in OCR-digitised text.
Example Output
Do Not Rule Out Apple iPod-Mac Tie-Up: Analyst (Reuters) Reuters - A Piper Jaffray analyst said on Thursday he does not rule out Apple Computer Inc. selling a lower-priced version of its Macintosh computer to attract consumers already enamored of its iPod music player and annoyed by security problems with Windows PCs.
Performance Metrics
- Prediction Time: 10.29s
- Total Time: 102.74s
All Input Parameters
{
  "inp": "Do Not Kule Oi't hy.er-l'rieed AjijqIi: imac - Analyst (fteuiers) Hcuiers - A | ) | ilf, <;/) in |) nter |iic . conic! deeiilf. l.o sell n lower-|)rieofl wersinn oi its Macintosh cornutor to nttinct ronsnnu-rs already euami'red ot its iPod music jiayo-r untl annoyoil. by sccnrit.y problems ivitJi Willtlows PCs , Piper.iaffray analyst. (Jcne Muster <aid on Tlinrtiday.",
  "top_k": 50,
  "top_p": 1,
  "do_sample": false,
  "instruction": "You are an assistant that takes a piece of text that has been corrupted during OCR digitisation, and produce a corrected version of the same text.",
  "temperature": 1,
  "max_new_tokens": 5000,
  "repetition_penalty": 1
}
Input Parameters
- inp (required): Input text to correct
- instruction (required): Instruction for the model
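The parameters above can be assembled into a request programmatically. The following is a minimal sketch, assuming the model is called through the `replicate` Python client with `REPLICATE_API_TOKEN` set in the environment. The helper names `build_input` and `correct_ocr` are hypothetical. The sampling values are copied from the example request on this page and are not documented defaults. The shape of the returned output (a single string or an iterator of chunks) is also an assumption, so the sketch handles both.

```python
# Model reference: owner/name from the page title plus the version ID listed on this page.
MODEL_REF = (
    "pbevan1/llama-3.1-8b-ocr-correction:"
    "83ae7a3e0d977f581798ee6b8add1bc4e690bf30f5552589ddd12d6afaf1ec01"
)

# Instruction text taken verbatim from the example request.
INSTRUCTION = (
    "You are an assistant that takes a piece of text that has been corrupted "
    "during OCR digitisation, and produce a corrected version of the same text."
)

# Values copied from the example request (assumed starting points, not documented defaults).
EXAMPLE_SETTINGS = {
    "top_k": 50,
    "top_p": 1,
    "do_sample": False,
    "temperature": 1,
    "max_new_tokens": 5000,
    "repetition_penalty": 1,
}


def build_input(text, instruction=INSTRUCTION, **overrides):
    """Build the input payload; keyword overrides replace the example settings."""
    payload = {"inp": text, "instruction": instruction, **EXAMPLE_SETTINGS}
    payload.update(overrides)
    return payload


def correct_ocr(text, **overrides):
    """Send one request to the hosted model (needs network access and an API token)."""
    import replicate  # imported lazily so build_input works without the client installed

    output = replicate.run(MODEL_REF, input=build_input(text, **overrides))
    # Assumption: the output is either a string or an iterable of string chunks.
    return output if isinstance(output, str) else "".join(output)
```

A call such as `correct_ocr("Tlie qnick hrown fox", max_new_tokens=256)` would then return the corrected text. Lowering `max_new_tokens` for short inputs is a judgment call and is not something the page recommends.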
Output Schema
Output
Example Execution Logs
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results. Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
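The warning above comes from the Hugging Face `transformers` generation code and is harmless for a single unpadded input. For anyone running the weights locally rather than through the hosted endpoint, the sketch below shows what the message asks for: pass the tokenizer's `attention_mask` to `generate` and set `pad_token_id` explicitly. It assumes `model` and `tokenizer` are already-loaded `transformers` objects. The function name `generate_correction` is hypothetical, and how the hosted version combines `instruction` and `inp` into a prompt is not documented here, so the prompt is left to the caller.

```python
def generate_correction(model, tokenizer, prompt, max_new_tokens=512):
    """Greedy generation with an explicit attention mask and pad token id."""
    # The tokenizer returns both input_ids and attention_mask.
    encoded = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=encoded["input_ids"],
        attention_mask=encoded["attention_mask"],  # silences the attention-mask warning
        pad_token_id=tokenizer.eos_token_id,       # same fallback the log reports, made explicit
        max_new_tokens=max_new_tokens,
        do_sample=False,                           # matches "do_sample": false in the example
    )
    # For a causal LM the decoded sequence includes the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```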
Version Details
- Version ID: 83ae7a3e0d977f581798ee6b8add1bc4e690bf30f5552589ddd12d6afaf1ec01
- Version Created: July 30, 2024