pbevan1/llama-3.1-8b-ocr-correction 📝 → 📝

▶️ 52 runs 📅 Jul 2024 ⚙️ Cog 0.9.13
ocr-correction text-generation

About

Llama 3.1 8B, fine-tuned on a synthetic OCR dataset to correct errors in OCR-digitised text.

Example Output

Do Not Rule Out Apple iPod-Mac Tie-Up: Analyst (Reuters) Reuters - A Piper Jaffray analyst said on Thursday he does not rule out Apple Computer Inc. selling a lower-priced version of its Macintosh computer to attract consumers already enamored of its iPod music player and annoyed by security problems with Windows PCs.

Performance Metrics

10.29s Prediction Time
102.74s Total Time
All Input Parameters
{
  "inp": "Do Not Kule Oi't hy.er-l'rieed AjijqIi: imac - Analyst (fteuiers) Hcuiers - A | ) | ilf, <;/) in |) nter |iic . conic! deeiilf. l.o sell n lower-|)rieofl wersinn oi its Macintosh cornutor to nttinct ronsnnu-rs already euami'red ot its iPod music jiayo-r untl annoyoil. by sccnrit.y problems ivitJi Willtlows PCs , Piper.iaffray analyst. (Jcne Muster <aid on Tlinrtiday.",
  "top_k": 50,
  "top_p": 1,
  "do_sample": false,
  "instruction": "You are an assistant that takes a piece of text that has been corrupted during OCR digitisation, and produce a corrected version of the same text.",
  "temperature": 1,
  "max_new_tokens": 5000,
  "repetition_penalty": 1
}
Input Parameters
inp (required, string): Input text to correct
instruction (required, string): Instruction for the model
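
As a rough sketch, the two required inputs could be sent through Replicate's Python client as shown here. It assumes the `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set. `build_input` and `correct_ocr` are hypothetical helper names, not part of the model; the model reference, version ID, and default instruction are copied from this page.

```python
MODEL = "pbevan1/llama-3.1-8b-ocr-correction"
VERSION = "83ae7a3e0d977f581798ee6b8add1bc4e690bf30f5552589ddd12d6afaf1ec01"

# Instruction string taken verbatim from the example inputs on this page.
DEFAULT_INSTRUCTION = (
    "You are an assistant that takes a piece of text that has been "
    "corrupted during OCR digitisation, and produce a corrected version "
    "of the same text."
)


def build_input(text, instruction=DEFAULT_INSTRUCTION):
    """Build the input payload containing the two required fields."""
    if not text.strip():
        raise ValueError("inp must be a non-empty string")
    return {"inp": text, "instruction": instruction}


def correct_ocr(text, instruction=DEFAULT_INSTRUCTION):
    """Run the model on Replicate and return the corrected text (network call)."""
    import replicate  # imported lazily so build_input works without the package

    output = replicate.run(
        f"{MODEL}:{VERSION}",
        input=build_input(text, instruction),
    )
    # The output schema is a string; join defensively if chunks come back instead.
    return output if isinstance(output, str) else "".join(str(p) for p in output)
```

Calling `correct_ocr` with the corrupted text from the example should return a single corrected string, though the exact wording may differ between runs or versions.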
Output Schema

Output

Type: string

Example Execution Logs
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
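
These log lines are emitted by Hugging Face `transformers` when `generate` is called without an attention mask or a pad token. How this model's predictor calls `generate` is not shown on this page, so the following is only a minimal sketch, assuming a standard `transformers` tokenizer and causal language model, of how passing both explicitly usually avoids the warning. `generation_kwargs` is a hypothetical helper name.

```python
def generation_kwargs(tokenizer, prompt, max_new_tokens=5000):
    """Build keyword arguments for model.generate() with an explicit mask and pad token."""
    # Calling the tokenizer directly returns both input_ids and attention_mask.
    encoded = tokenizer(prompt, return_tensors="pt")

    # If the tokenizer defines no pad token, fall back to EOS, which is what
    # the log reports transformers doing implicitly.
    pad_id = tokenizer.pad_token_id
    if pad_id is None:
        pad_id = tokenizer.eos_token_id

    return {
        "input_ids": encoded["input_ids"],
        "attention_mask": encoded["attention_mask"],
        "pad_token_id": pad_id,
        "max_new_tokens": max_new_tokens,
        "do_sample": False,  # greedy decoding, matching the example inputs
    }


# Usage (assumes `model` and `tokenizer` were already loaded with transformers):
# output_ids = model.generate(**generation_kwargs(tokenizer, prompt))
```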
Version Details
Version ID: 83ae7a3e0d977f581798ee6b8add1bc4e690bf30f5552589ddd12d6afaf1ec01
Version Created: July 30, 2024