# 📘 **Add OCR Layer – Windows Context Menu Script**

This script adds an OCR text layer to any PDF using **OCRmyPDF**, with smart handling for PDFs that already contain text.

It integrates directly into the Windows right-click menu, so you can right-click any PDF → **Add OCR Layer**.

- - -

# ✅ **Features**

*   ✔ Right-click any PDF to run OCR

*   ✔ Detects:

    *   Tagged PDFs

    *   PDFs with pre-existing OCR

*   ✔ Prompts user when text already exists:

    *   **R** → `--redo-ocr` (best for mixed raster/vector)

    *   **F** → `--force-ocr` (overwrite all text)

    *   **S** → Skip OCR

*   ✔ Produces a new file with `_ocr` appended to the original name (`original_filename_ocr.pdf`)

*   ✔ Works even when OCRmyPDF returns ambiguous exit codes
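
The prompt letters and the output naming rule listed above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the batch script itself; the names `CHOICE_FLAGS`, `flag_for_choice`, and `output_path_for` are hypothetical.

```python
from pathlib import Path

# Hypothetical mapping from the prompt letters to OCRmyPDF flags.
# "S" maps to None, meaning "do not run OCR at all".
CHOICE_FLAGS = {"R": "--redo-ocr", "F": "--force-ocr", "S": None}


def flag_for_choice(choice: str):
    """Return the OCRmyPDF flag for a prompt answer, or None for Skip."""
    key = choice.strip().upper()
    if key not in CHOICE_FLAGS:
        raise ValueError(f"expected R, F, or S, got {choice!r}")
    return CHOICE_FLAGS[key]


def output_path_for(pdf: Path) -> Path:
    """Build the output path in the same folder: name.pdf -> name_ocr.pdf."""
    return pdf.with_name(pdf.stem + "_ocr.pdf")
```

For example, `output_path_for(Path("report.pdf"))` yields `report_ocr.pdf`, and `flag_for_choice("r")` yields `--redo-ocr`.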


- - -

# 📦 **Installation Guide**

## 1\. Install Python

Install Python 3.11 or later from [https://www.python.org/downloads/](https://www.python.org/downloads/).

In the installer, be sure to check:

☑ **Add python.exe to PATH**

- - -

## 2\. Install OCRmyPDF

Open **Command Prompt** (Win+R → `cmd`) and install:

`pip install ocrmypdf`

OCRmyPDF also depends on several external programs, which the following steps install.

- - -

## 3\. Install Ghostscript

Ghostscript is required for rasterizing pages. If you use Chocolatey, install it from an elevated Command Prompt:

`choco install ghostscript`

Or download manually:
[https://ghostscript.com/releases/index.html](https://ghostscript.com/releases/index.html)

- - -

## 4\. Install Tesseract

OCRmyPDF does not bundle an OCR engine. It relies on Tesseract for text recognition, so Tesseract must be installed:
`choco install tesseract`

Or install manually from the UB Mannheim builds.
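
Once steps 2 to 4 are done, a quick way to confirm that everything is reachable is to look each program up on PATH. The sketch below uses `shutil.which` from the standard library. The executable names are assumptions: `gswin64c` is the usual 64-bit Ghostscript console binary on Windows, and your install may use a different name.

```python
import shutil

# Assumed executable names; adjust them if your installation differs.
REQUIRED_TOOLS = ("ocrmypdf", "gswin64c", "tesseract")


def missing_tools(names=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

Any name it reports is a candidate for the Troubleshooting section further down.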

- - -

## 5\. Copy the Script

Save the provided batch script as:

`C:\Tools\add_ocr_layer.bat`

(You may place it anywhere, but avoid locations that sync to the cloud.)

- - -

## 6\. Add “Add OCR Layer” to Right-Click Menu

### Automated (recommended)

Create a `.reg` file with the following contents:

```reg
Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer]
@="Add OCR Layer"

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer\command]
@="\"C:\\Tools\\add_ocr_layer.bat\" \"%1\""
```

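Double-click the file and confirm the prompts to import it. Because the key sits under `*`, the menu entry appears for every file type, not only PDFs.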
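
If the script lives somewhere other than `C:\Tools`, the path inside the `.reg` file needs its backslashes doubled and its quotes escaped. The helper below is a hypothetical sketch (`build_reg_text` is not part of the provided script) that generates the registry text for an arbitrary script path.

```python
def build_reg_text(script_path: str, label: str = "Add OCR Layer") -> str:
    """Return .reg file text that registers `script_path` under the `*` class."""
    # Inside a .reg string value, each backslash must be written twice.
    escaped = script_path.replace("\\", "\\\\")
    key = "HKEY_CLASSES_ROOT\\*\\shell\\" + label
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'@="{label}"',
        "",
        f"[{key}\\command]",
        # Quotes around the script path and around %1 are escaped as \"
        f'@="\\"{escaped}\\" \\"%1\\""',
    ]
    return "\n".join(lines) + "\n"
```

Calling `build_reg_text("C:\\Tools\\add_ocr_layer.bat")` produces the same registry entries as the manual version; pass a different path to point the menu entry elsewhere.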

### Manual (if needed)

In Registry Editor (`regedit`), navigate to:

`Computer\HKEY_CLASSES_ROOT\*\shell\`

1.  Create a key named `Add OCR Layer`.

2.  Inside it, create a key named `command`.

3.  Set the default value of `command` to:

    `"C:\Tools\add_ocr_layer.bat" "%1"`

- - -

# ▶️ **Usage**

### **Right-click any PDF → Add OCR Layer**

The script will:

1.  Show the file path

2.  Run OCRmyPDF

3.  Detect if pages contain text

4.  If text is found, prompt:

    ```
    Choose how to proceed:
      R = Use --redo-ocr (raster areas only)
      F = Use --force-ocr (overwrite all text)
      S = Skip OCR
    ```

5.  Save the OCR’d file as `original_filename_ocr.pdf`.
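
The flow above can be approximated in Python as follows. This is a sketch under stated assumptions, not the batch script's actual logic: it assumes OCRmyPDF signals existing text either with exit code 6 (its "already done OCR" code) or with `PriorOcrFoundError` or `TaggedPDFError` in its error output. The function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumption: these names appear in OCRmyPDF's error output when the PDF
# already contains text.
TEXT_MARKERS = ("PriorOcrFoundError", "TaggedPDFError")


def run_ocrmypdf(args):
    """Run the ocrmypdf CLI and return (exit_code, stderr_text)."""
    proc = subprocess.run(["ocrmypdf", *args], capture_output=True, text=True)
    return proc.returncode, proc.stderr


def add_ocr_layer(pdf: Path, ask, run=run_ocrmypdf):
    """Follow the steps above; return the output path, or None if skipped or failed."""
    out = pdf.with_name(pdf.stem + "_ocr.pdf")
    print(f"Input: {pdf}")                      # step 1: show the file path
    code, err = run([str(pdf), str(out)])       # step 2: plain OCR attempt
    if code == 0:
        return out
    # Step 3: decide whether the failure means "text already present".
    if code == 6 or any(marker in err for marker in TEXT_MARKERS):
        choice = ask("R/F/S? ").strip().upper() # step 4: prompt the user
        flag = {"R": "--redo-ocr", "F": "--force-ocr"}.get(choice)
        if flag is None:
            return None                         # "S" or unrecognized: skip OCR
        code, err = run([flag, str(pdf), str(out)])
        return out if code == 0 else None       # step 5: file saved on success
    return None
```

In interactive use this would be called as `add_ocr_layer(Path(sys.argv[1]), ask=input)`. The `run` parameter exists so the flow can be exercised without OCRmyPDF installed.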

- - -

# ⚠ Troubleshooting

### **Ghostscript not found (‘gs’ missing)**

Install via Chocolatey:

`choco install ghostscript`

Or add Ghostscript’s `bin` folder to PATH manually.

- - -

### **OCRmyPDF not found**

Ensure the Python Scripts folder is on PATH. For a default per-user install of Python 3.12, it looks like:
`C:\Users\<you>\AppData\Local\Programs\Python\Python312\Scripts\`
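
To find the Scripts folder for the interpreter you actually have, and to check whether PATH already lists it, a small sketch using the standard `sysconfig` module may help. The function name `scripts_dir_on_path` is hypothetical, and packages installed with `pip install --user` may land in a different folder than the one `sysconfig` reports.

```python
import os
import sysconfig


def _normalize(path: str) -> str:
    """Normalize separators, trailing slashes, and (on Windows) letter case."""
    return os.path.normcase(os.path.normpath(path))


def scripts_dir_on_path(path_value=None, scripts_dir=None) -> bool:
    """Report whether the interpreter's Scripts folder is listed in PATH."""
    scripts_dir = scripts_dir or sysconfig.get_path("scripts")
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [_normalize(p) for p in path_value.split(os.pathsep) if p]
    return _normalize(scripts_dir) in entries


if __name__ == "__main__":
    print("Scripts folder:", sysconfig.get_path("scripts"))
    print("Listed in PATH:", scripts_dir_on_path())
```

If the second line prints `False`, add the printed folder to PATH and open a new Command Prompt.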

- - -

### **TaggedPDFError appears and OCR stops**

The script handles this case automatically and shows the R/F/S prompt described under Usage.

- - -

# 🧪 Tested On

*   Windows 10

*   Windows 11

*   Python 3.12

*   OCRmyPDF 15.x

*   Ghostscript 10.x

*   Tesseract 5.x