Bug#1017872: RFA: ocrmypdf -- add an OCR text layer to PDF files

To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: Bug#1017872: RFA: ocrmypdf -- add an OCR text layer to PDF files
From: Sean Whitton <spwhitton@spwhitton.name>
Date: Sun, 21 Aug 2022 14:53:17 -0700
Message-id: <YwKpTfcU/gqnqmfH@melete.silentflame.com>
Reply-to: Sean Whitton <spwhitton@spwhitton.name>, 1017872@bugs.debian.org

Package: wnpp
Severity: normal
X-Debbugs-Cc: debian-python@lists.debian.org, barlow.jim@gmail.com
Control: affects -1 src:ocrmypdf

I request an adopter for the ocrmypdf package.  I don't use it as often
as I did (hardly ever in the past couple of years), and in any case it
would be better for a Python programmer to maintain it.

The package description is:
 OCRmyPDF generates a searchable PDF/A file from a regular PDF
 containing only images, allowing it to be searched.
 .
 It uses the Tesseract OCR engine and so supports all the languages
 that Tesseract does.
 .
 Some other main features:
 .
   * Places OCR text accurately below the image to ease copy / paste
   * Keeps the exact resolution of the original embedded images
   * When possible, inserts OCR information as a lossless operation
     without rendering vector information
   * Keeps file size about the same
   * If requested deskews and/or cleans the image before performing OCR
   * Validates input and output files
   * Provides debug mode to enable easy verification of the OCR results
   * Processes pages in parallel when more than one CPU core is
     available
   * Battle-tested on thousands of PDFs, a test suite and continuous
     integration.

-- 
Sean Whitton
