[OT] Splitting multipage TIFF files
A client has landed us with a bunch of CDs containing several large
multipage TIFF images. I ran them through our usual conversion tools
(namely a find script that passes the images through tiffsplit), and all
seemed fine and dandy.
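
For reference, the batch step amounts to something like the following. This is a minimal Python sketch rather than our actual find script: the function names, the output directory and the "_" naming scheme are assumptions made for illustration. It relies only on the standard `tiffsplit input [prefix]` invocation.

```python
import subprocess
from pathlib import Path

TIFF_SUFFIXES = (".tif", ".tiff")


def build_split_commands(root, out_dir):
    """Return one tiffsplit argument list per TIFF file found under root."""
    commands = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in TIFF_SUFFIXES:
            # tiffsplit appends its own suffix (aaa.tif, aab.tif, ...) to the
            # prefix, so each input gets a distinct prefix (hypothetical scheme).
            prefix = str(Path(out_dir) / (path.stem + "_"))
            commands.append(["tiffsplit", str(path), prefix])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_all(build_split_commands("/mnt/cdrom", "/tmp/pages"))` would then split every TIFF on the disc (both paths are placeholders).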
However, on inspecting the output, it turns out that half of the files are
JPEG-compressed images in a TIFF wrapper, which tiffsplit can't handle, and
neither can our £5000-worth of Adobe Capture. The result is a garbled TIFF
that I can't even render, let alone OCR.
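
To find out which files are affected before splitting them, one option is to read the Compression tag (tag 259) of each page straight from the TIFF directories. The sketch below does that without libtiff. It assumes classic TIFF only (not BigTIFF), and the function names are made up for illustration. Per the TIFF specification, a value of 6 means old-style (TIFF 6.0) JPEG and 7 means the newer JPEG scheme.

```python
import struct

# Compression tag values that indicate JPEG data inside the TIFF.
JPEG_SCHEMES = {6: "old-style JPEG (TIFF 6.0)", 7: "JPEG"}
TAG_COMPRESSION = 259


def page_compressions(f):
    """Return the Compression value of every page in a classic TIFF.

    f is a binary file object; only the directories are read, so the
    image data of a large file is never loaded into memory.
    """
    header = f.read(8)
    if header[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif header[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")

    result = []
    seen = set()  # guards against a corrupt file with looping directories
    while offset and offset not in seen:
        seen.add(offset)
        f.seek(offset)
        (count,) = struct.unpack(endian + "H", f.read(2))
        entries = f.read(12 * count)
        compression = 1  # the TIFF default (uncompressed) when the tag is absent
        for i in range(count):
            (tag,) = struct.unpack_from(endian + "H", entries, 12 * i)
            if tag == TAG_COMPRESSION:
                # A SHORT value sits in the first two bytes of the value field.
                (compression,) = struct.unpack_from(endian + "H", entries, 12 * i + 8)
        result.append(compression)
        (offset,) = struct.unpack(endian + "I", f.read(4))
    return result


def jpeg_pages(f):
    """Return the zero-based indices of pages that use a JPEG scheme."""
    return [i for i, c in enumerate(page_compressions(f)) if c in JPEG_SCHEMES]
```

Opening each file with `open(path, "rb")` and passing it to `jpeg_pages` gives the list of pages that are likely to trip up tiffsplit.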
A Google search turned up the following snippet from the libtiff mailing list:
2004.02.26 10:27 "Re: tiffsplit & JPEG compression", by Andrey Kiselev
On Wed, Feb 25, 2004 at 06:13:04PM +0300, Artem Mirolubov wrote:
> tiffsplit dont copy pages with JPEG compression.
> what tags i must copy with CopyField, to add such support?
I have fixed that problem, thank you for report. We need to copy
contents of the TIFFTAG_JPEGTABLES tag.
> And what tags i must copy, to add support of TIFF files with JPEG
> compression version 6.0 specification (Plz dont tell i dont need it. I
> really need it! And i defined "never" in "tif_ojpeg.c":) ?
Well, if you have enough sample files you can experiment with all tags,
defined in ojpegFieldInfo (see tif_ojpeg.c file).
A lot of that is Greek to me, but the gist seems to be that tiffsplit has
been fixed to copy JPEG-compressed pages.
Does anyone know if these changes have made it into the current versions of
the Debian TIFF utils? Or do I need to build myself a customised TIFF
library (argh!)? Failing that, does anyone know of an alternative way to
batch-convert JPEG-TIFFs (preferably in Linux)? I've already tried using
ImageMagick, but it has some serious problems dealing with multipage TIFFs
(namely trying to load all 80MB of a file at once, and running the system
into the ground).
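
One thing that might sidestep the memory problem is asking ImageMagick for a single page at a time, using its `file[N]` frame-selection syntax, instead of handing it the whole file. The sketch below only builds the commands. The function name, the output naming and the need to know the page count in advance are all assumptions, and I haven't confirmed that this keeps memory use down on these particular files.

```python
def build_page_commands(src, page_count, out_prefix):
    """Return one ImageMagick 'convert' argument list per page of src."""
    commands = []
    for page in range(page_count):
        # "scan.tif[3]" selects only the fourth page (frames are zero-based).
        source = "%s[%d]" % (src, page)
        target = "%s%04d.tif" % (out_prefix, page)  # hypothetical naming scheme
        commands.append(["convert", source, target])
    return commands
```

Each argument list can be passed to `subprocess.run(argv, check=True)`. Passing the arguments as a list avoids having to quote the square brackets for the shell.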
If anyone out there has more of a clue than me, I'd be much obliged!