This is a partial answer to some of your questions.

On Tue, Feb 01, 2005 at 05:00:34AM +0000, Stephen Tait wrote:
> Yargh, this has to be the worst job ever. A client has landed a rush job on
> us in that they need about 6GB of TIFF files (from scanned documents)
> converted into OCR'd PDF's within the next 3-4 days. Yay.
>
> The problem is the images we've been given are utter cack. Firstly, because
> of Kodak's seeming inability to produce decent drivers, the pages are
> inexplicably surrounded by huge black borders. Because the papers are all
> of different sizes and aspect ratios, there is no way I can do an
> ImageMagick/IrfanView batch crop on them. I found the GIMP's autocrop tool
> worked perfectly on the sample images I downloaded to my home machine, so
> I've set about making a batch process script for it.
>

I managed to learn to use perl-fu reasonably quickly. I'm not such a big
fan of the "Lots of Isolated Silly Parentheses" style of thing found in
scheme. It might also be easier to do the general file management stuff
in perl than in script-fu.

> Secondly, as well as mixed borders, we also have mixed TIFF formats. Most
> of the images are in standard 2-colour Group4 fax compression, but they're
> interleaved with 256 colour LZW TIFF's and 24bit colour JPEG TIFF's.
>
> So I'm frantically trying to come up with a script-fu for GIMP that can
> take all of this into account. I'm currently attempting to use the glob
> plugin to take care of the fact we have around 200 directories full of
> images (so it'll process /path/to/images/*/*.tif), although I'm not sure if
> the glob in GIMP 2.2.0 supports this (although I'll have access to 2.2.2 at
> work).
>
> Secondly, what is a drawable? I've seen vague references to it in the
> script-fu guides I've skimmed over, but I am at a loss to see where they
> fit into the bigger picture. Do I need to convert a file into a drawable,
> run the autocrop on that, and then pipe that into an image somehow?
>

I think it's something you can draw to - like a layer or a mask or a
channel, etc.

> Thirdly, does anyone know if there's some way to specify a "save this TIFF
> in the same format/colour depth you opened it in" option? If not, I can
> save all output as full colour LZW's and let Adobe Capture do the
> resampling, but it's going to cost a fortune in discspace and I doubt the
> client will be happy either.
>

One issue you will have is that gimp doesn't have a true bilevel mode (a
*big* shortfall if you ask me). It will load the faxg4 images as
greyscale, and the best it can do is to save as a paletted image with a
palette containing only two colours. You'll then have to convert back to
tiffg4 using ImageMagick - but this might be possible to do within your
perl script or maybe scheme as well - I've not looked far enough.

As far as telling the difference, maybe there is a gimp scripting
function to count the number of unique colours?

> Sorry to sound like a complete I-don't-know-what-I'm-doing, but I probably
> don't I'm chowing my way through HOWTO's in between trying to write the
> script as we speak, but would greatly appreciate any advice the community
> could give me, if only to be able to get to bed before the sun comes
> up Yay for still working at 5am
>
> P.S. if anyone has any other suggestions for applications that'll batch
> remove arbitary black borders (be they Linux or Windows, although I don't
> have the power to buy stuff myself) they too would be welcome.
>
>

-- 
David Purton
dcpurton@chariot.net.au

For the eyes of the LORD range throughout the earth to strengthen
those whose hearts are fully committed to him.
2 Chronicles 16:9a
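
P.S. The "count the unique colours, then convert back with ImageMagick"
idea could be sketched roughly like this. It is in Python rather than
perl or scheme, and it is only a sketch under assumptions: the function
names are made up for illustration, it assumes single-page TIFFs, it
assumes the ImageMagick 6 style `identify` and `convert` commands are on
the PATH, and treating "two colours or fewer" as "was originally Group4"
is a guess you would want to check against your own files.

```python
from pathlib import Path
import subprocess


def find_tiffs(root):
    """Return every .tif exactly one directory below root (root/*/*.tif)."""
    return sorted(Path(root).glob("*/*.tif"))


def unique_colours_cmd(path):
    # ImageMagick's %k format escape prints the number of unique colours.
    # "[0]" selects the first frame only (assumption: single-page files).
    return ["identify", "-format", "%k", f"{path}[0]"]


def choose_compression(n_colours):
    # Assumption: two colours or fewer means the scan was bilevel, so it
    # can go back to Group4; everything else is saved as LZW.
    return "Group4" if n_colours <= 2 else "LZW"


def convert_cmd(src, dst, compression):
    cmd = ["convert", str(src)]
    if compression == "Group4":
        # Group4 needs 1-bit data, so force black and white first.
        cmd.append("-monochrome")
    cmd += ["-compress", compression, str(dst)]
    return cmd


def recompress(src, dst):
    """Count colours in src, then write dst with a matching compression."""
    result = subprocess.run(unique_colours_cmd(src),
                            capture_output=True, text=True, check=True)
    compression = choose_compression(int(result.stdout.strip()))
    subprocess.run(convert_cmd(src, dst, compression), check=True)
```

You would run `recompress(src, dst)` on each file returned by
`find_tiffs("/path/to/images")` after gimp has done the autocrop and
saved its paletted output.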