Re: document archiving w/ scanner
On Wed, Jul 14, 2004 at 12:38:04AM -0400, Mark Roach wrote:
> On Sat, 2004-07-10 at 01:14 +0200, martin f krafft wrote:
> > also sprach William Ballard <40618.nospam@comcast.net> [2004.07.10.0041 +0200]:
> > > Search the archives for my and other's discussions about project
> > > gutenbergs tests with gocr and other open source OCR programs.
> >
> > great pointer. I guess the conclusion here is that gocr and clara
> > pretty much suck and for any serious work, I have to go with
> > OmniPage or other commercial products. Damn.
>
> At my last employer, I used Ascent Capture (on windows) to scan images
> and index them against a postgresql+debian server and used a wxPython
> application I wrote to search and view them. We used indexing info
> (date, names, etc.) instead of the text of the documents, but Ascent
> Capture can do that too. Obviously there are non-free parts to that
> solution, but that was the best I was able to come up with. If you'd
> like some more info on that setup feel free to drop me a line off-list.
I scan every piece of paper with my name on it, and every receipt (even
for bubble gum), and shred the originals. All you need is a clever
directory structure and some hacky little scripts. I scan most things
at 150dpi as .png and produce half-size (75dpi) images for eyeballing.
A script builds web pages with <img> tags of the 75dpi images, each
wrapped in an <a> link to the larger image. It works well enough. I
produce about 3GB of scans per year.
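For the curious, the page-building step could be sketched roughly like
this in Python. This is not my actual script, just a minimal
illustration; the function name build_index and the "small"
subdirectory for the half-size images are made up for the example.

```python
from html import escape
from pathlib import Path


def build_index(scan_dir, thumb_subdir="small"):
    """Return an HTML page that links each thumbnail to its full scan.

    Assumes (hypothetically) that every full-size NN.png in scan_dir
    has a half-size copy with the same name under thumb_subdir.
    """
    scan_dir = Path(scan_dir)
    links = []
    # Only the top level is globbed, so the thumbnails are not listed twice.
    for full in sorted(scan_dir.glob("*.png")):
        name = escape(full.name)
        thumb = escape(f"{thumb_subdir}/{full.name}")
        links.append(f'<a href="{name}"><img src="{thumb}" alt="{name}"></a>')
    body = "\n".join(links)
    return f"<html><body>\n{body}\n</body></html>\n"
```

Writing the returned string to an index.html inside the same directory
keeps the relative links working when the tree is moved around.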
The hardest part is shelling around a directory structure like:
/paper/d4/BigOldBank/40713/0{1,2,3}.png
/paper/d4/BigOldBank/Slips/Cash/40713-McDonalds.png
/paper/d4/PowerCompany/40614/...
/paper/d4/E-Broker/31231~30101/01.png
That's a lot of keystrokes when you're scanning. I wrote a little GUI
app, which I plan to put on SourceForge, that lets me pick the path
elements from lists and has a calendar to enter dates, then renames
the scans accordingly.
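The path-building part of such a tool might look something like the
sketch below. It is only an illustration, not the GUI app itself: the
names date_code and scan_path are invented, and it assumes the date
directories are the last digit of the year followed by MMDD (so
2004-07-13 becomes 40713), which is how the examples above read.

```python
from datetime import date
from pathlib import PurePosixPath


def date_code(d):
    """Encode a date as last-digit-of-year + MMDD (assumed convention)."""
    return f"{d.year % 10}{d:%m%d}"


def scan_path(root, group, party, d, page):
    """Build the target path for one scanned page.

    root, group and party are the list-picked elements, e.g.
    "/paper", "d4" and "BigOldBank"; page is the page number.
    """
    return PurePosixPath(root) / group / party / date_code(d) / f"{page:02d}.png"


# scan_path("/paper", "d4", "BigOldBank", date(2004, 7, 13), 1)
# gives /paper/d4/BigOldBank/40713/01.png
```

A freshly scanned file could then be moved into place with
pathlib.Path.rename once the target directory exists.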
I find things like PaperPort constraining. I like my hacky scripts:
like a lot of Linux things, they are a bit hacky, but they make you
happy!