Re: Debian app to read some MS file format?

On Sat, Sep 06, 2003 at 09:07:54PM -0600, Dave Thayer wrote:
> On Fri, Sep 05, 2003 at 08:02:24PM +0100, Pigeon wrote:
> > I've just been given a copy of the Farnell Electronics catalogue CD.
> > This has the unfortunate design of wanting to install some Windoze
> > package in order to read the catalogue.
> > 
> > I suspect that they've used a customised version of something fairly
> > common in the M$ world; the CD has a number of directories named
> > things like 'datadb', 'tabledb', 'worddb', containing files with the
> > suffixes '.dat' and '.idx'.
> IIRC, the Adobe Acrobat catalog utility uses file names like this.
> This utility is used to make a full-text searchable index of a PDF
> collection. Perhaps there's another directory tree containing a bunch
> of PDFs.
> You have to use a version of Acrobat Reader with Search built in to
> access this index, but the Linux version lacks this. You should still
> be able to view the PDFs with Linux Acrobat, xpdf or ghostscript
> without the search capability.

Unfortunately, there are no separate PDFs. The .dat files appear to be
files of many different types - HTML, PNG, JPEG, and presumably PDF
(although I haven't positively identified a PDF yet, I know there
should be some there) - all concatenated together. It would be
possible - if tedious - to manually split the different bits apart,
but giving them meaningful filenames would be another matter - it
would mean manually indexing over 400MB...
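
For what it's worth, finding the boundaries could probably be
automated. Below is a rough Python sketch, not something I have run
against the CD: it scans a blob for the well-known leading bytes of
PNG, JPEG, PDF and HTML and lists the offsets. The signature table and
the names find_signatures() and scan_file() are my own inventions. The
3-byte JPEG marker will certainly throw up false hits inside other
binary data, and a PDF may itself embed JPEGs, so the output is only a
starting point.

```python
# Sketch only: the signature table and function names are illustrative
# assumptions, not anything taken from the catalogue software.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",   # full 8-byte PNG signature
    b"\xff\xd8\xff": "jpeg",       # JPEG start-of-image marker (short, so false hits are likely)
    b"%PDF-": "pdf",
    b"<HTML": "html",
    b"<html": "html",
}


def find_signatures(data):
    """Return a sorted list of (offset, kind) for every signature found in data."""
    hits = []
    for magic, kind in SIGNATURES.items():
        start = 0
        while True:
            pos = data.find(magic, start)
            if pos == -1:
                break
            hits.append((pos, kind))
            start = pos + 1
    return sorted(hits)


def scan_file(path):
    """Read one .dat file completely into memory and list candidate boundaries."""
    with open(path, "rb") as f:
        return find_signatures(f.read())
```

Calling scan_file() on one of the .dat files would give a list of
candidate offsets to cut at. It does nothing about the naming problem.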

I did find HTML tags like

<META name="oracleid" value="229934">
<META name="tocpath" value="1::Book 2;1::Electronic Components;1::Component Packaging - miniReelTM, Reels & Tubes;1::Capacitor/Resistor Networks">
<META name="smd_tiff" value="">
<META name="PageNo" value="">

in one of the .dat files, though only that one - perhaps an Oracle
database is involved? From what I can make out, dbishell can read
Oracle data, but it needs the Perl module DBD::Oracle to be built, which
requires some of the proprietary Oracle code. So it looks like I may
be out of luck. In any case, dbishell is a command-line tool, and I
need a graphical app to read the graphical data.
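
Those META tags can at least be pulled out mechanically. The sketch
below uses Python's standard html.parser module to collect name/value
pairs from an HTML fragment. The names MetaCollector, extract_meta()
and toc_titles() are made up for illustration, and the way toc_titles()
splits "tocpath" (entries separated by ';', each of the form
'number::title') is only my guess from the one example above.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <META name=... value=...> pairs (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "meta":
            d = dict(attrs)
            if "name" in d:
                self.meta[d["name"]] = d.get("value") or ""


def extract_meta(html_text):
    """Return a dict mapping META names to their values."""
    parser = MetaCollector()
    parser.feed(html_text)
    parser.close()
    return parser.meta


def toc_titles(tocpath):
    """Split a tocpath value into titles; the format is an assumption."""
    titles = []
    for part in tocpath.split(";"):
        _, sep, title = part.partition("::")
        titles.append(title if sep else part)
    return titles
```

If every HTML chunk carries a tocpath like this, the titles might do
as a basis for filenames, but so far I have only seen the tags in one
file.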

I've also found some more database files with different suffixes,
which may give someone another clue:



