
Re: Magic for OpenOffice (file)



On Tue, 25 May 2004 07:56:28 -0700
keith@ahapala.net (Keith Nasman) wrote:

> On Tue, May 25, 2004 at 09:47:16AM -0400, Gregory Seidman wrote:
> > On Tue, May 25, 2004 at 08:33:19AM +0200, Magnus Therning wrote:
> > } A quick search in Google didn't reveal any solution (only found one
> > } reference, in Japanese).
> > } 
> > }  $ file -i file.sxw
> > }  file.sxw: application/x-zip
> > } 
> > } It would be really nice if 'file' could give the correct type for
> > } OpenOffice documents. Anyone who has an entry for /etc/magic that
> > } makes it happen?
> > 
> > This is a deeper problem than just OOorg. There has been
> > dissatisfaction with file's reporting for
> > compressed/gzipped/bzipped/zipped files for a good long time, and
> > the idea of having file actually decompress some of the data to get
> > a more accurate result has come up in the past.
<snip>
> > 
> > Ultimately, I'd love to see it done, and I encourage you to get
> > programming.
> 
> What's kind of ironic is that the first line of the files states the
> MIME type in ASCII.
> 
> keith@r31:~/docs$ strings test.sxw | head -n1
> mimetypeapplication/vnd.sun.xml.writerPK
> keith@r31:~/docs$ strings test.sxc | head -n1
> mimetypeapplication/vnd.sun.xml.calcPK

Actually, only files written by OO.org 1.1 declare their MIME type.
Running strings test.sxw | head -n1 on a file created by OO.org 1.0
simply returns "content.xml".

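To make the difference concrete, here is a rough Python sketch of the
kind of check a magic entry would have to perform: parse the first local
file header of the zip and see whether the first member is an
uncompressed file named mimetype. The function names (first_entry,
sniff_mimetype, make_sample) are made up for illustration, and the sample
archives are built in memory to imitate the two layouts; they are not
real OO.org output.

```python
import io
import struct
import zipfile


def first_entry(data):
    """Return (name, payload) for the first zip local file header.

    payload is the raw bytes if the member is stored uncompressed,
    otherwise None. Returns None if data does not start with a zip header.
    """
    if len(data) < 30 or data[:4] != b"PK\x03\x04":
        return None
    # The local file header is 30 fixed bytes, little-endian.
    (_sig, _version, _flags, method, _mtime, _mdate, _crc,
     csize, _usize, name_len, extra_len) = struct.unpack(
        "<4sHHHHHIIIHH", data[:30])
    name = data[30:30 + name_len].decode("ascii", "replace")
    start = 30 + name_len + extra_len
    payload = data[start:start + csize] if method == 0 else None
    return name, payload


def sniff_mimetype(data):
    """Return the declared MIME type, or None if the archive has none up front."""
    entry = first_entry(data)
    if entry is None:
        return None
    name, payload = entry
    if name == "mimetype" and payload is not None:
        return payload.decode("ascii", "replace")
    return None


def make_sample(with_mimetype):
    """Build a tiny in-memory archive imitating a 1.1-style or 1.0-style file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        if with_mimetype:
            zf.writestr("mimetype", "application/vnd.sun.xml.writer")
        zf.writestr("content.xml", '<?xml version="1.0"?><doc/>')
    return buf.getvalue()


if __name__ == "__main__":
    print(sniff_mimetype(make_sample(True)))
    print(sniff_mimetype(make_sample(False)))
```

This only works when the mimetype member comes first and is stored
without compression, which is what the strings output above suggests for
1.1 files. For the 1.0-style layout there is nothing to find at the
front of the archive, so the sketch returns None.
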
That said, I _prefer_ file showing it as a compressed file. After
all, if I wanted to read the content, I would first need to uncompress
it, or use a utility that will decompress text on the fly (zless comes
to mind). Then there's also the fact that it's not just a single file
compressed in a zip archive. If you run unzip on an OO.org 1.1 file, it
extracts the following files:

mimetype                
content.xml             
styles.xml              
meta.xml                
settings.xml            
META-INF/manifest.xml

Running unzip on an OO.org 1.0 file returns much the same results,
_except_ it does not have the mimetype file in it.
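
The same comparison can be sketched in Python with the standard zipfile
module. This is only an illustration: describe and build_archive are
hypothetical helper names, and the two archives are built in memory from
the member list above with placeholder contents.

```python
import io
import zipfile

# Member names as listed by unzip for a 1.1-style document.
MEMBERS_1_1 = [
    "mimetype",
    "content.xml",
    "styles.xml",
    "meta.xml",
    "settings.xml",
    "META-INF/manifest.xml",
]


def build_archive(names):
    """Build an in-memory zip with placeholder contents for each name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            if name == "mimetype":
                zf.writestr(name, "application/vnd.sun.xml.writer")
            else:
                zf.writestr(name, '<?xml version="1.0"?><placeholder/>')
    return buf.getvalue()


def describe(data):
    """Return (member names, declared MIME type or None) for a zip archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
        mime = None
        if "mimetype" in names:
            mime = zf.read("mimetype").decode("ascii").strip()
    return names, mime


if __name__ == "__main__":
    new_style = build_archive(MEMBERS_1_1)
    old_style = build_archive(MEMBERS_1_1[1:])  # 1.0-style: no mimetype member
    print(describe(new_style))
    print(describe(old_style))
```
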

I suppose there might be room to expand the information file returns,
such as what type of file is inside the zip file, but if the zip file
has very many individual files in it, this could take forever, produce a
lot of output, etc. So, I think file accomplishes its goal; I run file
to know what tool(s) I need to work with that file. The first tool
needed is a decompression utility. Then I can run file again on the
individual .xml files to see that they are "XML 1.0 document text".
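
If someone did want to experiment with reporting what is inside the zip
without it taking forever, one option is to cap the work: look at only
the first few members and read only a few bytes of each. The Python
below is a rough sketch of that idea, not how file works; the name
summarize_members, the limit and peek parameters, and the crude
"starts with <?xml" test are all assumptions made for illustration.

```python
import io
import zipfile
from collections import Counter


def summarize_members(data, limit=16, peek=5):
    """Classify at most `limit` members by their first `peek` bytes.

    Returns (counts by kind, number of members that were not inspected).
    """
    kinds = Counter()
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        for info in infos[:limit]:
            # Read only the start of each member, never the whole file.
            with zf.open(info) as member:
                head = member.read(peek)
            kinds["XML" if head.startswith(b"<?xml") else "other"] += 1
    skipped = max(0, len(infos) - limit)
    return dict(kinds), skipped


def build_sample():
    """Small in-memory archive with one plain-text and three XML members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.sun.xml.writer")
        for name in ("content.xml", "styles.xml", "meta.xml"):
            zf.writestr(name, '<?xml version="1.0"?><placeholder/>')
    return buf.getvalue()


if __name__ == "__main__":
    print(summarize_members(build_sample()))
    print(summarize_members(build_sample(), limit=2))
```

The cap keeps the output and the run time bounded, at the cost of saying
nothing about the members that were skipped.
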

Jacob

-- 
GnuPG Key: 1024D/16377135

Random .signature #63:
Microsoft has combined the strengths of its three most powerful
operating systems to create its next generation operating system:
Windows CE+ME+NT

As hard as a rock and as dumb as a brick!
http://www.6texans.net/img/msc.jpg
