
Re: [RFD] Debian's mime support

Aaron Lehmann wrote:
> I am a huge fan of file(1), and I love the idea of determining a file
> type based on a magic number. However, I am not so happy about the
> idea of the computer second-guessing me from my file naming habits.


Nevertheless, filenames are an important aid in recognizing the types of
files; it's a convention of many OSs. What Microsoft did wrong was to enforce
that in an inflexible and counter-intuitive way. You get to see a lot of
meaningless filename extensions on MS OSs.

The nice thing about filenames is that they are cheap. You don't have
to run a scanner to deduce the file type. Just parse the name.

I once wrote up a design that had to deal with file types (in a build
system). The solution I found was similar: you can have multiple methods
to deduce file types. Since that was a build system there were other
methods which might be irrelevant here, but basically:

  * globs
  * regular expressions
  * file(1) like mechanism

can be used to deduce file type. These are to be tried from the cheapest
to the most expensive, so each heuristic has a cost associated with it.
In this case the costs are quite simple: globs 1, regexps 2 and file 3.
Whenever a cheap method fails to work correctly or reliably (assuming
that the program handling the file type returns an appropriate error
code), the next more expensive method would be employed. Of course you
can collapse globs and regexps into regexps only, so that we basically
have 2 methods.

Whatever ;)
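
A minimal sketch of that cheapest-first cascade, in Python. Everything
named here is made up for illustration: the pattern tables, the type
names, and the accept() callback, which stands in for "the program
handling the file type returns an appropriate error code".

```python
import fnmatch
import re

# Hypothetical magic-number table standing in for a file(1)-like scanner.
MAGIC = [(b"\x89PNG\r\n\x1a\n", "image/png"), (b"%PDF-", "application/pdf")]

def by_glob(name, data):
    """Cost 1: look only at the filename, using shell-style globs."""
    for pattern, ftype in [("*.png", "image/png"), ("*.pdf", "application/pdf")]:
        if fnmatch.fnmatch(name, pattern):
            return ftype
    return None

def by_regex(name, data):
    """Cost 2: filename conventions that a glob cannot express easily."""
    if re.search(r"(^|/)[Mm]akefile(\.[^/]+)?$", name):
        return "text/x-makefile"
    return None

def by_magic(name, data):
    """Cost 3: inspect the leading bytes of the content."""
    for prefix, ftype in MAGIC:
        if data.startswith(prefix):
            return ftype
    return None

# (cost, method) pairs with the costs from the text: globs 1, regexps 2, file 3.
METHODS = [(1, by_glob), (2, by_regex), (3, by_magic)]

def deduce(name, data, accept=lambda ftype: True):
    """Try methods from cheapest to most expensive.

    A guess is kept only if accept(ftype) is true; otherwise the next,
    more expensive method is tried.
    """
    for cost, method in sorted(METHODS, key=lambda m: m[0]):
        ftype = method(name, data)
        if ftype is not None and accept(ftype):
            return ftype, cost
    return None, None
```

For a file named report.png whose content is really a PDF, a handler
that rejects the "image/png" guess would push deduce() on to the magic
check, which costs more but looks at the content.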

Then, as in the gnome thingy, each file type can define its own deduction
methods and override these costs for itself (there might be no filename
convention for a lot of types). Generally, letting programmers define
multiple methods is a good thing. It would also be desirable to allow
adding new recognition methods to the system. [Though most of those
extensions could possibly be merged into the file program itself, which
is an excellent utility.]
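
One way this could look, as a rough Python sketch: each type carries its
own list of rules, a rule may override the default cost of its method,
and new recognition methods are plugged in through a small registry. The
table layout, the type names and the matcher() decorator are all
assumptions of mine, not anything an existing system defines.

```python
import re

# Default cost per method; "glob" is listed so a glob matcher can be added later.
DEFAULT_COST = {"glob": 1, "regex": 2, "magic": 3}

# Hypothetical per-type rule table. A rule may carry its own "cost".
TYPES = {
    "image/png": [
        {"method": "regex", "pattern": r"\.png$"},
        {"method": "magic", "prefix": b"\x89PNG"},
    ],
    "text/x-shellscript": [
        # No filename convention for this type, so its content check
        # is declared cheap by overriding the cost.
        {"method": "magic", "prefix": b"#!/bin/sh", "cost": 1},
    ],
}

MATCHERS = {}  # method name -> function(rule, filename, data) -> bool

def matcher(name):
    """Register a recognition method under the given name."""
    def register(func):
        MATCHERS[name] = func
        return func
    return register

@matcher("regex")
def match_regex(rule, filename, data):
    return re.search(rule["pattern"], filename) is not None

@matcher("magic")
def match_magic(rule, filename, data):
    return data.startswith(rule["prefix"])

def deduce(filename, data):
    """Evaluate all rules of all types in order of increasing cost."""
    rules = []
    for ftype, type_rules in TYPES.items():
        for rule in type_rules:
            cost = rule.get("cost", DEFAULT_COST[rule["method"]])
            rules.append((cost, ftype, rule))
    rules.sort(key=lambda r: (r[0], r[1]))
    for cost, ftype, rule in rules:
        if MATCHERS[rule["method"]](rule, filename, data):
            return ftype, cost
    return None, None
```

Adding a new method is then a matter of decorating one more function
with @matcher("...") and giving it an entry in DEFAULT_COST.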

Giving the user the capability to override these for himself would be
possible if the program that (in some way) processed these entries
were aware of the fact that users have home directories ;) All the
target environments (gnome, kde, emacs,...) would then use this program
to handle file types.
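
A small Python sketch of the home-directory idea, under heavy
assumptions: both file names (/etc/filetypes and ~/.filetypes) and the
"glob type" line format are invented here for illustration; no existing
program is known to me to use them. The only point is that the user's
entries are consulted before the system-wide ones.

```python
import fnmatch
import os

# Both paths are hypothetical; no configuration file is named in this thread.
SYSTEM_RULES = "/etc/filetypes"
USER_RULES = os.path.expanduser("~/.filetypes")

def load_rules(path):
    """Read 'glob type' lines; a missing file simply yields no rules."""
    rules = []
    try:
        with open(path) as fh:
            for line in fh:
                line = line.split("#", 1)[0].strip()  # drop comments
                if line:
                    pattern, ftype = line.split(None, 1)
                    rules.append((pattern, ftype))
    except FileNotFoundError:
        pass
    return rules

def lookup(filename, user_path=USER_RULES, system_path=SYSTEM_RULES):
    """Consult the user's rules first, then the system-wide ones."""
    for pattern, ftype in load_rules(user_path) + load_rules(system_path):
        if fnmatch.fnmatch(filename, pattern):
            return ftype
    return None
```

A desktop environment or editor would call lookup() (or run a program
wrapping it) instead of keeping its own table.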

Eray (exa) Ozkural
Comp. Sci. Dept., Bilkent University, Ankara
e-mail: erayo@cs.bilkent.edu.tr
www: http://www.cs.bilkent.edu.tr/~erayo
