
Re: getting dpkg-shlibdeps to work on the Hurd



On Sat, May 18, 2002 at 10:08:41PM -0500, Adam Heath wrote:
> > Mmmh, well they should, it's required by POSIX[1].  From looking at the code,
> > Linux 2.4 behaves a lot better on this than previous versions, and the Hurd
> > got it right forever (do I dare to add "of course"? No, I don't ;).
> > I don't know about BSD, though (or other operating systems).
> > It's a pity that st_dev is so underrated, it is very useful for these
> > things.
> 
> DOS filesystems.

There is no reason not to give every file in a DOS filesystem a unique inode
number that doesn't change while the filesystem is mounted.  There is
absolutely no requirement to keep the number the same across mounts.  That's
what I did when implementing FAT filesystem support for the Hurd.

This problem is the same for isofs, for example.  There the inode number
used in the Linux 2.4 kernels is taken from the directory offset on disk,
which btw. breaks hard links (two hard links to the same file have different
inode numbers).  With just a little more work, you can get unique and
POSIXly correct inode numbers here, too, though.

In general, if the filesystem doesn't have inodes, or doesn't lend itself to
some inodeish number derived from other metadata (like directory offsets), just
give some random number to each node, and don't change it while the filesystem
is live.
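
A minimal sketch of that idea, and not the actual Hurd fatfs code: the table
below hands out numbers from a counter instead of picking random ones, which
makes uniqueness trivial.  The names ino_table and ino_lookup, and the
unsigned long key that identifies a node (say, the on-disk position of its
directory entry), are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical per-mount table: maps a filesystem-specific node key to an
   inode number that stays fixed for as long as the filesystem is mounted.  */
struct ino_entry
{
  unsigned long key;
  ino_t ino;
  struct ino_entry *next;
};

#define INO_BUCKETS 1024

struct ino_table
{
  struct ino_entry *bucket[INO_BUCKETS];
  ino_t last_ino;		/* Last number handed out; 0 means none yet.  */
};

/* Return the inode number for KEY, allocating a fresh one on first use.
   Returns 0 if memory is exhausted.  */
ino_t
ino_lookup (struct ino_table *t, unsigned long key)
{
  struct ino_entry **slot;
  struct ino_entry *e;

  assert (t != NULL);
  slot = &t->bucket[key % INO_BUCKETS];

  /* Seen before: hand back the same number.  */
  for (e = *slot; e; e = e->next)
    if (e->key == key)
      return e->ino;

  /* First lookup of this node: assign the next unused number.  */
  e = malloc (sizeof *e);
  if (!e)
    return 0;
  e->key = key;
  e->ino = ++t->last_ino;
  e->next = *slot;
  *slot = e;
  return e->ino;
}
```

The table starts out zero-initialized and is simply thrown away at unmount,
so nothing has to be stored on disk.  The key has to identify the file rather
than one of its names, otherwise hard links end up with different numbers
again.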
 
> Also, why is using realpath so bad?  Do we have to use the fastest code?
> realpath() is so simple, that to complicate the code with other forms seems to
> not be worth it.

I guess it is not so bad, but on GNU/Linux stat was still twice as fast as file
name canonicalization.  I have not tried this on the Hurd yet, but it might
very well be similar.  So there is a factor of two between the two.  However,
if the file system is not in the cache, the bottleneck is definitely
disk I/O, and the difference between stat and canonicalization
contributes only about 8% to the total running time.  That might still be
worth the effort, if only for the worst case (looking up any */README.gz file,
for example).

Well, from a programming point of view, comparing the stat fields is much
nicer than all the string handling, and the realpath() interface is
horribly broken[1].  We need to use the canonicalize_file_name() interface on
the Hurd (because we don't have a PATH_MAX limit, so any use of realpath is
a potential buffer overflow), so that's at least two cases to handle anyway.
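
To show what comparing the stat fields amounts to, here is a small sketch.
The helper name same_file and its return convention are made up for this
example; the attached test program does the same comparison inline.

```c
#include <assert.h>
#include <sys/stat.h>

/* Return 1 if paths A and B name the same file, 0 if they do not, and -1
   if either one cannot be stat'ed.  Symbolic links are followed.  */
int
same_file (const char *a, const char *b)
{
  struct stat sa, sb;

  assert (a && b);
  if (stat (a, &sa) < 0 || stat (b, &sb) < 0)
    return -1;

  /* Inode numbers are only unique per device, so compare both fields.  */
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

No buffers and no string handling are involved, so there is nothing that
depends on PATH_MAX.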

[1] We are working on fixing this.  Jeroen filed an objection with the
Austin group, and a transition path to a saner realpath interface is
outlined.
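
For illustration only: later revisions of POSIX (POSIX.1-2008) specify that
realpath() allocates the result itself when the second argument is a null
pointer, and glibc supports that form.  Assuming such a realpath, a
comparison without any PATH_MAX buffer might look like the sketch below; the
helper name canon_equals is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if PATH canonicalizes to CANON, 0 if it does not, and -1 if
   PATH cannot be resolved.  Assumes realpath (path, NULL) returns a
   malloc'ed string, as POSIX.1-2008 specifies.  */
int
canon_equals (const char *path, const char *canon)
{
  char *resolved;
  int equal;

  assert (path && canon);
  resolved = realpath (path, NULL);
  if (!resolved)
    return -1;
  equal = strcmp (resolved, canon) == 0;
  free (resolved);
  return equal;
}
```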

The test program is attached if you want to try it out on your system;
maybe having lots of symlinks makes a difference:

ulysses:/tmp# time find /gnu/usr/share/doc | ./ftest /gnu/share/doc/zlib1g/copyright s[tat]
Found /gnu/usr/share/doc/zlib1g/copyright
...

ulysses:/tmp# time find /gnu/usr/share/doc | ./ftest /gnu/share/doc/zlib1g/copyright r[ealpath]
ulysses:/tmp# time find /gnu/usr/share/doc | ./ftest /gnu/share/doc/zlib1g/copyright c[anonicalize_file_name]

Of course, disk cache makes a huge difference.

Thanks,
Marcus

-- 
`Rhubarb is no Egyptian god.' Debian http://www.debian.org brinkmd@debian.org
Marcus Brinkmann              GNU    http://www.gnu.org    marcus@gnu.org
Marcus.Brinkmann@ruhr-uni-bochum.de
http://www.marcus-brinkmann.de
#define _GNU_SOURCE 1
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <limits.h>

int
main (int argc, char *argv[])
{
  char *search;
  struct stat searchstat;
  char *line = NULL;
  size_t linelen = 0;

  if (argc != 3)
    {
      fprintf (stderr, "Usage: %s FILE [r|c|s]\n", argv[0]);
      exit (1);
    }
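  /* Bail out early if the file to search for cannot be resolved.  */
  search = canonicalize_file_name (argv[1]);
  if (!search || stat (search, &searchstat) < 0)
    {
      perror (argv[1]);
      exit (1);
    }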

  while (getline (&line, &linelen, stdin) >= 0)
  {
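    /* Strip the trailing newline, if present.  */
    size_t len = strlen (line);
    if (len > 0 && line[len - 1] == '\n')
      line[len - 1] = '\0';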
    switch (argv[2][0])
      {
      case 'r':
#ifdef PATH_MAX
	{
	  char rpath[PATH_MAX];
	  if (realpath (line, rpath))
	    if (!strcmp (search, rpath))
	      printf ("Found %s\n", line);
	}
#else
	fprintf (stderr, "Option %s not supported on this system.\n", argv[2]);
	exit (1);
#endif
	break;
      case 'c':
	{
	  char *rpath = canonicalize_file_name (line);
	  if (rpath)
	    {
	      if (!strcmp (search, rpath))
		printf ("Found %s\n", line);
	      free (rpath);
	    }
	}
	break;
      case 's':
	{
	  struct stat statbuf;
	  if (stat (line, &statbuf) >= 0)
	    {
	      if (searchstat.st_ino == statbuf.st_ino
		  && searchstat.st_dev == statbuf.st_dev)
		printf ("Found %s\n", line);
	    }
	}
	break;
      default:
	fprintf (stderr, "Unrecognized option %s\n", argv[2]);
	exit (1);
      }
  }
  exit (0);
}
