Re: A success story with apt and rsync
On Sun, Jul 06, 2003 at 11:36:34PM +0100, Andrew Suffield wrote:
>
> I can only presume this is new or obscure, since everything I tried
> had the traditional behaviour. Can't see how to turn it on, either.
>
It's new for 2.5. Backports to 2.4 are available here:
http://thunk.org/tytso/linux/extfs-2.4-update/extfs-update-2.4.21
For those who are interested, the broken out patches can be found here:
http://thunk.org/tytso/linux/extfs-2.4-update/broken-out-2.4.21/to-apply
Once you have a htree-enabled kernel, you enable a filesystem to use
the feature by using the following command:
tune2fs -O dir_index /dev/hdXX
Optionally, you can reorganize all of the directories to use htrees by
using the command "e2fsck -fD /dev/hdXX". Otherwise, only directories
that are expanded beyond a single block after you set the dir_index
flag will use htrees. The dir_index feature is a fully compatible
extension, so it's perfectly safe to mount a filesystem with htrees on
a non-htree kernel. A non-htree kernel will just ignore the htree
information, and if it attempts to modify a hash-tree directory, it
will just invalidate the htree interior node information, so that the
directory becomes unindexed until e2fsck -fD is run over the
filesystem, which optimizes all of the directories by reindexing
them.
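
To check whether the flag actually got set, you can look at the
"Filesystem features" line printed by "tune2fs -l". A minimal Python
sketch of that check follows; the helper names parse_features and
has_dir_index are made up for illustration, and it assumes tune2fs is
on the PATH and that you have permission to read the device.

```python
import subprocess


def parse_features(tune2fs_listing):
    """Return the feature names from the text printed by `tune2fs -l`."""
    for line in tune2fs_listing.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return value.split()
    # No features line found: treat the listing as having no features.
    return []


def has_dir_index(device):
    """Run `tune2fs -l <device>` and report whether dir_index is listed.

    Hypothetical helper: reading the device usually requires root.
    """
    listing = subprocess.run(
        ["tune2fs", "-l", device],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return "dir_index" in parse_features(listing)
```

Calling has_dir_index("/dev/hdXX") with your real device name should
return True once the tune2fs -O dir_index command has been run.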
Why would you want to use htrees? Because they speed up large
directories. A lot. Try creating 400,000 zero-length files in a
single directory. It will take under 30 seconds with htree enabled,
and well over an hour without.
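
If you want to try this yourself, something along these lines should
do. It is only a rough sketch: the function name time_file_creation
and the default count are my own choices for illustration, and the
timings you see will depend on your hardware and kernel. Each create
has to search the directory for an existing entry with that name,
which is a linear scan when the directory has no index.

```python
import os
import tempfile
import time


def time_file_creation(directory, count):
    """Create `count` zero-length files in `directory`; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:07d}")
        # O_EXCL makes the create fail if the name already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.close(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Small count so the sketch finishes quickly; raise it to 400_000
    # to repeat the experiment described above.
    count = 10_000
    # The temporary directory is created under the current directory,
    # so run this from the filesystem you want to measure.
    with tempfile.TemporaryDirectory(dir=".") as scratch:
        elapsed = time_file_creation(scratch, count)
        print(f"created {count} files in {elapsed:.2f} seconds")
```

Run it once on a filesystem with dir_index set and once on one
without, and compare the two numbers.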
> > The good news is that this particular optimization of sorting by inode
> > number should work for all filesystems, and should speed up xfs as
> > well as ext2/3 with HTREE.
>
> What about ext[23] without htree? Mucking with the order returned by
> readdir() has historically caused problems there...
It'll be fine; in fact, in some cases you'll see a slight speed up.
The key is that you'll get the best performance by reading/modifying
the inode data structures in sorted order by inode number. This way,
you make a single sweep through the inode table, without needing any
extraneous seeks. Using the natural sort order of readdir() on
non-htree ext2/3 systems mostly approximated this --- although if
files are deleted and created from the directory, this is not
guaranteed. So sorting by inode number will never hurt, and may help.
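
To make the idea concrete, here is a minimal Python sketch of
"read the directory, sort by inode number, then touch the inodes in
that order". The function names are invented for illustration, and
this is not how any particular tool does it; it only shows the
ordering. os.scandir() exposes the inode number from the directory
entry, so the sort itself does not need to stat anything.

```python
import os


def entries_by_inode(directory):
    """Return the names in `directory`, sorted by inode number."""
    with os.scandir(directory) as it:
        pairs = [(entry.inode(), entry.name) for entry in it]
    # Sort on inode number first; the name only breaks ties (hard links).
    pairs.sort()
    return [name for _, name in pairs]


def stat_in_inode_order(directory):
    """lstat every entry, visiting the inodes in increasing order."""
    results = []
    for name in entries_by_inode(directory):
        results.append((name, os.lstat(os.path.join(directory, name))))
    return results
```

The same ordering applies to any per-file operation that reads or
modifies the inode, such as stat, unlink, or open.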
- Ted