
Re: A success story with apt and rsync

On Sun, 6 Jul 2003, Andrew Suffield wrote:

> It should put them in the package in the order they came from
> readdir(), which will depend on the filesystem. This is normally the
> order in which they were created, and should not vary when
> rebuilding. As such, sorting the list probably doesn't change the
> network traffic, but will slow dpkg-deb down on packages with large
> directories in them.

Yes, when I said "random order" I obviously meant "in the order readdir()
returns them". It's random for me.  :-)))

It can easily differ between filesystems, or even between filesystems of
the same type with different parameters (e.g. block size).

I even think it can be different after a simple rebuild in exactly the
same environment. For example, configure and libtool like to create files
with the PID in their name, which can take from 3 to 5 digits. If you
create file X and then Y, remove X and then create Z, then most likely Z
reuses the directory slot freed by X and is returned first by readdir()
if its name is no longer than X's, while if its name is longer, Y is
returned first and Z afterwards. So I can imagine situations where the
order of the files depends on the PIDs of the build processes.
However, I guess our goal is not only to produce similar packages from
exactly the same source. It's quite important to produce similar packages
even after a version upgrade. For example, you have a foobar-0.9 package
and now upgrade to foobar-1.0. The author may have completely rewritten
the Makefile, which yields nearly the same executable and the same data
files, but a completely different "random" order.

However, I think sorting the files costs practically nothing. My system is
not a very new one (375MHz Celeron, IDE disks, 384MB RAM etc.), and yet:

/usr/lib$ du -s .
1,1G    .
/usr/lib$ find . -type f | wc -l  # okay, it's now in memory cache
/usr/lib$ time find . >/dev/null 2>&1

real    0m0.285s
user    0m0.100s
sys     0m0.150s
egmont@boci:/usr/lib$ time sortdir find . >/dev/null 2>&1

real    0m1.683s
user    0m1.390s
sys     0m0.250s

IMHO a step which takes one and a half seconds before compressing 18000
files totalling more than 1 gigabyte shouldn't be a problem.
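
For anyone who wants to repeat the comparison without the sortdir wrapper, a
measurement along these lines should do. This is a hedged sketch, not the
tool used above: count_entries is a made-up name, and the numbers will of
course differ from machine to machine.

```python
import os
import time

def count_entries(root, sort_names):
    """Walk root; return (number of entries seen, elapsed seconds)."""
    start = time.perf_counter()
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if sort_names:
            # The extra work under test: sort every directory listing.
            dirnames.sort()
            filenames.sort()
        count += len(dirnames) + len(filenames)
    return count, time.perf_counter() - start
```

Calling count_entries("/usr/lib", False) and then count_entries("/usr/lib",
True) once the tree is in the cache gives the two timings to compare.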

