
Re: IDEA to SERIOUSLY reduce download times!



On Thu, Jul 08, 1999 at 04:12:05PM +0200, Ingo Saitz was heard to say:
> MoiN
> 
> >   Yes, the configuration files are put directly into shipping.  I overlooked
> > file permissions, thanks for pointing that out :-)  I may have to switch to
> > a real programming language for this, unless I can find a program to copy
> > one file's permissions to another.
> 
> It's near: man chmod
> 
> CHMOD(1)                       FSF                       CHMOD(1)
> [...]
> OPTIONS
> [...]
>        --reference=RFILE
>               use RFILE's mode instead of MODE values
> [...]
> 

  Heh, I looked in chmod's documentation and somehow missed this.  Thanks! :-)
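
  For the archives, a minimal sketch of how that option could be used.  It
assumes GNU chmod (--reference is not in every chmod), and the copy_mode
helper name is made up for illustration:

```shell
# copy_mode SRC DST: give DST the same permission bits as SRC.
# Relies on the GNU-specific --reference option quoted above.
copy_mode() {
    chmod --reference="$1" "$2"
}

# Demonstration on two scratch files (names are illustrative only).
src=$(mktemp) dst=$(mktemp)
chmod 640 "$src"
copy_mode "$src" "$dst"
mode=$(stat -c %a "$dst")   # GNU stat: print the mode in octal
echo "mode of copy: $mode"
rm -f "$src" "$dst"
```

  Ownership is not touched by this; only the mode is copied.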

> >   I'm also trying to come up with a clever way of detecting when files simply
> > moved [to avoid including the whole file in the patch].  No luck so far :)
> 
> No luck with checksumming? Assuming you have the contents of both packages
> in ./old and ./new, do a "md5sum `find . -type f` | sort" and compare the
> first field (the checksum) of adjacent lines. If it is equal, look at the
> second field (the file name) to see whether one file is below ./old and
> the other below ./new. If so, you might want to double check with
> cmp(1) that they _are_ equal. I did that yesterday in one line of ~200
> characters (without the cmp and for one directory only), from memory:
> 
> find . -type f -exec md5sum {} + | sort | (
>    # After sorting, files with identical contents sit on adjacent lines.
>    osum=""; ofile="";
>    while read -r sum file;
>    do
>      if [ "$osum" = "$sum" ]; then
>        echo "Files are equal: $ofile $file";
>      fi;
>      osum="$sum"; ofile="$file";
>    done;
> )

  Yes... the main thing I was concerned about was the time this takes (hence
my comment about finding a 'clever' way, at least more clever than the
O(n^2) approach of comparing everything to everything else), and I hadn't
thought about it for more than a few seconds. :) Sorting is of course the right
way to do it.  We also need to work out where to store the list of moved
files in the patch.  I suppose the obvious thing is to shove something into
control.tar.gz, but that's probably not so good; maybe just append a text
file to the main archive, with lines of the format:
  [old file location] [new file location]
  It would also be nice to find 'similar' files that are in different places
(eg, if a library is recompiled with a minor change *and* moved around) --
can md5sums be used for this, or is there a simple way to do it?  (aside
from xdelta'ing all files that have similar sizes with one another :) )
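
  To make the moved-file list concrete, here is a rough sketch of how it
could be generated with the sort-by-checksum idea.  Everything in it is an
assumption on my part: the list_moved name, the use of a tab between the two
locations (so paths containing spaces survive), and the restriction to file
names without newlines or backslashes.  It only finds byte-identical files,
not 'similar' ones.

```shell
# list_moved OLD_DIR NEW_DIR
# Print "old-path<TAB>new-path" for every file in NEW_DIR whose contents
# match a file in OLD_DIR that lived at a different relative path.
list_moved() {
    old_dir=$1 new_dir=$2
    tab=$(printf '\t')
    {
        # Tag each checksum line with the tree it came from; old tree first.
        (cd "$old_dir" && find . -type f -exec md5sum {} + | sed 's/^/old /')
        (cd "$new_dir" && find . -type f -exec md5sum {} + | sed 's/^/new /')
    } | awk '
        {
            tag = $1; sum = $2
            # The path starts after "tag", one space, the sum and two more chars.
            path = substr($0, length(tag) + length(sum) + 4)
            sub(/^\.\//, "", path)
        }
        tag == "old" {
            seen[sum, path] = 1
            if (!(sum in first)) first[sum] = path
            next
        }
        (sum, path) in seen { next }   # same contents, same place: not moved
        sum in first { printf "%s\t%s\n", first[sum], path }
    ' | while IFS=$tab read -r from to; do
        # Double check with cmp(1), as suggested, before trusting the checksum.
        if cmp -s "$old_dir/$from" "$new_dir/$to"; then
            printf '%s\t%s\n' "$from" "$to"
        fi
    done
}
```

  Running "list_moved ./old ./new" on two unpacked package trees would then
print one line per moved (or copied) file.  This is one pass over each tree
plus the awk lookup, so nothing like O(n^2).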

> I have one more consideration about the space the patches would occupy. If
> you are updating your distribution on a regular basis you don't get into
> problems. But if you miss some updates, the servers have to keep all
> patches online and you have to figure out if you should download the
> patches or the full archive.

  Actually, Fabien had an excellent suggestion about how to deal with this.

  Instead of storing patches for all versions, choose one 'canonical' version
and store a single patch against it.  When a patch is applied on the user's
computer, a reverse patch (to restore the canonical version) will also be
generated and stored in a location to be determined :-)  This way, upgrading
to the most recent version requires downloading either one patch (if the
canonical version is unchanged), or a .deb file and a patch, or simply a .deb
file (depending on whether the most recent version is stored both as a patch
and as a .deb), and only one patch need be stored on the server at a time.
This is similar to the way source packages are handled, with .orig.tar.gz
files.

  In other words, say I have a package foo, version 1.0-1.  It would be
uploaded as foo_1.0-1_arch.deb.  When I change the AUTHORS.gz file (to
fix a typo in my name :) ), I upload a 3KB patch, foo_1.0-2_arch.deb-diff, and
perhaps a separate patched deb.  (Actually, several possibilities have been
brought up, including having dinstall automatically generate the patch, but
I'll use this for the sake of simplicity :) )  To upgrade, users can download
the patch.  If I make another minor change, foo_1.0-3_arch.deb-diff will be
uploaded, *replacing* the foo_1.0-2_arch.deb-diff file.  Users can download
the new patch to upgrade, but those already at 1.0-2 will first have to apply
their stored reverse patch from 1.0-2 to 1.0-1 to the temporary tree (none of
this patching happens in the live filesystem, it uses a directory in /tmp)
-- apt should do this automatically.

  On the other hand, foo version 1.1-1 will be uploaded only as a .deb and
become the new 'canonical' version.
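
  As a rough sketch of the decision apt would have to make under this scheme
(nothing here is an existing apt interface; plan_upgrade and its four
arguments are names I made up, and I'm assuming the server keeps only the
canonical .deb plus one patch against it):

```shell
# plan_upgrade INSTALLED BASE CANONICAL LATEST
#   INSTALLED  version currently on the user's machine
#   BASE       canonical version that the stored reverse patch restores
#              (equal to INSTALLED if it was installed from a full .deb)
#   CANONICAL  version the server currently ships as a full .deb
#   LATEST     newest version (CANONICAL plus the one patch on the server)
# Prints the steps needed, one per line.
plan_upgrade() {
    installed=$1 base=$2 canonical=$3 latest=$4

    if [ "$installed" = "$latest" ]; then
        echo "nothing to do"
        return 0
    fi

    if [ "$base" = "$canonical" ]; then
        # The canonical version can be rebuilt locally in the temporary tree.
        if [ "$installed" != "$canonical" ]; then
            echo "apply stored reverse patch $installed -> $canonical"
        fi
    else
        # The canonical version changed on the server: fetch the full package.
        echo "download .deb $canonical"
    fi

    if [ "$latest" != "$canonical" ]; then
        echo "download and apply patch $canonical -> $latest"
    fi
}

# The foo example: user at 1.0-2, server has 1.0-1 plus a patch to 1.0-3.
plan_upgrade 1.0-2 1.0-1 1.0-1 1.0-3
```

  For the foo example this prints the reverse-patch step followed by the
download of the 1.0-1 -> 1.0-3 patch; a user still at 1.0-3 when 1.1-1
arrives would just be told to download the new .deb.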

>     Ingo
> --
>   c.   Stimm gegen SPAM! Vote against SPAM! Votez contre le SPAM!
>  (`)              Vota contro lo SPAM! Stem tegen SPAM!
>  _<                http://www.politik-digital.de/spam/
> Spam, Spam, Spam, Spam, Spam, Spam, Spam, Spam, Spam, Spam, Spam, Spam,...
> 					-- Monty Pythons Flying Circus

  :-)

  Daniel

-- 
  After the game, the king and the pawn go in the same box.
    -- Italian proverb

