
Re: VistA packaging : vista-foia : First review



Update on the exploration:
=====================


I have been down to the Dark Forest of Perl, and back...    :-)



After trying many combinations of 

             git-import-orig     --pristine-tar
             git-buildpackage    --pristine-tar


I decided to go down to the source code of git-buildpackage
and trace the process.


So, I cloned:

git://honk.sigxcpu.org/git/git-buildpackage.git 

and followed the Python routines, which led me to intercept

             /usr/bin/xdelta

and 

           /usr/bin/pristine-tar




xdelta
---------

This is the application that produces a "diff", as a binary patch,
between two (binary) files, and can later apply that patch to one
file to reproduce the other.



pristine-tar
---------------

It is written in Perl, and does all the work of generating a tar
(uncompressed) file from the working tree of the git repository,
as well as decompressing the tar.gz file that is passed as an argument
(via git-buildpackage or git-import-orig).

The generated tar file is produced by

                             /usr/bin/pristine-tar

in lines:

373 sub recreatetarball_helper {
374         my %options=@_;
375         my $tempdir=$recreatetarball_tempdir;
376 
377         my $ret="$tempdir/recreatetarball";
378         system("cp -r $tempdir /tmp/pristinetempdir_justbeforetar");
379         my @cmd=($tar_program, "cf", $ret, "--owner", 0, "--group", 0,
380                         "--numeric-owner", "-C", "$tempdir/workdir",
381                         "--no-recursion", "--mode", "0644",
382             "--files-from", "$tempdir/manifest");
383         if (exists $options{tar_format}) {
384                 push @cmd, ("-H", $options{tar_format});
385         }
386 
387         doit(@cmd);
388 
389         return $ret;
390 }


After intercepting the files with the inserted line 378,
I got a look at the temporary working directory that
is used to generate the tar file.

In that temporary directory I manually recreated the
tar file with the same command:

tar cf luisrecreated.tar --owner 0 --group 0  --numeric-owner  -C pristinetempdir_justbeforetar/workdir/  --no-recursion  --mode 0644  --files-from pristinetempdir_justbeforetar/manifest 


and then again without --mode 0644,
since that mode strips the execute (search) bit from directories
and so breaks the permissions for navigating down into them.

tar cf luisrecreated2.tar --owner 0 --group 0  --numeric-owner  -C pristinetempdir_justbeforetar/workdir/  --no-recursion    --files-from pristinetempdir_justbeforetar/manifest 
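
The two manual invocations above differ only in the --mode option. As a minimal sketch (the function name build_tar_command and its parameters are made up for illustration and are not part of pristine-tar), the same argument list can be built in Python so that both variants come from one place:

```python
def build_tar_command(tempdir, output, mode=None, tar_format=None, tar_program="tar"):
    """Build a tar argument list mirroring the options quoted from pristine-tar.

    Hypothetical helper for illustration only; it does not run anything.
    """
    cmd = [tar_program, "cf", output,
           "--owner", "0", "--group", "0", "--numeric-owner",
           "-C", f"{tempdir}/workdir",
           "--no-recursion"]
    if mode is not None:
        # pristine-tar passes "--mode 0644"; omit it to keep directory permissions
        cmd += ["--mode", mode]
    cmd += ["--files-from", f"{tempdir}/manifest"]
    if tar_format is not None:
        cmd += ["-H", tar_format]
    return cmd

# First experiment: same options as pristine-tar, including --mode 0644
with_mode = build_tar_command("pristinetempdir_justbeforetar",
                              "luisrecreated.tar", mode="0644")
# Second experiment: identical, but without --mode
without_mode = build_tar_command("pristinetempdir_justbeforetar",
                                 "luisrecreated2.tar")
print(" ".join(with_mode))
print(" ".join(without_mode))
```

Either list could then be handed to subprocess.run(cmd, check=True) from the directory holding the copied tempdir.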


I compared those generated tar files to the original
with xdelta, and in every case ended up with a delta file that
was as large as the original .tar file.

Until this last ominous experiment:

Doing the delta of the tar file with itself:

xdelta delta -0 --pristine luisrecreated2.tar luisrecreated2.tar luis2tarballdelta

and getting the surprise of finding the delta size to be:

                        2938112155    luis2tarballdelta

for a tar file of size

                         2938112000    luisrecreated2.tar

when comparing the file to itself....

an operation that I would have expected 
to return an almost empty delta file...



Now... using the xdelta info option, we look into this
delta file, and get:

xdelta info luis2tarballdelta 
xdelta: version 1.1.3 found patch version 1.1 in luis2tarballdelta
xdelta: output name:   luisrecreated2.tar
xdelta: output length: -1356855296
xdelta: output md5:    a9377f5d550845baaffd0ac3978aded5
xdelta: patch from segments: 1
xdelta: MD5 Length Copies Used Seq? Name
xdelta: a9377f5d550845baaffd0ac3978aded5 -1356855296 1 -1356855296 yes (patch data)
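
The "output md5" reported by xdelta can be checked independently of xdelta itself. A minimal sketch, assuming the tar file is still on disk under the name used above (the helper name md5_of_file is only illustrative); it reads the file in chunks so that a tar file of nearly 3 GB does not have to fit in memory:

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks by default."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage (path is an assumption, taken from the experiment above):
#   md5_of_file("luisrecreated2.tar")
# and compare the result against the "output md5" line printed by xdelta info.
```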



A negative number in the "length" field
raised eyebrows...


The decimal value in the delta summary:

−1356855296

corresponds to hex

−50DFF800

which in 32-bit two's complement (invert the bits, then add one) is

  AF200800

which is hex for the decimal value:

   2938112000

That is exactly the length of our initial beloved tar file.
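
The conversion can be checked mechanically. A minimal sketch (the helper name as_unsigned_32 is only illustrative) that reinterprets the negative length reported by xdelta info as an unsigned 32-bit value:

```python
def as_unsigned_32(value):
    """Reinterpret a signed 32-bit integer as its unsigned 32-bit bit pattern."""
    return value & 0xFFFFFFFF

reported = -1356855296            # "output length" printed by xdelta info
actual = as_unsigned_32(reported)

print(actual)       # 2938112000
print(hex(actual))  # 0xaf200800
```

Masking with 0xFFFFFFFF is the same as adding 2^32 to a negative 32-bit value, and the result is the size of luisrecreated2.tar in bytes.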



Ok,... 
a lot of drama to say that:


It looks like "xdelta" has a 32-bit truncation bug,
and can't deal with a tar file of the size that we
are presenting to it.


Our tar file is larger than 2^31 bytes (2147483648), and if xdelta
uses 32-bit signed integers to represent the file
length (and it looks like it does), the value wraps
around into a negative number.
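
The wrap-around can be reproduced in the forward direction as well. This is a sketch of what storing the size in a signed 32-bit field would do, under the assumption (not yet confirmed from the xdelta source) that xdelta 1.1.3 keeps lengths that way; the name store_as_int32 is only illustrative:

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold

def store_as_int32(size):
    """Simulate writing a size into a signed 32-bit field and reading it back."""
    # Keep only the low 32 bits, then reinterpret those 4 bytes as signed
    return struct.unpack("<i", struct.pack("<I", size & 0xFFFFFFFF))[0]

tar_size = 2938112000                # size of luisrecreated2.tar in bytes
print(tar_size > INT32_MAX)          # True
print(store_as_int32(tar_size))      # -1356855296
```

Any file between 2^31 and 2^32 - 1 bytes would come out negative in the same way, while smaller files are unaffected.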



I'm now looking at the source code of xdelta.


Is this the right home for the project?


https://code.google.com/p/xdelta/


The version of xdelta in sid is 1.1.3,
while upstream seems to be at version 3,

and as features, it advertises:


"Xdelta release version 3 supports VCDIFF encoding and decoding. 
  Supports compressing 64 bit files on Windows, Linux, etc."




  I'd appreciate any hints.


        Regards,


              Luis

