Re: GIT for pdiff generation
>> As we are no git gurus ourselves: Does anyone out there see any trouble
>> doing it this way? It means storing something around 7GB of
>> uncompressed text files in git, plus the daily changes happening to
>> them, then diffing them in the way described above, however the
>> archive will only need to go back for a couple of weeks and therefore
>> we should be able to apply git gc --prune (coupled with whatever way
>> to actually tell git that everything before $DATE can be removed) to
>> keep the size down.
> AFAIK, there can be trouble. It all depends on how you're structuring
> the data in git, and the size of the largest data object you will want
> to commit to the repository.
Right now the source contents of unstable is 220MB unpacked. (Gzipped
it's 28MB, while the binary contents per architecture are 18MB each,
packed.)
Let's add a safety margin: 350MB is a good guess for the largest file.
A Packages file hardly counts compared to them; unpacked it's just
some 34MB.
> There is an alternative: git can rewrite the entire history
> (invalidating all commit IDs from the start pointing up to all the
> branch heads in the process). You can use that facility to drop old
> commits. Given the intended use, where you don't seem to need the
> commit ids to be constant across runs and you will rewrite the history
> of the entire repo at once and drop everything that was not rewritten,
> this is likely the less ugly way of doing what you want. Refer to git
> filter-branch.
It's the one and only case I have ever seen where "history rewrite" is
actually something one wants to do.
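
A rough, untested sketch of what such a rewrite could look like. It only
builds the list of git commands (a dry-run plan) and does not run them.
The function name, the "master" default and the choice to cut history
with "git replace --graft" before running filter-branch are my
assumptions, not something settled in this thread. The cutoff commit
would have to be looked up first, for example with
"git rev-list -n 1 --before=$DATE master".

```python
def plan_history_truncation(cutoff_commit, branch="master"):
    """Return git commands (argv lists) that would make cutoff_commit
    the new root of `branch` and then discard everything older.

    Hypothetical helper: nothing is executed here, the caller decides
    whether and how to run the plan.
    """
    return [
        # Pretend cutoff_commit has no parents (a graft with zero parents).
        ["git", "replace", "--graft", cutoff_commit],
        # filter-branch honors replace refs and makes the graft permanent,
        # which changes every commit id from the cutoff up to the branch head.
        ["git", "filter-branch", "--", branch],
        # filter-branch keeps a backup of the old head under refs/original/.
        ["git", "update-ref", "-d", "refs/original/refs/heads/" + branch],
        # Drop reflog entries that still reference the old history.
        ["git", "reflog", "expire", "--expire=now", "--all"],
        # Finally remove the now unreachable objects.
        ["git", "gc", "--prune=now"],
    ]


if __name__ == "__main__":
    for argv in plan_history_truncation("<cutoff-commit>"):
        print(" ".join(argv))
```

Keeping it as a plan makes it easy to review the commands before letting
them loose on the real repository.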
> Other than that, git loads entire objects to memory to manipulate them,
> which AFAIK CAN cause problems in datasets with very large files (the
> problem is not usually the size of the repository, but rather the size
> of the largest object). You probably want to test your use case with
> several worst-case files AND a large safety margin to ensure it won't
> break on us anytime soon, using something to track git memory usage.
Well, yes.
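
For the "something to track git memory usage" part, a minimal sketch,
assuming a Unix system with Python available. The helper name is made up.
It runs one command and reports the peak resident set size the kernel
recorded for the waited-for children of this process (kilobytes on
Linux, bytes on macOS), so it should be called from a fresh process for
each measurement.

```python
import resource
import subprocess
import sys


def peak_rss_of(cmd):
    """Run cmd (an argv list) and return (exit_code, peak_rss).

    peak_rss is ru_maxrss for RUSAGE_CHILDREN: the largest resident set
    size among the child processes this process has waited for so far.
    """
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Example: python3 peakmem.py git add some-large-file
    code, peak = peak_rss_of(sys.argv[1:])
    print("exit code:", code, "peak RSS:", peak)
```

Running it against "git add", "git commit", "git diff" and "git gc" with
a 350MB test file should show how close we get to the memory of the
machine.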
--
bye, Joerg
Some NM:
> FTBFS=Fails to Build from Start
Err, yes? How do you start in the middle?