
Re: Re: Re: Big data is needed for unit test



Hi

> Can you link to the file we are talking about?
With the permission of the project maintainers, I published the file here [2]
[2]http://goo.gl/53sAzM

this looks a bit weird. I guess this google thing allows you to inspect the
content of zip files?

Yes indeed. I simply uploaded it on Google Drive.

The 178 MiB file in question named md_1.jsonz is not a gzipped json file as the
name suggests but a zip archive which contains a 1.4 MiB json file with
metadata and 470 MiB of what seems to be binary data.

If you want to add more complexity to compress this further and save space, you
could add the md_1.jsonz file to a Debian source package in its *unzipped*
version.  The xz compressor will then be able to compress the data down to 120
MiB (with both -9 and -9e). At build time you would then zip the content
again (with --compression-method=store for speed because size doesn't matter
now) because I guess your software expects the data in this zipped format and
cannot handle it in unpacked form. This method would allow you to save another
58 MiB in comparison to just adding the original file. I guess it is up to you
whether you want to do it like that to reduce file size or not because as pabs
already pointed out, there are already source packages in Debian that are
larger than 200 MiB. So this is just an idea :)
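
For illustration, the build-time re-zipping step described above could look
roughly like the following. This is only a minimal sketch: the function name
repack_stored and the directory layout are hypothetical, and it assumes the
software only needs a valid zip archive and does not depend on a particular
member order or on compressed members.

```python
import zipfile
from pathlib import Path


def repack_stored(src_dir, out_path):
    """Zip the files below src_dir into out_path without compression."""
    src = Path(src_dir)
    # ZIP_STORED writes the members uncompressed, which is fast; the
    # source package itself is already compressed with xz.
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_STORED) as zf:
        # Sort the paths so the member order is the same on every build.
        for path in sorted(src.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=path.relative_to(src).as_posix())
    return out_path
```

During the build this would be called once, for example as
repack_stored("md_1", "md_1.jsonz"), where "md_1" is an assumed name for the
directory holding the unzipped content.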

It seems like a good idea to me, but for now I think that using a new
orig.tar.gz is less complicated for what I want.

But thank you very much for your solution. I will keep it in mind.


Best regards,


Corentin

