
Re: Reproducibility of image building (Re: Debian images on Microsoft Azure cloud)



I think the content of the files created should not differ when building
an image twice in a row with the same package source and parameters.
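
To make that checkable, here is a minimal Python sketch. It is not taken
from any existing image builder: the names file_manifest and compare_builds
are made up for illustration, and it assumes both images have already been
mounted or extracted to directories. It hashes every regular file in the two
trees and reports which paths exist in only one tree and which differ in
content.

```python
import hashlib
import os


def file_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # This sketch skips symlinks and special files (devices, sockets).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def compare_builds(root_a, root_b):
    """Return (paths present in only one tree, paths whose content differs)."""
    a = file_manifest(root_a)
    b = file_manifest(root_b)
    only_one = sorted(set(a) ^ set(b))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_one, differing
```

Two empty lists from compare_builds would mean the same files were created
with the same content in both builds.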

The timestamp problem can be solved by not including timestamps in the
packaged files, or by resetting them to 00:00:00 or any other calculated
value.
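
As an illustration of resetting timestamps, a rough Python sketch follows.
It is an assumption about how a builder could do this, not a description of
any existing tool: clamp_mtimes is a hypothetical helper, and reading the
value from the SOURCE_DATE_EPOCH environment variable follows the
reproducible-builds convention, falling back to 0 (1970-01-01 00:00:00 UTC).

```python
import os


def clamp_mtimes(root, epoch=None):
    """Set atime and mtime of root and everything below it to one fixed value.

    Returns the number of paths that were touched.
    """
    if epoch is None:
        # Assumption: the builder exports SOURCE_DATE_EPOCH; default to 0.
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    touched = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # Symlinks are left alone in this sketch, so their own
            # timestamps are not normalised.
            if os.path.islink(path):
                continue
            os.utime(path, (epoch, epoch))
            touched += 1
    os.utime(root, (epoch, epoch))
    return touched + 1
```

This would have to run as the last step before the filesystem is packed,
since any later write changes the timestamps again.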



On 23/11/15 at 02:04, Charles Plessy wrote:
> Hi Marcin and everybody,
> 
> about reproducibility:
> 
> On Sat, Nov 21, 2015 at 03:17:22PM +0000, Marcin Kulisz wrote:
>>
>> I'm not sure it's possible to upload an image and then build one that is
>> bit-for-bit identical, for reasons such as timestamps on files. I think
>> that at least some providers add metadata, which would change any
>> checksums produced before upload.
> 
> Indeed.
> 
> In this discussion and before, I think that there is a strong consensus that
> there must be some reproducibility in image building, but we have difficulty
> translating this into a concrete requirement.
> 
> Requiring that two images built at different times are bitwise identical is not
> realistic, not only because of time stamps, but also because some elements of
> configuration will differ, for instance the location of the package sources.
> 
> Having checksums of all the files on a given image would be nice, but let's
> note that this is not a requirement currently.  At the moment, I think that we
> should not request that the file checksums stay identical over rebuilds in the
> same environments: this would restrict design choices for the image builders
> (on timestamps, logs, etc.), and therefore put pressure on the people writing
> them.
> 
> Of course, some of these goals can become standard practice later, but I think
> that this should evolve through consensus involving the people and teams
> developing image builders.  Doing it the other way round would be hitting
> those who do the work with a trademark stick, which would be
> counterproductive, to put it mildly.
> 
> Altogether, for reproducibility, would the following be acceptable?
> (Wording, of course, can be improved)
> 
>  * When building an image twice in a row with the same package source
>    and parameters:
>    - the packages installed must be the same;
>    - the files created must be the same;
>    - the content of the files created may differ;
> 
>  * When releasing an image, a list of all the packages installed and a list of
>    checksums of all the files must be provided.
> 
>  * For files whose checksums vary, it would be good to list them and
>    explain why they vary, although this is not a strict requirement.
> 
> Have a nice day,
> 

