
Re: ps2pdf: why differences in different instances using unchanged PS source file



On 2020-12-13 at 11:40, Tom Browder wrote:

> I have been using ps2pdf for many years on a PS source file for a
> personalized calendar for my wife. This year I have been cleaning up
> my old Perl generating code (in preparation for converting it to Raku
> [https://raku.org]) and noticed I am getting a different pdf output
> for each run, even when the PS output source file is unchanged!
> 
> I have looked at the detailed options for ps2pdf and don't see
> anything that stands out to me when using the default settings.
> 
> I attempted a comparison of two different pdf outputs by running "od
> -c file.pdf" on the two outputs but cannot identify anything familiar
> to me about the differences.

What observation leads you to notice that the files are different?

> My command that runs the actual process is this:
> 
>     $ ps2pdf  -dAutoRotatePages=/None  file.ps  file.pdf
> 
> I am running on Debian Buster with current updates.
> 
> Any suggestions to produce constant output pdf for constant ps input
> are appreciated.

I ran this command on a random .ps file from a package I have installed,
to two output files.

cmp verified that the files were different.


Running vbindiff on the files shows that the first and second differences
are in "ModifyDate" and "CreateDate" tags; each file appears to contain
multiple timestamps, and since the files were not generated in the same
second, those timestamps are going to be different.
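
For anyone without vbindiff, the same kind of inspection can be roughly
approximated in a few lines of Python. This is only a sketch: the function
names (diff_runs, show_runs) and the "gap" and "context" parameters are my
own invention for illustration, and it compares byte-by-byte at equal
offsets, so it is only useful when the two files have the same layout (as
they do here, where only a few fixed-width fields change).

```python
def diff_runs(a: bytes, b: bytes, gap: int = 8):
    """Return (start, end) byte ranges where a and b differ.

    Differing bytes separated by at most `gap` matching bytes are merged
    into one range, so a changed timestamp shows up as one difference.
    """
    runs = []
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if runs and i - runs[-1][1] <= gap:
                runs[-1][1] = i + 1      # extend the previous run
            else:
                runs.append([i, i + 1])  # start a new run
    if len(a) != len(b):
        # Everything past the shorter file counts as one final difference.
        runs.append([common, max(len(a), len(b))])
    return [tuple(r) for r in runs]


def show_runs(a: bytes, b: bytes, context: int = 16):
    """Print each differing range with some surrounding bytes for context."""
    for start, end in diff_runs(a, b):
        lo = max(0, start - context)
        hi = end + context
        print(f"offset {start}:")
        print(f"  first:  {a[lo:hi]!r}")
        print(f"  second: {b[lo:hi]!r}")


# Hypothetical usage, assuming two outputs named out1.pdf and out2.pdf:
#   show_runs(open("out1.pdf", "rb").read(), open("out2.pdf", "rb").read())
```

The surrounding context is what makes tag names such as "ModifyDate"
visible next to the bytes that changed.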

The third difference is a UUID; that's intended to be globally unique,
and I'm guessing they intentionally generate a new one per PDF so that
they can be told apart.

The fourth difference is a CreationDate label, again with a timestamp.

The fifth and final difference, right before the end of the file, is an
unintelligible string of hex digits. My first suspicion is that it's a
hash of the rest of the file, to make validity checking possible.

So it doesn't look like the differences are meaningful, but they do
exist.
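
If the goal is just to check that two outputs are the same apart from
these fields, one workaround is to blank the fields out before comparing.
The following Python is a minimal sketch under several assumptions: the
regular expressions are my guesses at how the fields are written (the
"xmp:" prefix, the "/ModDate" key, and treating the trailing hex string as
an "/ID" array are not confirmed by anything above), and it only works if
the metadata is stored uncompressed, as it appeared to be in vbindiff.

```python
import re

# (pattern, replacement) pairs for fields expected to change on every run.
# All of these patterns are assumptions about the output format.
VOLATILE_FIELDS = [
    # XMP timestamps, e.g. <xmp:ModifyDate>2020-12-13T11:40:01</xmp:ModifyDate>
    (re.compile(rb"<xmp:(ModifyDate|CreateDate)>[^<]*</xmp:\1>"),
     rb"<xmp:\1></xmp:\1>"),
    # Info dictionary timestamps, e.g. /CreationDate(D:20201213114001-05'00')
    (re.compile(rb"/(CreationDate|ModDate)\s*\([^)]*\)"), rb"/\1()"),
    # Per-document UUID, e.g. uuid:12345678-1234-1234-1234-123456789abc
    (re.compile(rb"uuid:[0-9a-fA-F-]{36}"), rb"uuid:"),
    # Hex strings near the end of the file, assumed to be an /ID array
    (re.compile(rb"/ID\s*\[\s*<[0-9a-fA-F]*>\s*<[0-9a-fA-F]*>\s*\]"),
     rb"/ID[<><>]"),
]


def normalize(data: bytes) -> bytes:
    """Blank out the volatile fields so that two runs can be compared."""
    for pattern, replacement in VOLATILE_FIELDS:
        data = pattern.sub(replacement, data)
    return data


def same_apart_from_metadata(path1: str, path2: str) -> bool:
    """True if the two files are identical once volatile fields are blanked."""
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        return normalize(f1.read()) == normalize(f2.read())
```

The normalized bytes are for comparison only. They are not a valid PDF,
since removing bytes shifts the offsets that the rest of the file relies
on, so this does not make ps2pdf itself produce constant output.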


ps2pdf appears to be part of Ghostscript. Neither its man page nor the
one for 'gs' (which it references) seems to contain any options for
setting these timestamps, preventing their creation, or doing likewise
with that UUID.


The Debian reproducible-builds project has a page on eliminating this
type of "multiple runs on the same input don't produce identical output"
issue more generally[1], which links to a page on doing so with PDFs
generated by Ghostscript specifically[2].

I'm not enough of an expert to follow that much further, but at a
glance, it looks like the solution is to patch Ghostscript; the patch
doesn't seem to have been accepted upstream, and it's not clear whether
or not it's (going to be) distributed in the Debian package.

If you're not OK with handling that on your own end, then you may be out
of luck with this.

[1] https://wiki.debian.org/ReproducibleBuilds/Howto
[2] https://wiki.debian.org/ReproducibleBuilds/PdfGeneratedByGhostscript

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man.         -- George Bernard Shaw
