
Bug#1054218: texlive-latex-base: pdflatex failures on big-endian architectures (s390x)



Package: texlive-latex-base
Version: 2023.20231007-1
Severity: normal
X-Debbugs-Cc: stuart@debian.org

Dear Maintainer,

The unit tests of the 'plastex' package run pdflatex to generate some
figures, and then extract the text from the figures to verify that
various implementation details of the package are working. These tests
pass on all release architectures except s390x. They also fail on ppc64.
The common feature of the failures is that the architecture is
big-endian.

The failures are all similar to:

  AssertionError: 'hi' != '\x00\x00'

i.e. the text that is found in the PDF (either by gs or pdftotext) has
the same number of bytes as the original text, but every byte is \0. The
extraction itself is platform-independent: the attached s390x.pdf yields
\0\0 for its text no matter what arch pdftotext or gs is run on.
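
For anyone wanting to check a PDF for this failure signature (same
length as the expected text, every character NUL), here is a rough
sketch. The helper names extract_text and is_nulled_out are made up for
this illustration and are not part of plastex; it assumes pdftotext is
on PATH and that writing to "-" sends the text to stdout.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Run `pdftotext <pdf> -` and return the text without surrounding whitespace.

    Hypothetical helper: assumes pdftotext (poppler-utils) is installed.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    # str.strip() removes newlines and form feeds but leaves NUL characters.
    return result.stdout.decode("utf-8", errors="replace").strip()


def is_nulled_out(expected: str, extracted: str) -> bool:
    """True if `extracted` has the same length as `expected` but is all NULs."""
    return (
        len(extracted) > 0
        and len(extracted) == len(expected)
        and set(extracted) == {"\x00"}
    )
```

Going by the behaviour described above, is_nulled_out("hi",
extract_text("s390x.pdf")) should report the problem, while the same
call on amd64.pdf should not.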

The PDFs all _look_ OK in any PDF viewer; it is only the text extraction
that fails.

If the pdf is generated via latex followed by dvipdf then the extracted
text is correct (up to whitespace); if the pdf is generated by lualatex
then the extracted text is also correct.

It seems that pdflatex is mishandling embedding the text on big-endian
systems. Speculating wildly... it looks a bit like pdflatex is taking
the wrong byte out of a multibyte character representation, and ending
up with \0 rather than the byte of interest, but I don't know how
pdflatex is representing the characters internally or how it is encoding
them into the PDF.
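
To illustrate the kind of mistake I have in mind (this is only a toy
model of the guess above, not a description of what pdflatex actually
does internally): if each character were held as a 16-bit integer and
the code always took the first byte in memory, the result would be
right on little-endian and \0 on big-endian. The function name and the
16-bit width are assumptions made up for the example.

```python
import struct


def first_byte_of_16bit_chars(text: str, byteorder: str) -> bytes:
    """Store each character as a 16-bit integer in the given byte order,
    then (wrongly) keep only the first byte in memory of each value."""
    fmt = "<H" if byteorder == "little" else ">H"
    return bytes(struct.pack(fmt, ord(ch))[0] for ch in text)


print(first_byte_of_16bit_chars("hi", "little"))  # b'hi'
print(first_byte_of_16bit_chars("hi", "big"))     # b'\x00\x00'
```

That reproduces the 'hi' vs '\x00\x00' pattern from the test failure,
which is why the symptom looks endian-related to me.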

While I don't expect that there are many direct users of pdflatex on s390x,
testing migration within Debian now requires successful completion of
unit tests on s390x, and so arch-specific bugs on s390x become relevant.

Attached:
  test.tex (one of the little .tex files plastex generates in its tests)
  amd64.pdf (output of "pdflatex test.tex" on amd64)
  s390x.pdf (output of "pdflatex test.tex" on s390x)

(access to s390x and ppc64 courtesy of Debian's porter boxes
zelenka.debian.org and perotto.debian.net)

regards
Stuart

Attachment: amd64.pdf
Description: Adobe PDF document

Attachment: s390x.pdf
Description: Adobe PDF document

\nonstopmode\AtBeginDocument{\thispagestyle{empty}}\documentclass{article}\usepackage{microtype}\DisableLigatures{encoding = *, family = *}\begin{document}\newif\iffoo\footrue\iffoo hi\else bye\fi\end{document}
