
Bug#995392: ghostscript: ps2pdf trashes some characters



On 2021-11-03 03:29:43 +0100, Vincent Lefevre wrote:
> On 2021-11-02 16:25:27 +0100, Vincent Lefevre wrote:
> > With commit 8f62213019bc682eeb0ed9467d8841f3770cfda6 upstream,
> > I can no longer reproduce any issue, even when
> > /usr/share/texlive/texmf-dist/tex/generic/pdftex/glyphtounicode.tex
> > from Tex Live 2020 is included and \pdfgentounicode=1 is used.
> 
> Hmm... I didn't check carefully. On one of my files, there is
> actually one place where the quoteright (used for the apostrophe)
> is replaced by "Š" (checked with pdftotext, xpdf and atril). The
> cause may be that the paragraph in question is in a smaller font.

I have an explanation: it seems that in this smaller font,
no ligatures (ff, ffi, fl...) are used.

Since a recent fix, Ghostscript no longer generates a ToUnicode CMap
when there are \pdfglyphtounicode entries with more than 2 bytes (such
as those used for the ligatures). So this fix made the bug disappear
when ligatures are used. But the bug was still there, and visible
when ligatures are not used.

> So the issue is still visible in practice.
> 
> I'll try to produce a simple testcase.

Here it is:

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\pdfglyphtounicode{Scaron}{0160}
\pdfgentounicode=1
\begin{document}
\thispagestyle{empty}
'ê
\end{document}

(Tested on the PDF generated by pdflatex from TeX Live 2020.)
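
For anyone who wants to script the check instead of reading the
pdftotext output by eye, here is a minimal sketch. It assumes that
ps2pdf and pdftotext (poppler) are in the PATH, and that a correct
extraction of the testcase gives U+2019 followed by "ê". The function
names and the file names are hypothetical, not part of any tool.

```python
import shutil
import subprocess
import unicodedata


def unexpected_chars(expected: str, extracted: str) -> list[str]:
    """Return the characters of `extracted` that do not occur in `expected`.

    Whitespace (including the form feed that pdftotext appends after
    each page) is ignored.
    """
    allowed = set(unicodedata.normalize("NFC", expected))
    text = unicodedata.normalize("NFC", extracted)
    return [c for c in text if not c.isspace() and c not in allowed]


def extract_after_ps2pdf(pdf_in: str, pdf_out: str) -> str:
    """Pass `pdf_in` through ps2pdf, then return the pdftotext output."""
    for tool in ("ps2pdf", "pdftotext"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found in PATH")
    subprocess.run(["ps2pdf", pdf_in, pdf_out], check=True)
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_out, "-"],
        check=True, capture_output=True, encoding="utf-8",
    )
    return result.stdout


# Assumption: the apostrophe of the testcase should extract as U+2019.
EXPECTED = "\u2019\u00ea"

# Hypothetical usage, with test.pdf generated by pdflatex:
#   bad = unexpected_chars(EXPECTED, extract_after_ps2pdf("test.pdf", "out.pdf"))
#   print(bad or "no unexpected characters")
```

With the behavior described above, the list returned by
unexpected_chars would contain "Š"; an empty list would mean that the
text layer matches the expected string.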

My new upstream bug report:

  https://bugs.ghostscript.com/show_bug.cgi?id=704681

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

