Re: Bug#481134: libpoppler does not use cmap files from xpdf-{japanese,...}, and fails to parse Japanese PDF files.
- To: Junichi Uekawa <dancer@netfort.gr.jp>, 481134@bugs.debian.org, Hideki Yamane <henrich@debian.or.jp>
- Cc: debian-devel@lists.debian.org, Junichi Uekawa <dancer@debian.org>, control@bugs.debian.org
- Subject: Re: Bug#481134: libpoppler does not use cmap files from xpdf-{japanese,...}, and fails to parse Japanese PDF files.
- From: Loïc Minier <lool@dooz.org>
- Date: Wed, 20 Aug 2008 16:06:40 +0200
- Message-id: <20080820140640.GA22301@fox.dooz.org>
- Mail-followup-to: Junichi Uekawa <dancer@netfort.gr.jp>, 481134@bugs.debian.org, Hideki Yamane <henrich@debian.or.jp>, debian-devel@lists.debian.org, Junichi Uekawa <dancer@debian.org>, control@bugs.debian.org
- In-reply-to: <20080808003433.7e89cedd.henrich@debian.or.jp> <87skwlzrvs.dancerj%dancer@netfort.gr.jp>
- References: <e13a36b30807311827t55eeb18dm5d7528f73339f7c1@mail.gmail.com> <20080808003433.7e89cedd.henrich@debian.or.jp> <87skwlzrvs.dancerj%dancer@netfort.gr.jp>
clone 481134 -1
reassign 481134 xpdf-japanese
retitle 481134 xpdf-japanese Should register fonts in fontconfig and/or defoma and/or provide poppler specific symlinks
retitle -1 poppler doesn't support xpdf config anymore
severity -1 important
stop
Heya,
Executive summary: downgrading RC-ness against poppler, the burden of
the fix is probably on xpdf-japanese to start with. Request for
comments/help on fontconfig/defoma topics.
On Wed, May 14, 2008, Junichi Uekawa wrote:
> xpdf-japanese installs CMAP files in
> /usr/share/fonts/cmap/adobe-japan1 etc, but libpoppler looks at
> /usr/share/poppler.
> I need to install the following symlinks in order to use the adobe
> CMAP files.
> $ ls -l /usr/share/poppler/cidToUnicode/Adobe-Japan1
> lrwxrwxrwx 1 root root 34 2008-01-18 19:53 Adobe-Japan1 -> /usr/share/fonts/cmap/adobe-japan1
> $ ls -l /usr/share/poppler/cMap/
> lrwxrwxrwx 1 root root 50 2008-01-18 19:53 /usr/share/poppler/cidToUnicode/Adobe-Japan1 -> /usr/share/xpdf/japanese/Adobe-Japan1.cidToUnicode
(the symlinks are flipped: the first target belongs under cMap, the
second under cidToUnicode)
I confirmed that adding similar symlinks allows display of Japanese PDF
files with evince (poppler). I don't think adding the symlinks in
poppler is a good idea.
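For anyone wanting to reproduce the check, here is a minimal Python
sketch that reports which per-collection entries are absent below a
poppler data directory. It assumes the layout described in this report
(cidToUnicode/<name> and cMap/<name>); missing_poppler_entries is a
hypothetical helper, not poppler code, and the demo runs against a
temporary directory rather than /usr/share/poppler.

```python
import os
import tempfile


def missing_poppler_entries(poppler_dir, collection):
    """Return the per-collection entries that are absent below poppler_dir.

    Assumption: the data is looked up purely by file name, as
    <poppler_dir>/cidToUnicode/<collection> and <poppler_dir>/cMap/<collection>.
    """
    expected = [
        os.path.join(poppler_dir, "cidToUnicode", collection),
        os.path.join(poppler_dir, "cMap", collection),
    ]
    return [path for path in expected if not os.path.exists(path)]


# Demo in a throwaway directory: only the cMap entry is provided.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "cMap", "Adobe-Japan1"))
    missing = missing_poppler_entries(root, "Adobe-Japan1")
    demo_missing = [os.path.relpath(path, root) for path in missing]

print(demo_missing)
```

On a real system one would pass "/usr/share/poppler" as the first
argument; an empty result would mean both entries are present.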
What happens here is that xpdf-japanese installs an xpdfrc config file
snippet which will properly configure unicodeMap and cMapDir with
various files for xpdf, see /etc/xpdf/xpdfrc-japanese:
#----- begin Japanese support package (2004-jul-27)
cidToUnicode Adobe-Japan1 /usr/share/xpdf/japanese/Adobe-Japan1.cidToUnicode
unicodeMap ISO-2022-JP /usr/share/xpdf/japanese/ISO-2022-JP.unicodeMap
unicodeMap EUC-JP /usr/share/xpdf/japanese/EUC-JP.unicodeMap
unicodeMap Shift-JIS /usr/share/xpdf/japanese/Shift-JIS.unicodeMap
cMapDir Adobe-Japan1 /usr/share/fonts/cmap/adobe-japan1
toUnicodeDir /usr/share/fonts/cmap/adobe-japan1
...
poppler probably used to parse xpdf's config as it was forked from
xpdf.
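To make the structure of that snippet concrete, here is a rough Python
sketch that collects the four directives shown above into a dictionary.
parse_xpdfrc is a hypothetical helper written for this mail, not code
from xpdf or poppler, and it assumes whitespace-separated fields with
no spaces in paths; the real xpdfrc grammar may be richer than that.

```python
def parse_xpdfrc(text):
    """Collect the mapping directives from an xpdfrc snippet.

    Hypothetical helper: only handles the four directives quoted in this
    mail and assumes fields are separated by whitespace.
    """
    result = {"cidToUnicode": {}, "unicodeMap": {}, "cMapDir": {}, "toUnicodeDir": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        keyword = fields[0]
        if keyword in ("cidToUnicode", "unicodeMap", "cMapDir") and len(fields) == 3:
            # <directive> <name> <path>
            result[keyword][fields[1]] = fields[2]
        elif keyword == "toUnicodeDir" and len(fields) == 2:
            # <directive> <path>
            result["toUnicodeDir"].append(fields[1])
    return result


SNIPPET = """\
#----- begin Japanese support package (2004-jul-27)
cidToUnicode Adobe-Japan1 /usr/share/xpdf/japanese/Adobe-Japan1.cidToUnicode
unicodeMap ISO-2022-JP /usr/share/xpdf/japanese/ISO-2022-JP.unicodeMap
cMapDir Adobe-Japan1 /usr/share/fonts/cmap/adobe-japan1
toUnicodeDir /usr/share/fonts/cmap/adobe-japan1
"""

config = parse_xpdfrc(SNIPPET)
print(config["cidToUnicode"])
print(config["cMapDir"])
```

The point is that xpdf's config names each collection explicitly and
maps it to an arbitrary path, so the files themselves can be named
anything.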
Nowadays, poppler has these /usr/share/poppler/cidToUnicode and /cMap
dirs (among others) to allow for these mappings; these rely on the
filenames for the mappings to work, so e.g. if poppler needs
Adobe-Japan1's mapping, it will look for Adobe-Japan1 in there. The
actual data for the Adobe-Japan1 font is available in
/usr/share/fonts/cmap/adobe-japan1 and
/usr/share/xpdf/japanese/Adobe-Japan1.cidToUnicode, but no location
provides the data under the expected file name "Adobe-Japan1".
My understanding is that xpdf-japanese is very xpdf-specific; it does
install some files below /usr/share/fonts/cmap, but I'm not sure these
are used by anything except xpdf. It only calls update-xpdfrc in
postinst.
As I see it, the modern way of dealing with fonts is to use fontconfig,
which poppler supports, but I'm not sure fontconfig provides support
for cMap and cidToUnicode information. A Debian alternative for
dealing with fonts is to use defoma, which allows package-specific
scripts to be run when fonts are registered/removed via defoma. Some
defoma scripts mention cmap, but I saw no cidToUnicode references there
either.
So because the data is not currently available in any place I could
find / think of or isn't in the proper format, I see the following
solutions to fix this bug:
a) make poppler parse xpdf's config again; I'm not too hot on this as
it makes poppler leak that it's xpdf based, diverges from upstream,
and isn't a sustainable long-term option; I'm not even sure it's
still possible
b) make xpdf-japanese register fonts properly in fontconfig; I don't
have enough fontconfig foo to understand whether fontconfig can
carry the relevant information/mappings though; help to write a
/etc/fonts/conf.d/*xpdf-japanese*.conf is welcome; it's not clear to
me whether such mappings are reaching too far for fontconfig's goal;
this might also require patching poppler to look up the relevant
fontconfig data if poppler doesn't use this part of fontconfig, or
if we had to add new API to fontconfig for this stuff; probably a
good solution for other packages in the same situation as well
c) make xpdf-japanese register fonts with defoma and make poppler run a
defoma script to generate appropriate symlinks to the cidToUnicode
and cMap files/dirs; I don't have enough defoma foo to understand
whether it's already possible in defoma, but I would expect it is;
this has the benefit of making it easy to handle other packages in
the same case; help welcome to answer the open issues here
d) make xpdf-japanese generate /usr/share/poppler/{cMap,cidToUnicode}/*
symlinks; probably dead easy, but makes the solution
poppler-specific: other fontconfig- or defoma-using apps won't benefit
from the fonts/mappings; also leaves open the question of integration:
should it be a new poppler-japanese package? who should be pulling
this package?
For the long term, I'd recommend either b), or c) if b) isn't possible;
for lenny, either the long term approach or the d) approach would work,
but d) would only solve the problem for this particular package.
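To give an idea of how small d) is, here is a minimal Python sketch of
the symlink generation. It is only an illustration: link_into_poppler
is a hypothetical helper, the target paths are the ones from
xpdfrc-japanese quoted above, and the demo writes into a temporary
directory instead of /usr/share/poppler. A real package would more
likely ship the links directly or create them from a maintainer script.

```python
import os
import tempfile


def link_into_poppler(poppler_dir, collection, cid_to_unicode_file, cmap_dir):
    """Create poppler-style symlinks named after the character collection.

    Hypothetical helper for option d); returns a dict of link -> target.
    """
    links = {
        os.path.join(poppler_dir, "cidToUnicode", collection): cid_to_unicode_file,
        os.path.join(poppler_dir, "cMap", collection): cmap_dir,
    }
    for link, target in links.items():
        os.makedirs(os.path.dirname(link), exist_ok=True)
        if os.path.islink(link):
            os.remove(link)  # replace a stale link so reruns do not fail
        os.symlink(target, link)
    return links


# Demo in a throwaway directory; the targets need not exist for the
# links to be created.
with tempfile.TemporaryDirectory() as root:
    created = link_into_poppler(
        root,
        "Adobe-Japan1",
        "/usr/share/xpdf/japanese/Adobe-Japan1.cidToUnicode",
        "/usr/share/fonts/cmap/adobe-japan1",
    )
    resolved = {os.path.relpath(link, root): os.readlink(link) for link in created}

for name, target in sorted(resolved.items()):
    print(name, "->", target)
```

Note that this does nothing for other fontconfig- or defoma-using
applications, which is exactly the limitation of d) described above.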
Because b), c), and d) will almost certainly require changes to
xpdf-japanese, I'm reassigning the RC counterpart of this bug to this
package. I'm keeping an important bug open against poppler because it
can be considered a regression of poppler, but I don't think it should
be considered RC for poppler because:
* xpdf-japanese is non-free while poppler is free; perhaps this is a
fallacy, but I don't think support for a non-free package should
affect the Debian release; even if this is considered >= serious
severity, it shouldn't be release-blocking
* it's an upstream design decision that poppler doesn't support xpdf
config/data and poppler doesn't aim to be compatible with xpdf's
config/data: xpdf-japanese needs to grow compatibility with poppler
instead
Cheers,
--
Loïc Minier