Bug#968437: xindy-rules: Incorrect Norwegian sorting of č and š

To: submit@bugs.debian.org
Subject: Bug#968437: xindy-rules: Incorrect Norwegian sorting of č and š
From: Petter Reinholdtsen <pere@hungry.com>
Date: Sat, 15 Aug 2020 11:31:14 +0200
Message-id: <sa6imdk2r31.fsf@hjemme.reinholdtsen.name>
Reply-to: Petter Reinholdtsen <pere@hungry.com>, 968437@bugs.debian.org

Package: xindy-rules
Version: 2.5.1.20160104-5
Severity: important
Tags: patch upstream

Dear xindy-rules maintainers,

I ran into this problem when using dblatex and xindy to typeset a book,
where the index ended up with the wrong sorting order.  This is a
Norwegian book with some North Saami words in the body and index.  Every
Saami word starting with č or š is incorrectly sorted as if it started
with a symbol, while such words should be sorted under c and s,
respectively.

Setting severity to important, as there is no known workaround and the
problem is fatal when trying to create a print-ready book using xindy.

I had a look at the code, but do not really know how this is supposed to
work.  I suspect the correct fix is the untested patch below.  Am I on
the right track here?  I verified the ordering of č and ç by comparing
it with the nb_NO locale.
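
As a rough illustration of the ordering the patch is aiming for, here is
a minimal Python sketch.  It is not how xindy works internally; the names
GROUPS, BASE, build_ranks and sort_key are hypothetical and exist only
for this example.  Each letter gets a primary rank (its base letter) and
a secondary rank (its position within the group), mirroring the
['C', ['c','C'],['č','Č'],['ç','Ç']] grouping in the diff.  Characters not
in the table are placed first, loosely mimicking the "sorted as a symbol"
symptom described above.

```python
# Hypothetical model of the intended order; NOT xindy's actual algorithm.
# Letters listed together share a primary rank (the base letter) and are
# distinguished only by their position in the list (secondary rank).
GROUPS = {
    "c": ["c", "č", "ç"],  # same order as in the proposed patch
    "s": ["s", "š"],
}

# Norwegian alphabet: a-z followed by æ, ø, å.
BASE = "abcdefghijklmnopqrstuvwxyzæøå"


def build_ranks():
    """Map each known letter to a (primary, secondary) rank pair."""
    ranks = {}
    for primary, base in enumerate(BASE):
        for secondary, letter in enumerate(GROUPS.get(base, [base])):
            ranks[letter] = (primary, secondary)
    return ranks


RANKS = build_ranks()


def sort_key(word):
    """Compare base letters first; use the variants only to break ties."""
    # Unknown characters get primary rank -1, so they sort before 'a'.
    pairs = [RANKS.get(ch, (-1, 0)) for ch in word.lower()]
    primary = [p for p, _ in pairs]
    secondary = [s for _, s in pairs]
    return (primary, secondary)


# Sample index entries (assumed for illustration, not from the book).
words = ["sol", "čalbmi", "bok", "dag", "šaldi", "cello"]
print(sorted(words, key=sort_key))
```

With these sample words, čalbmi ends up among the c entries and šaldi
among the s entries, instead of both appearing before every letter.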

diff --git a/make-rules/alphabets/norwegian/utf8.pl.in b/make-rules/alphabets/norwegian/utf8.pl.in
index 902b07b..9b30a88 100644
--- a/make-rules/alphabets/norwegian/utf8.pl.in
+++ b/make-rules/alphabets/norwegian/utf8.pl.in
@@ -11,10 +11,9 @@ $alphabet = [
                    [], # a with ogonek (polish)
 ['B',  ['b','B']],
                    [], # b with hook (hausa)
-['C',  ['c','C'],['ç','Ç']],
+['C',  ['c','C'],['č','Č'],['ç','Ç']],
                    [], # ch (spanish/traditional)
                    [], # cs (hungarian)
-                   [], # c with caron (many)
                    [], # c with acute (croatian, lower sorbian, polish)
                    [], # c with circumflex (esperanto)
                    [], # c with cedilla (albanian, kurdish, turkish)
@@ -85,10 +84,9 @@ $alphabet = [
                    [], # r with caron (czech, slovak/large, upper sorbian)
                    [], # r with acute (lower sorbian)
                    [], # r with cedilla/comma (latvian)
-['S',  ['s','S']],
+['S',  ['s','S'], ['š', 'Š']],
                    [], # sh (albanian)
                    [], # sz (hungarian)
-                   [], # s with caron (many)
                    [], # s with acute (lower sorbian, polish)
                    [], # s with circumflex (esperanto)
                    [], # s with comma below (romanian)

-- 
Happy hacking
Petter Reinholdtsen
