Bug#1013946: lintian: wrongly report unknown-locale-code ber
Control: tag -1 + help
Hi Russ,
Russ Allbery wrote:
> > But upon deeper inspection I found that this is likely not an issue in
> > iso-codes as "ber" is correctly not in
> > /usr/share/iso-codes/json/iso_639-3.json but in …/iso_639-2.json and
> > …/iso_639-5.json as it is a code for a language group. (Which kinda
> > makes it suspicious for me to be used in locales. But then again I'm
> > not a linguist.)
>
> Sorry, I followed up on the bug and forgot to explicitly cc Lintian
Not needed. I got the message via the lintian ML / maintainer address.
(Somehow, though, I didn't get my own messages to that bug report
back via the list.)
> I worked out the same thing, and I'm fairly sure that means that this is
> not a valid locale. It's the code for the Berber language *group*, and
> the individual members of that group have their own 639-3 codes, so that
> seems to imply to me that those translations were tagged with the wrong
> code.
Yep, I also noticed that. I'm just not sure where exactly the border
lies between a mere group of languages, which has no common ground to
be spoken anywhere, and a group of very similar languages, whose
speakers can likely understand another language from the same group
and which may even share a common written language.
Toddy may indeed have some more input for us here.
> Fabio also followed up and noted that there are a few translations for ber
> in Launchpad, but they're all partial and probably not usable.
Ok, I didn't get that mail. So maybe I really didn't get your initial
mail, just another mail from you to the bug report. :-)
> Tobias probably knows more, as iso-codes maintainer, but my guess is that
> this is a mistake on the Launchpad side and those translations should be
> for one of the specific languages of the group rather than being coded to
> the 639-5 language group code. I think Lintian should still continue to
> use 639-3.
>
> That said, I'll leave it to you to decide if you want to hang on to the
> bug or not. :)
Thanks for your input here. Actually, that variant (the stricter one)
was my second choice so far. See the very end of that one long
mail from me. :-)
Anyway, JFTR: I just looked at how lintian in Debian Stable (i.e.
2.104.0 in Bullseye) does the locale code lookup. It had its own data
file for that (so using iso-codes now is good, as lintian no longer
duplicates these 33kB of data), and that file
(/usr/share/lintian/data/files/locale-codes) states:
  # List of locale codes. This is derived from the ISO 639-1, ISO
  # 639-2, and ISO 639-3 standards.
And indeed, "ber" was in that file.
So previously lintian did use ISO 639-1, 639-2 and 639-3.
So using just ISO 639-3 was either an accident, a deliberate choice or
a regression. Either way, it was introduced when lintian switched to
iso-codes' files as data source in commit
https://salsa.debian.org/lintian/lintian/-/commit/fcaded19
Unfortunately this commit was tagged "Gbp-Dch: ignore" in git
(why?!?), so it didn't appear in debian/changelog. *grrrr* (I may
retroactively add it to the debian/changelog entry of 2.115.0 like I
already added the item about switching to Text::Glob which also caused
bugs.)
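To make the difference between the two behaviours concrete, here is a
rough sketch in Python (not lintian's actual Perl code). It assumes
that each iso-codes JSON file is a single object mapping the standard's
name to a list of entries carrying an "alpha_3" and optionally an
"alpha_2" key. The helper names (codes_from_table, load_table,
is_known_locale_code) are made up for illustration only.

```python
import json
from pathlib import Path

# Directory mentioned earlier in this thread.
ISO_CODES_DIR = Path("/usr/share/iso-codes/json")


def codes_from_table(table):
    """Collect all alpha_2/alpha_3 codes from one parsed iso-codes table.

    Assumed layout: {"639-3": [{"alpha_3": "xxx", ...}, ...]}.
    """
    codes = set()
    for entries in table.values():
        for entry in entries:
            for key in ("alpha_2", "alpha_3"):
                if key in entry:
                    codes.add(entry[key])
    return codes


def load_table(standard, base_dir=ISO_CODES_DIR):
    """Parse e.g. iso_639-3.json for standard == "639-3"."""
    path = Path(base_dir) / f"iso_{standard}.json"
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def is_known_locale_code(code, tables):
    """True if the code appears in at least one of the given tables."""
    return any(code in codes_from_table(table) for table in tables)


# Strict variant (639-3 only):
#   is_known_locale_code("ber", [load_table("639-3")])
# Laxer variant (also consult 639-2):
#   is_known_locale_code("ber", [load_table(s) for s in ("639-2", "639-3")])
```

With the strict variant a language group code such as "ber" would be
rejected, while the laxer variant would accept it, which is the choice
under discussion here.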
Anyway, with you proposing stricter checking here and me, at least
initially, proposing to go back to the laxer parsing used previously,
it would be really good to have some additional input from someone
with a bit more experience on that topic. I hope that Toddy can
provide that. :-)
Tagging as help for that reason.
Regards, Axel
--
,''`. | Axel Beckert <abe@debian.org>, https://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
`- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE