Hello Ian,
answering at least partially from the view of a translator and package
maintainer who has also internationalized his own program (albeit on a
much smaller scale).

On Wed, Sep 12, 2018 at 09:48:56PM +0100, Ian Jackson wrote:
> Hi. I hope this is a suitable list for my question. If not, please
> direct me elsewhere...

At least some experts are on this list, so I believe it is a good start.

> I am doing the i18n for a package (src:dgit) which I think it will be
> useful to translate (at least, much of it). It's a Debian native
> package containing mostly perl scripts.
>
> I'm not sure of the best approach. My main questions:
>
>
> 1. There doesn't seem to be any standard set of Makefile machinery to
> include, or anything. Do I really have to write my own make rules to
> run xgettext etc. ? I looked at the debconf source package, which
> seemed like it would be a good example, and it had its own rules. I
> can write my own rules if that is best; they're not huge. It just
> seemed a bit wheel-reinventish.

You might have a look at dpkg / apt as well, but I agree that there
does not seem to be a ready-to-use, plug-in makefile. On the other
hand, I did not find po4a overly complicated, and looking at good
examples let me write the necessary files rather quickly, especially
with the help of the po4a man pages.

> (NB that I don't want to introduce use of automake into what is
> currently a simple "upstream" Makefile; if it comes to that I would
> prefer just to write my own rules for this.)

Most of my work came from exactly this, so your use case appears even
simpler.

> 2. I am unsure of the best layout of the .pot, .po, po4a, etc., files.
>
> The convention I saw in src:debconf was to have a directory `po'
> containing a single `debconf.pot', all the message translations
> LANG.po, and the corresponding Makefile and script machinery.
> I dislike the idea of mixing up files edited by translators with make
> machinery, but I can tolerate it if it's conventional and would
> disturb people if I did it differently.

This is the standard layout. If a new translator appears, (s)he will
look for the pot file, copy it to, say, de.po, and start working on
it. (I'm not an expert on tools like weblate, but they probably do
similar things for the online interfaces.)

> In src:debconf I also saw po4a in use. The translations were all in
> doc/man/po4a/po/LANG.po, and there was also
> doc/man/po4a/add_LANG/addendum.man.LANG. This all seemed a bit ad
> hoc.
> Is there a standard layout ? See also my next question, which may
> influence the answer to this one.

This is the way it is designed, but you can do it differently, e.g.
dpkg has a single man/po directory with all .po and .add files in it.

> Relatedly, how do automatic translation coverage tools (we have those
> I think?) deal with the variety of different possible layouts ?

Hopefully people with a background in our i18n machinery can answer
this in more detail.

> 3. I am not sure how to divide up my translation inputs (pot files).
>
> My single source package generates two binary packages. The two
> binary packages are rather different; they perform different roles
> (although they work well together) and have different (but
> overlapping) audiences.
>
> This might reasonably influence the way the messages from the two
> packages (really, the two programs) are translated. So maybe I should
> have two .pot files for the two sets of messages.
>
> But the programs share a small set of library code. The library code
> does not have many messages, but there are some. These messages
> should be translated only once. So if I split it up, there would have
> to be *three* .pot files for messages: dgit, git-debrebase and common.

From a translator's POV, try to avoid too many pot files.
Usually translation teams are understaffed, so if you really must
split the files, do so by importance and label them "level 1",
"level 2" or "prio 1" and "prio 2". This will guide translators. But
if you want consistency, having a few files (or even a single file)
might be best.

> (I think I can use tools like xgettext and msgcat, with appropriate
> make runes, to handle any arbitrary organisation of .pot files that I
> decide on.)

Yes.

> The need for splitting up is perhaps more acute for the documentation.
> I will use po4a for that. (po4a has a powerful system for handling
> almost arbitrarily strange layouts.)
>
> The git-debrebase package has its own data model and conceptual model,
> and its documentation is carefully written to talk about that in the
> right terms. Additionally, perhaps it is useful for a translator to
> know whether a string they are translating is part of a reference
> manual or a tutorial.

Giving context information to translators is always good. You can add
annotations via po4a, and such guidance for translators is
appreciated.

> But src:debconf does not split like this so maybe it is not useful ?
> Or maybe it is even harmful because it might involve duplicating
> certain "framework" parts or something ?

Try to avoid duplicating strings; this is really bad for translators.
Splitting the documentation might make sense, as it is usually much
larger. If a translator encounters a 100 string file for an important
tool, (s)he might start and finish much more quickly (including review
on the translation list) than, say, for a 500 string file.

> 4. Terminology in translations.
>
> As I say, one of the two packages has a specific conceptual model.
> That has its own terminology, which is defined in a section 5 manpage.
> It is important that if and when this is translated, thought is given
> to what translated names to give for each of the English terms; and
> that this settled terminology is then used consistently throughout all
> of the documentation.
>
> Also, the terminology appears, in some cases, as protocol elements
> (which are in text and may be displayed to the user). These obviously
> cannot be translated or things will break. So I think, ideally, when

Add hints for translators about what to translate. Please note,
however, that once your program is internationalized, some strings
might get translated; e.g. if you query the user with a yes/no
question, the answer might be given in the user's language.

> the terms are defined in the section 5 manpage, the English words
> should be stated alongside the translated ones.

I like the idea.

> Can I (should I) leave a note to translators about these issues ?
> The relevant documents are in perl pod format.

Yes, please do.

> 5. Translation priority
>
> Obviously translators are volunteers and will work on what they feel
> is most important. But I think some parts are much more important to
> translate than others:
>
> These tools, particularly dgit, are useful within Debian but also,
> IMO, extremely useful outside it. Different people will use it in
> different ways.
>
> This is reflected in the documentation. Some of the documentation is
> aimed at users and downstreams; whereas some is aimed primarily at
> Debian maintainers for whom it is less important to have translations
> since much of the rest of their work has to be done in English.
>
> Is there a sensible way to inform translators about this kind of
> thing, so that they can spend their time wisely ? I think maybe I
> would like to tag some documents as high, medium, or low priority, or
> something.

If you annotate the initial strings with this information ("the target
audience of this is a Debian developer / a random user"), then I
believe translation teams can and will handle the priority by
themselves. But note that some translators simply like the program and
will translate everything irrespective of priority.

> 6. Committing the .pot file
>
> AFAICT it is conventional for the .pot file(s), generated
> automatically from the source code with xgettext, to be included in
> source packages, git repos, etc.
>
> That seems odd. What is the reason for this ? Can I sensibly diverge
> from this and expect translators etc. to run a build rune to get the
> .pot files ?

Please don't. Translators are not developers. They usually will not
run any build tools. Also, online programs, l10n monitor scripts etc.
will usually not work without the pot file. Simply update the pot file
whenever your strings change; the rest is covered by the traditional
translation machinery. This might not be the best way, but it is the
way that works.

> I was surprised not to find answers to my questions in the
> documentation for gettext, etc. Am I missing some best practice
> guide ?
>
> All advice and opinions gratefully appreciated.

I hope the answers help you. If you need more information, do not
hesitate to ask.

Greetings

          Helge

-- 
Dr. Helge Kreutzmann                     debian@helgefjell.de
Dipl.-Phys.                              http://www.helgefjell.de/debian.php
64bit GNU powered                        gpg signed mail preferred
Help keep free software "libre": http://www.ffii.de/