[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

pkg desc translation: Let's talk about implementation (was: something else)

On Tue, Sep 04, 2001 at 05:40:25PM -0500, Adam Heath wrote:
> On Tue, 4 Sep 2001, Michael Bramer wrote:
> > > A proper solution, at the very least, invovles storing the data in the
> > > foo.deb{control.tar.gz/control} file.
> >
> > gettext is not a hack. Gettext for translations and dpkg use gettext
> > is self for translation. Why re-inventing the wheel?
> gettext can not really be used for this data.

You're a bit short on this point. Why ?

> It needs to be stored, in /var/lib/dpkg/status, as a single file.  This is so
> that dpkg can make safe updates to it.  Trying to sync multiple files is not a
> simple solution.

Letting gettext handle the translation syncronization, and letting dpkg
handle technical syncronization does not seem that bad to me. This can lead
to desyncronisation of the translation sometimes, but mainly in 'unstable',
where it's not critical. Please also note that nobody said there is any
simple solution.

> > I propose to store the translations in a some po.files and store this
> > in foo.deb{control.tar.gz}. But not in the control file.
> They must be stored inline inside status/available.  This is the only sane way
> to implement atomic file updates.

That's not sane either because you will need to reimplement a lot of gettext
in dpkg, with no other benefit than having dpkg reinventing the wheel. (I
mainly think about tracking translation accurary and several language

> > If you store the translation as normal field in the control file (like
> > Description-de: dff) you have outdated translations with the time.
> > And outdated translations is a very big problem.
> zcat Packages_de.gz Packages_jp.gz | dpkg --merge-lang

Oh, I like this form. But what will it do ? I would say that when called
with --merge-lang option, dpkg call msgcat in background to merge the new
translation with the old one, and generate the resulting mo file. What's
your wish for this option ?

> > this make the patch and the patch work. I don't stress the patch and
> > maybe it has one or two bugs. But it work with Descriptions on my
> > system.
> Please stop just applying this to Description fields.  Make it generic.  dpkg
> supports user-defined fields, so this proposal/implementation should as well.

Well, the patch I proposed contain a w_i18n_charfield function, which is
used only for the description in the table in lib/parse.c You're free to use
it for other fields also, but I did not find any.

This lead me to think that we are speaking since almost one month about a
patch nobody reviewed seriously. That's my fault, I submitted it without any
explanation.  Let me explain how it works:

First point, it does not change any status information because I think
gettext is better than any patch I can ever submit, and gettext have its own
"status file" (the mo files).

Second point, the patch is against the cvs version, and won't apply to the
released version.

Third, this patch do not solve the publishing issue. Ie, the mo file have to
be installed for it to work. I don't care if it's using a po file in each
package, concatening them and compiling them on fly, or if you have a
package installing all translation, or if you ask to Martians to send you
this file, that's another point (which also need to be solved, but not with
this patch). 

Then, four outputs are modified:
 1) dpkg --list
 2) the short description of the package in the list of dselect
 3) the long description, at the bottom of the dselect windows, when on a
 4) dpkg --status and --print-avail

For 1-3, the data flow in dpkg and dsellect is classically:
  a) information on disk
  b) information in memory
  c) filtering, sellecting information
  d) preprocessing (wrapping, or isolating short desc)
  e) output to screen

My patch only insert a call to dgettext between (c) and (d). It is a call to
Dgettext and not the regular gettext because the text to translate does not
take place in the dpkg textual domain. So, we've made a new textual domain
called dpkg-desc, and use the function dgettext to allow searching in this
new domain.

I did not put the call between (a) and (b) because these informations get
written on disk sometime. This would be bad to write some translated
information in place of the english one because the different admins of the
machine may speak different languages.

I did not put the call between (b) and (c) to reduce the number of call to
gettext. Even if all gettext translations are cached, the time taken is not
null. And also, this would need to change the filtering process.

I did not put the call between (d) and (e) because gettext want the original
in the exaclty same way than it's in the mo file (so, no wrapping)

See the attached patch for more details on exactly were I've put the
dgettext call.

For the fourth one (dpkg -l and -p), that's a bit more tricky, because dpkg
use the same buffer mecanism when writting db to disk and when outputing
results for end-user. There is a table called fieldinfo (in lib/parce.c)
which explain (among other infos) which function is used to append the field
to the buffer. I wanted the description (and possibly any other field which
need it) being translated when output goes to screen, and not when output
goes to a status file. 

In summary, a field may get translated, depending on two things:
  - the field itself, nobody wants to translate the Conffiles
  - the stream (yes if to screen, no if to status file)
My solution is to add a flag to the stream (represented by the buffer) to
say if this stream have to be translated or not. That is to say, in varbuf.c
I add a 'l10n' field to the varbuf type definition. I also add a function
varbufdol10n, which code is simply:

void varbufdol10n(struct varbuf *v) {
  v->l10n= 1;
Then, for all fields which are to be translated sometimes, I change the
table in lib/parse.c to say that they use the function w_i18n_charfield for
output. For now, I done this only for description because I don't think of
other fields needing that. This function does pretty much the same than the
w_charfield one (which is used so far for description), but it checks if the
stream have to be translated. Iff yes, it calls dgettext on the content of
the field.

Here is the code of the regular w_charfield function:
void w_charfield(struct varbuf *vb,
                 const struct pkginfo *pigp, const struct pkginfoperfile *pifp,
                 const struct fieldinfo *fip) {
  const char *value= pifp->valid ? PKGPFIELD(pifp,fip->integer,const char*) : NULL;
  if (!value || !*value) return;
  varbufaddstr(vb,fip->name); varbufaddstr(vb, ": ");

Here is the code of the new w_i18n_charfield function:
void w_i18n_charfield(struct varbuf *vb,
                 const struct pkginfo *pigp, const struct pkginfoperfile *pifp,
                 const struct fieldinfo *fip) {
  const char *value= pifp->valid ? PKGPFIELD(pifp,fip->integer,const char*) : NULL;
  if (!value || !*value) return;
  varbufaddstr(vb,fip->name); varbufaddstr(vb, ": "); 
  varbufaddstr(vb, vb->l10n ? GETTEXT_DESC(value) : value);  

As you can see, only the line before the last is changed. (GETTEXT_DESC is a
macro calling dgettext when NLS is enabled, and expanding to the text when
not) It does what I explained before (IMHO, of course).

Finally, I add a call to varbufdol10n after initialisation and before
filling of buffer destined to the screen, and I don't change anything for
the ones going in status database files.

The best proof this solution is general is that I designed it for the
--status option, forgeting the --print-avail one, and I just checked, the
second one works. And, no, no status file gets polluted with translations
(also checked).

This patch do work. I use it on a dailly basis for information retrieval. I
even updated my machine one time with it (even if I were anxious concerning
this cvs version of dpkg, anything went well).

So, now, Wichert, Adam, and all other dpkg programmer: I do not claim this
patch is the best possible. I am willing to redo this patch to conform to
the dpkg way of life if I missed something. 

But please, tell me what's wrong with this patch, so that I can correct it.

Thanks for reading 'till the end, Mt.
Index: dselect/pkginfo.cc
RCS file: /cvs/dpkg/dpkg/dselect/pkginfo.cc,v
retrieving revision 1.6
diff -u -r1.6 pkginfo.cc
--- dselect/pkginfo.cc	2001/07/16 13:40:22	1.6
+++ dselect/pkginfo.cc	2001/08/28 14:25:49
@@ -106,7 +106,7 @@
     whatinfovb(_("description of "));
-    const char *m= table[cursorline]->pkg->available.description;
+    const char *m= GETTEXT_DESC(table[cursorline]->pkg->available.description);
     if (!m || !*m) m= _("no description available.");
     const char *p= strchr(m,'\n');
     int l= p ? (int)(p-m) : strlen(m);
@@ -132,6 +132,7 @@
     whatinfovb(_("installed control info for "));
     varbuf vb;
+    varbufdol10n(&vb);
     if (debug)
@@ -148,6 +149,7 @@
     whatinfovb(_("available version of control info for "));
     varbuf vb;
+    varbufdol10n(&vb);
     if (debug)
Index: dselect/pkglist.cc
RCS file: /cvs/dpkg/dpkg/dselect/pkglist.cc,v
retrieving revision 1.13
diff -u -r1.13 pkglist.cc
--- dselect/pkglist.cc	2001/07/16 15:29:03	1.13
+++ dselect/pkglist.cc	2001/08/28 14:25:49
@@ -533,7 +533,7 @@
     return 1;
   if (searchdescr) {
-    const char* descr = table[index]->pkg->available.description;
+    const char* descr = GETTEXT_DESC(table[index]->pkg->available.description);
     if (!descr || !*descr) return 0;
     if (regexec(&searchfsm, descr, 0, NULL, 0)==0)
Index: dselect/pkgtop.cc
RCS file: /cvs/dpkg/dpkg/dselect/pkgtop.cc,v
retrieving revision 1.12
diff -u -r1.12 pkgtop.cc
--- dselect/pkgtop.cc	2001/07/16 13:40:22	1.12
+++ dselect/pkgtop.cc	2001/08/28 14:25:50
@@ -219,7 +219,7 @@
     i= description_width;
-    p= info->description ? info->description : "";
+    p= (info->description ? GETTEXT_DESC(info->description) : "");
     while (i>0 && *p && *p != '\n') { waddnstr(listpad,p,1); i--; p++; }
   } else {
Index: include/dpkg-db.h
RCS file: /cvs/dpkg/dpkg/include/dpkg-db.h,v
retrieving revision 1.21
diff -u -r1.21 dpkg-db.h
--- include/dpkg-db.h	2001/04/23 08:59:02	1.21
+++ include/dpkg-db.h	2001/08/28 14:25:50
@@ -236,6 +236,7 @@
 int varbufprintf(struct varbuf *v, const char *fmt, ...) PRINTFFORMAT(2,3);
 int varbufvprintf(struct varbuf *v, const char *fmt, va_list va);
 void varbufinit(struct varbuf *v);
+void varbufdol10n(struct varbuf *v);
 void varbufreset(struct varbuf *v);
 void varbufextend(struct varbuf *v);
 void varbuffree(struct varbuf *v);
@@ -258,6 +259,7 @@
 struct varbuf {
   size_t used, size;
+  int l10n; /* true if we should translate this varbuf */
   char *buf;
 #ifdef __cplusplus
Index: include/dpkg.h.in
RCS file: /cvs/dpkg/dpkg/include/dpkg.h.in,v
retrieving revision 1.28
diff -u -r1.28 dpkg.h.in
--- include/dpkg.h.in	2001/06/08 22:51:26	1.28
+++ include/dpkg.h.in	2001/08/28 14:25:51
@@ -149,6 +149,7 @@
 # include <libintl.h>
 # define _(Text) gettext (Text)
 # define N_(Text) Text
+# define GETTEXT_DESC(Text) dgettext("dpkg-desc",Text)
 # undef bindtextdomain
 # define bindtextdomain(Domain, Directory) /* empty */
@@ -157,6 +158,7 @@
 # define _(Text) Text
 # define N_(Text) Text
 # define gettext(Text) Text
+# define GETTEXT_DESC(Text) Text
 extern const char thisname[]; /* defined separately in each program */
Index: include/parsedump.h
RCS file: /cvs/dpkg/dpkg/include/parsedump.h,v
retrieving revision 1.1
diff -u -r1.1 parsedump.h
--- include/parsedump.h	2001/07/27 01:55:52	1.1
+++ include/parsedump.h	2001/08/28 14:25:51
@@ -53,7 +53,7 @@
                             const struct pkginfoperfile*, const struct fieldinfo*);
 fwritefunction w_name, w_charfield, w_priority, w_section, w_status, w_configversion;
 fwritefunction w_version, w_null, w_booleandefno, w_dependency, w_conffiles;
-fwritefunction w_filecharf;
+fwritefunction w_filecharf, w_i18n_charfield;
 struct fieldinfo {
   const char *name;
Index: lib/dbmodify.c
RCS file: /cvs/dpkg/dpkg/lib/dbmodify.c,v
retrieving revision 1.12
diff -u -r1.12 dbmodify.c
--- lib/dbmodify.c	2001/06/08 22:51:26	1.12
+++ lib/dbmodify.c	2001/08/28 14:25:52
@@ -179,7 +179,7 @@
   if (cstatus >= msdbrw_write) {
-    uvb.used= 0;
+    uvb.used= uvb.l10n= 0;
     uvb.size= 10240;
     uvb.buf= m_malloc(uvb.size);
Index: lib/dump.c
RCS file: /cvs/dpkg/dpkg/lib/dump.c,v
retrieving revision 1.6
diff -u -r1.6 dump.c
--- lib/dump.c	2001/05/07 21:06:16	1.6
+++ lib/dump.c	2001/08/28 14:25:52
@@ -88,6 +88,16 @@
+void w_i18n_charfield(struct varbuf *vb,
+                 const struct pkginfo *pigp, const struct pkginfoperfile *pifp,
+                 const struct fieldinfo *fip) {
+  const char *value= pifp->valid ? PKGPFIELD(pifp,fip->integer,const char*) : NULL;
+  if (!value || !*value) return;
+  varbufaddstr(vb,fip->name); varbufaddstr(vb, ": "); 
+  varbufaddstr(vb, vb->l10n ? GETTEXT_DESC(value) : value);  
+  varbufaddc(vb,'\n');
 void w_filecharf(struct varbuf *vb,
                  const struct pkginfo *pigp, const struct pkginfoperfile *pifp,
                  const struct fieldinfo *fip) {
@@ -208,7 +218,7 @@
   if (pifp->valid) {
     for (afp= pifp->arbs; afp; afp= afp->next) {
       varbufaddstr(vb,afp->name); varbufaddstr(vb,": ");
-      varbufaddstr(vb,afp->value); varbufaddc(vb,'\n');
+      varbufaddstr(vb,vb->l10n ? GETTEXT_DESC(afp->value) : afp->value); varbufaddc(vb,'\n');
@@ -218,6 +228,7 @@
   struct varbuf vb;
+  varbufdol10n(&vb);
   if (fputs(vb.buf,file) < 0)
Index: lib/parse.c
RCS file: /cvs/dpkg/dpkg/lib/parse.c,v
retrieving revision 1.20
diff -u -r1.20 parse.c
--- lib/parse.c	2001/05/02 04:08:31	1.20
+++ lib/parse.c	2001/08/28 14:25:52
@@ -68,7 +68,7 @@
   { "Size",             f_filecharf,       w_filecharf,      FILEFOFF(size)           },
   { "MD5sum",           f_filecharf,       w_filecharf,      FILEFOFF(md5sum)         },
   { "MSDOS-Filename",   f_filecharf,       w_filecharf,      FILEFOFF(msdosname)      },
-  { "Description",      f_charfield,       w_charfield,      PKGIFPOFF(description)   },
+  { "Description",      f_charfield,       w_i18n_charfield, PKGIFPOFF(description)   },
   /* Note that aliases are added to the nicknames table in parsehelp.c. */
   {  NULL   /* sentinel - tells code that list is ended */                               }
Index: lib/varbuf.c
RCS file: /cvs/dpkg/dpkg/lib/varbuf.c,v
retrieving revision 1.11
diff -u -r1.11 varbuf.c
--- lib/varbuf.c	2001/05/01 05:35:00	1.11
+++ lib/varbuf.c	2001/08/28 14:25:53
@@ -88,12 +88,16 @@
 void varbufinit(struct varbuf *v) {
   /* NB: dbmodify.c does its own init to get a big buffer */
-  v->size= v->used= 0;
+  v->size= v->used=v->l10n= 0;
   v->buf= NULL;
 void varbufreset(struct varbuf *v) {
   v->used= 0;
+void varbufdol10n(struct varbuf *v) {
+  v->l10n= 1;
 void varbufextend(struct varbuf *v) {
Index: main/query.c
RCS file: /cvs/dpkg/dpkg/main/query.c,v
retrieving revision 1.8
diff -u -r1.8 query.c
--- main/query.c	2001/08/01 13:41:17	1.8
+++ main/query.c	2001/08/28 14:25:53
@@ -92,7 +92,7 @@
   const char *pdesc, *p;
   int l;
-  pdesc= pkg->installed.valid ? pkg->installed.description : 0;
+  pdesc= (pkg->installed.valid ? GETTEXT_DESC(pkg->installed.description) : 0);
   if (!pdesc) pdesc= _("(no description available)");
   p= strchr(pdesc,'\n');
   if (!p) p= pdesc+strlen(pdesc);

Reply to: