
Please help with unicode problem on tasks page generation



Hi,

I committed to

  http://svn.debian.org/wsvn/cdd/cdd/trunk/webtools/?rev=0&sc=0

code that generates i18n tasks pages using DDTP descriptions - which
means it WOULD do so if I did not have to struggle with a really stupid
UTF-8 problem.  If you call

   tasks.py 2> tasks.err

you get all the problematic descriptions (140 kByte), which all seem to be
texts containing non-ASCII characters.  The problem even occurs in
untranslated descriptions - just grep for "(lang='en')" in the output above.
The strange thing is that genshi.Markup tries unicode(string), which fails -
but it should not, because all the descriptions are properly formatted
UTF-8 strings.
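
A possible cause, not verified against tasks.py: in Python 2,
unicode(string) without an explicit encoding decodes with the default
ASCII codec, so a byte string containing non-ASCII characters raises
UnicodeDecodeError even when it is valid UTF-8.  If that is what happens
here, decoding the descriptions explicitly before handing them to
genshi.Markup might avoid it.  The sketch below only illustrates the
effect, in Python 3 syntax; to_text is a hypothetical helper and the
sample string is made up.

```python
# Sketch of the suspected failure mode.
# to_text() is a hypothetical helper, not part of tasks.py or Genshi.

def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings explicitly."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# A made-up description, encoded as UTF-8, containing a non-ASCII character.
raw = "Werkzeuge für Mediziner".encode("utf-8")

# Decoding with the ASCII codec (what Python 2's unicode(raw) does by
# default) fails on the first non-ASCII byte.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding explicitly as UTF-8 succeeds.
text = to_text(raw)
```

Strings that are already text pass through to_text unchanged, so the
helper could be applied to every description regardless of its origin.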

I verified that the code would work otherwise because some short
descriptions in the German translation came through.  So if you could
help me over this hurdle we would come quite close to translated
tasks pages.

What else needs to be done:

  1. Translation of some fixed texts.
     I'm not convinced that I found the most clever way to inject translated
     strings, which is to set
         data['stringname'] = _('String to translate')
     in the script and use
         ${stringname}
     in the template.  There must be a cleverer way, but for the moment
     it works.  Putting the strings into the template directly would leave
     the problem of telling genshi which language to translate to, so
     unless you have a better idea we can handle all strings this way and
     get rid of all the fixed strings in the template.  It is probably
     not worth the effort to think about whether this is elegant or not.

  2. Joining the scripts tasks_idx.py and tasks.py to have only a single
     script generating index and single tasks pages.

  3. Also obtain DDTP translations of the meta package descriptions to
     render them in the left column of the tasks page and on the index page.

  4. Take over all the surrounding stuff for other CDDs as it is in the
     current update_tasks script.  I am thinking about putting the
     CDD-specific stuff into a config file to make it easy to add further
     CDDs, but I have not yet decided on the format of the config file.
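
For item 1, the per-language injection of fixed strings could be
sketched roughly as follows.  This is only an illustration using the
standard gettext module: the domain name, the locale directory and the
function translated_strings are assumptions, not names taken from
tasks.py.

```python
import gettext

# Assumed placeholders: neither name comes from the webtools code.
DOMAIN = "cdd-webtools"
LOCALEDIR = "locale"

def translated_strings(lang):
    """Build the dict of fixed template strings for one language."""
    # fallback=True yields NullTranslations when no catalogue is found,
    # so untranslated strings pass through unchanged.
    trans = gettext.translation(DOMAIN, localedir=LOCALEDIR,
                                languages=[lang], fallback=True)
    _ = trans.gettext
    data = {}
    data['lang'] = lang
    data['stringname'] = _('String to translate')
    return data

# One dict per language, to be handed to the template as before.
data = translated_strings('de')
```

The template keeps using ${stringname}; only the dict changes from one
language to the next, so genshi itself never needs to know the language.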

Kind regards and I would be very happy if someone could tackle this unicode
problem

      Andreas.

--
http://fam-tille.de

