Re: [GSoC] Reg Blends Web Sentinel

To: debian-blends@lists.debian.org
Subject: Re: [GSoC] Reg Blends Web Sentinel
From: Andreas Tille <andreas@an3as.eu>
Date: Thu, 5 Mar 2015 22:25:35 +0100
Message-id: <20150305212535.GB11433@an3as.eu>
In-reply-to: <CAKW+_ezAgBfm8AtYgeL=oSbbv=CbqvtgzcFAyKd7hGTxtfF_2A@mail.gmail.com>
References: <CAKW+_exdZF0bdqPgD05MzdewB6egsqjQgH_dJ0nfxCz4AMC2FQ@mail.gmail.com> <20150302170921.GE5686@an3as.eu> <CAKW+_eyROSCRcHpMQBf0fFiQYmb8OfCxLbnM7YgVrDGRNPJCZg@mail.gmail.com> <CAKW+_ewxRcfMa10m-0OyB3i9j2eQfWsERScg7npEOSX0AZPyVw@mail.gmail.com> <20150305135528.GG22418@an3as.eu> <CAKW+_ezAgBfm8AtYgeL=oSbbv=CbqvtgzcFAyKd7hGTxtfF_2A@mail.gmail.com>

Hi Akshita,

On Thu, Mar 05, 2015 at 10:28:01PM +0530, Akshita Jha wrote:
> On Thu, Mar 5, 2015 at 7:25 PM, Andreas Tille <andreas@an3as.eu> wrote:
> 
> >     for kind, data, pos in stream:
> >   File "/usr/lib/python2.7/dist-packages/genshi/output.py", line 786, in
> > __call__
> >     text = escape(pop_text(), quotes=False)
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 188:
> > ordinal not in range(128)
> >
> > If there is a similarly simple solution this would be really great.
> >
> I have sent another patch which sends 'long_description' to to_unicode().

Ahh, you are perfectly correct.

> Now, './bugs_udd.py debian-games' runs without any error. Also, there are
> no errors for debian-junior and debian-med. I have a small question though
> - now we are just sending specific items to to_unicode function like:
> to_unicode(_name), to_unicode(data['idxsummary']), to_unicode(bug[k]),
> to_unicode(t['long_description']) etc, but in future can there be cases
> where we have, say 't['title']' with unicode characters ? If yes, then
> we'll need to send this also to to_unicode(). I am not very sure, but to
> solve such cases will it be a good idea to send a variable which contains
> all this data to to_unicode(), instead of sending every item individually ?
> I am not very familiar with the code base just yet, so can you please
> advise me on this ?

I cannot give any better excuse than that the code is that way for
historical reasons.  When I started there was less need for any
conversion, and the more data went in (due to more Blends and more
packages) the more conversions were needed.  I perfectly agree that it
would make way more sense to do a reliable conversion in one go.  On
the other hand I'm not fully sure whether we can assume that in the
current form all data share the same encoding.  Currently we obtain
data from UDD as well as from the tasks files themselves.  So there
might be differences and a common conversion might fail.  However,
the task of the GSoC project is to translate all data from the tasks
files into UDD tables first and then create the web pages from UDD as
the only data source.  Once this is done we should definitely implement
the consistent conversion you suggested above.
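
To illustrate what such a conversion in one go could look like, here is
a minimal sketch.  It is not taken from the Blends code: the name
to_text(), the list of candidate encodings and the sample dictionary are
assumptions made for this example, and it is written for Python 3,
whereas the scripts discussed here run under Python 2.7.  It walks a
nested structure and decodes every byte string it finds, trying a
second encoding if the first one fails.

```python
def to_text(value, encodings=("utf-8", "latin-1")):
    """Return a copy of value with every byte string decoded to text.

    Hypothetical helper for illustration only; the candidate encodings
    are an assumption, not something the Blends code prescribes.
    """
    if isinstance(value, bytes):
        for encoding in encodings:
            try:
                return value.decode(encoding)
            except UnicodeDecodeError:
                continue
        # Only reached if no candidate encoding accepted the bytes
        # (latin-1 accepts any byte sequence, so with the default
        # candidates this is never reached).
        return value.decode(encodings[0], errors="replace")
    if isinstance(value, dict):
        return {to_text(k, encodings): to_text(v, encodings)
                for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(to_text(item, encodings) for item in value)
    # Numbers, None and text that is already decoded pass through as-is.
    return value


# Made-up sample data mixing UTF-8 and Latin-1 byte strings.
task = {
    "title": b"Gr\xc3\xbc\xc3\x9fe",      # UTF-8 encoded
    "long_description": b"caf\xe9",       # Latin-1 encoded
    "bugs": [b"ok", 42],
}
converted = to_text(task)
```

Trying UTF-8 first and Latin-1 second is only one possible policy.  If
the sources really differ in encoding, guessing like this can silently
produce wrong characters, which is another reason to have UDD as the
single data source before relying on a common conversion.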

> > If we are really lucky this will also help in the next annoying
> > case:
> >
> >    tasks-py debian-med
> > Traceback (most recent call last):
> >   File "./tasks.py", line 181, in <module>
> >     print >> f, template.generate(**data).render('xhtml')
> > UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in
> > position 1274: ordinal not in range(128)
> >
> >
> When I try to run  './tasks.py debian-med', I get the following error:
> 
> svn: E000111: Unable to connect to a repository at URL 'svn://
> anonscm.debian.org/svn/blends/projects/med/trunk/debian-med/tasks'
> svn: E000111: Can't connect to host 'anonscm.debian.org': Connection refused

Please check whether you are able to run

 svn co svn://anonscm.debian.org/svn/blends/projects/med/trunk/debian-med/tasks /srv/blends.debian.org/data/med/tasks
 svn co svn://anonscm.debian.org/svn/blends/projects/med/trunk/debian-med/debian /srv/blends.debian.org/data/med/debian

which is what the code does.  If there is a problem here we should
look into it.  BTW, I intend to convert debian-med to Git as well, but
not before Jessie is released.  However, some other Blends might remain
in SVN, which is no real problem IMHO since this has usually worked and
there is no real need to deal with SVN other than this single checkout.

> Can you please help me with the error ? Is it some configuration problem of
> svn in my system or something else?

Thanks again for your very welcome contribution

     Andreas.

-- 
http://fam-tille.de
