Re: Guidance required for GSoC project PTS rewrite in Django
The Debian archive presently only exports an apt archive and sends
events out as emails, aimed mainly at humans rather than machines. This
year there is a planned GSoC project to change this and make dak spit
out AMQP events using fedmsg, enabling Debian and other services to get
realtime updates about package uploads etc.
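To make that concrete, here is a rough sketch of what consuming such an
event could look like on the receiving side. The topic name, the payload
fields and the handle_upload helper are all assumptions made up for
illustration; they are not the actual dak or fedmsg message schema, and
the transport itself is left out.

```python
import json

# Hypothetical event as a consumer might receive it off the bus.
# The topic and field names are invented for this sketch.
RAW_EVENT = json.dumps({
    "topic": "org.debian.dak.upload",
    "msg": {"source": "hello", "version": "2.8-4", "suite": "unstable"},
})


def handle_upload(raw):
    """Decode one event and return (source, version, suite).

    Returns None if the event is not an upload notification.
    """
    event = json.loads(raw)
    if not event.get("topic", "").endswith(".upload"):
        return None
    msg = event["msg"]
    return msg["source"], msg["version"], msg["suite"]


if __name__ == "__main__":
    print(handle_upload(RAW_EVENT))
```

A service like the PTS would call something like handle_upload for each
incoming message and only refresh the pages of the affected package.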
Then there is UDD (Universal Debian Database), which is a project to
import all sources of data about Debian into an SQL database
(currently PostgreSQL). It has a lot of data, but the data in UDD
differs from what the PTS has (some extra, some missing).
The PTS was created before UDD. It pulls data in from a variety of
sources (update_incoming.sh), converts that to XML (excuses_to_xml.py,
other_to_xml.py, sources_to_xml.py) and converts that to HTML
(generate_html.sh) using XSLT. Some data is also pushed to it via
email (mainly the news).
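As an illustration of the middle stage of that pipeline, here is a
minimal sketch of turning one package record into XML. It is not the
code from sources_to_xml.py; the element names, the record fields and
the source_to_xml helper are assumptions for the example. The XSLT step
is omitted because the Python standard library has no XSLT processor.

```python
import xml.etree.ElementTree as ET


def source_to_xml(package):
    """Turn one package record (a plain dict) into an XML element."""
    root = ET.Element("source", name=package["name"])
    ET.SubElement(root, "version").text = package["version"]
    ET.SubElement(root, "maintainer").text = package["maintainer"]
    return root


if __name__ == "__main__":
    # Hypothetical record; the real PTS reads this from the archive data.
    record = {
        "name": "hello",
        "version": "2.8-4",
        "maintainer": "Jane Doe <jane@example.org>",
    }
    print(ET.tostring(source_to_xml(record), encoding="unicode"))
```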
I'm not sure how Stefano intended for you to change the PTS, and
I'm not the mentor for this project, but here are my thoughts...
Some things that would be desirable:
 * Realtime updates so that developers can get correct information. The
   current situation is suboptimal.
 * Static HTML since it is faster to load and means we can distribute the
   content to multiple hosts in case of downtime (not done yet).
My suggested approach is as follows. Most of these steps can be done
 1. Update UDD so that it contains all the data imported by the PTS.
 2. Work with Simon Chopin and the UDD folks to get UDD updated in
    realtime using AMQP messages output by dak.
 3. Work with the maintainers of other services to add AMQP message
    output.
 4. Rewrite the PTS XSLT templates as Django templates.
 5. Add a mechanism for producing static HTML from Django and update
    those static HTML files in realtime in response to AMQP and email
    messages.
 6. Work with DSA (Debian sysadmins) to get the static HTML distributed
    from static.debian.org. I thought I sent a mail about this but it
    didn't reach the list, so I've resent it:
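The static HTML step could be sketched roughly like this. It is only an
outline under stated assumptions: string.Template stands in for a Django
template so the example has no dependencies, and the page layout, file
naming and the write_static_page helper are invented for illustration.

```python
import html
import os
import tempfile
from string import Template

# Stand-in for a Django template; string.Template keeps this stdlib-only.
PAGE = Template(
    "<html><body><h1>$source</h1><p>Version: $version</p></body></html>\n"
)


def write_static_page(outdir, source, version):
    """Render one package page and replace the old file atomically."""
    page = PAGE.substitute(
        source=html.escape(source), version=html.escape(version)
    )
    final_path = os.path.join(outdir, source + ".html")
    # Write to a temp file in the same directory, then rename it over
    # the target, so readers never see a half-written page.
    fd, tmp_path = tempfile.mkstemp(dir=outdir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(page)
    # mkstemp creates the file owner-only; make it world-readable.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as outdir:
        path = write_static_page(outdir, "hello", "2.8-4")
        with open(path, encoding="utf-8") as handle:
            print(handle.read())
```

A message handler would call write_static_page for the affected package
whenever an event arrives. Because the rename is atomic, the output
directory can be served or mirrored at any time.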
Answers to your ideas:
The PTS is simply a static site with almost all data public; it
doesn't need logins/admins/etc. No need for passwords since we don't
have them currently. Passwords are also a less than ideal
authentication mechanism; if we need authentication at all we should
use something better like OpenPGP keys or client-side SSL