On Wed, Apr 24, 2013 at 10:28:07PM +0530, Pankaj Kumar Sharma wrote:
> In the present system the content is loaded explicitly via cron. The
> confusion that bounds me is that what should be methodology that we should
> use in the upcoming Django project ? Should the data be loaded at the time
> when some one asks for that or it should be present in the databases ?

The short answer is: we'll have to experiment with that :)

Some of the information that the PTS exposes (e.g. that related to the
status of the archive) is "fairly static", meaning that it changes at
most 4 times a day. Other information is "very dynamic" (e.g. bug
information) and ideally should really be live, as it could be really
confusing for a user to see that, say, a package has 1 RC bug, click on
the bugs link, and discover that that's not true. It's not true
*anymore*, but a random user would have no way of understanding that and
would think it's a bug. This kind of incoherence has been an endless
source of (bogus) bug reports throughout the life of the PTS.

A separate question is how to make all this efficient, in terms of
caching. Obviously, the current solution with static HTML pages is very
fast and is also easy to mirror in case of need. A purely dynamic
solution would be at the opposite end of the spectrum in terms of
performance. We will probably need to stay somewhere in the middle, and
benchmark the scalability of the new solution (as mentioned in the
project description).

Ideally, we should cache heavily, either by using Django caching, or by
producing actual HTML pages via Django templates (as mentioned by Paul
in this thread), and add on top of that heavy cache invalidation
mechanisms for live information, like bugs. Alternatively, we might want
to cache only the information that is seldom updated and be entirely
dynamic on the live information.

Regarding where the data comes from, my dream would be to develop a
Python abstraction layer over all the data that the PTS uses.
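
To make the idea a bit more concrete, here is a minimal sketch of what
such a layer could look like. Everything in it is hypothetical: the
names PackageDataSource, LiveSource, CachedSource and rc_bug_count are
made up for illustration, and the "live" backend just reads from a dict
instead of querying a real service. It is only meant to show one
interface with interchangeable backends, plus explicit invalidation for
live data.

```python
import time
from abc import ABC, abstractmethod


class PackageDataSource(ABC):
    """Hypothetical interface over the data the PTS needs."""

    @abstractmethod
    def rc_bug_count(self, package):
        """Return the number of RC bugs for the given package."""


class LiveSource(PackageDataSource):
    """Stand-in for an entirely live backend.

    A real backend would query an external service; here a plain dict
    simulates the upstream data (assumption made for illustration).
    """

    def __init__(self, bugs):
        self._bugs = bugs

    def rc_bug_count(self, package):
        return self._bugs.get(package, 0)


class CachedSource(PackageDataSource):
    """Wraps another backend and keeps answers for ttl seconds."""

    def __init__(self, backend, ttl, clock=time.monotonic):
        self._backend = backend
        self._ttl = ttl
        self._clock = clock
        self._cache = {}  # package -> (timestamp, value)

    def rc_bug_count(self, package):
        now = self._clock()
        hit = self._cache.get(package)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = self._backend.rc_bug_count(package)
        self._cache[package] = (now, value)
        return value

    def invalidate(self, package):
        """Drop the cached entry, e.g. when a bug changes state."""
        self._cache.pop(package, None)
```

Code using the layer would only ever talk to PackageDataSource, so
swapping LiveSource for CachedSource(LiveSource(...), ttl=...) (or any
other backend) would not require touching the callers.
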
Such a layer would then have various implementations ("backends"). One
could, for instance, access UDD directly, another could access a local
cache updated by cron (as in the current PTS deployment), another could
be entirely live, and yet another could use mixed solutions. That would
make it easier to experiment with the different solutions.

I hope this explains that we don't yet have written-in-stone answers to
your question, and that finding the right trade-offs, via experiments,
will be part of the actual project.

Cheers.
-- 
Stefano Zacchiroli  . . . . . . .  zack@upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader . . @zack on identi.ca . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »