
Categorial Browsing / Categorization of Packages

In the spirit of "don't only suggest stuff, implement it",
I wrote a small Perl script to access the categorization data created
by Daniel Burrows for his (great) aptitude package manager frontend.
Actually, I've done a similar thing before: you might remember the web
pages at   http://people.debian.org/~erich/packagebrowser/
which are a static rendition of this data.
Unfortunately the data has become quite outdated, with over 5 thousand
packages not categorized at all.

The proper way to keep this data current would be for each package's
maintainer to add the relevant information to the control files, but
that would need to be added to policy first, etc.
To get this project going, I wrote two small Perl CGIs that allow
browsing and a little editing of the data, which is stored in a MySQL
database.


What you can do right now is:
- check if your packages are categorized
- check if the software you know well (use often) is categorized

What you need to do in order to categorize software is:
1. Browse to the destination category. Please don't use too generic
   categories right now. Feel free to add new categories if you know a
   few packages that fit in there (such as a webmin category under
   administration). I think it's best to add a new category for big
   software suites (i.e. those split into many packages, such as
   mozilla), add the major package to both the outer and the inner
   category, and add the components to the inner category only.
2. Click on the "Add a package to this group" link.
3. Type the package name and click OK. Typos will result in bogus data
   in my database; there is no spell checking yet.

What you can't do right now:
- edit packages except for adding them
- edit categories except for adding them

If you want to make many changes, you can either script this with wget
or send me a CSV file with the category and package names.
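
For the CSV route, here is a minimal Python sketch of how such a file
could be generated. The column order (category, then package), the "/"
separator in category paths and the category names themselves are
assumptions for illustration, not a format the CGIs define. The name
check only tests Debian's package name syntax (lowercase letters,
digits, "+", "-", "."; at least two characters); it cannot tell whether
a package actually exists.

```python
import csv
import io
import re

# Debian package name syntax: lowercase letters, digits, '+', '-', '.';
# at least two characters, starting with a letter or digit.
PACKAGE_NAME = re.compile(r"^[a-z0-9][a-z0-9+.-]+$")


def write_category_csv(pairs, out):
    """Write (category, package) pairs as CSV rows; return rejected pairs."""
    writer = csv.writer(out, lineterminator="\n")
    rejected = []
    for category, package in pairs:
        category, package = category.strip(), package.strip()
        if category and PACKAGE_NAME.match(package):
            writer.writerow([category, package])
        else:
            rejected.append((category, package))
    return rejected


# Hypothetical category paths, chosen only for this example.
pairs = [
    ("administration/webmin", "webmin"),
    ("web/mozilla", "mozilla"),
    ("web/mozilla", "Mozilla Mail"),  # malformed on purpose
]

buf = io.StringIO()
rejected = write_category_csv(pairs, buf)
print(buf.getvalue(), end="")
print("rejected:", rejected)
```

Rejecting malformed names before sending the file catches at least the
crudest typos, which matters because nothing is spell checked on the
server side.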

Please don't flood me with requests: I have major exams throughout the
next three weeks, so I shouldn't have written this CGI at all...
In fact it's less than 15 hours until one test in computer science.

If too much bullshit is entered into the web interface, if the server
load gets too high, or if it is abused in any other way, I guess I'll
just shut it down and consider it a failure.
As long as the data is usable (a cronjob is trying to do backups ;)
I will, of course, make the data available, either in aptitude file
format or as MySQL dumps.

I've sent some suggestions on this issue before; you might want to check
the mailing list archives:
(committee for category definition to limit the number of categories etc.)

My newer ideas were included in my woody+1 wishlist at
(not updated for a long time...)
I think my suggestion there is much more flexible (weighted tags such as
licence:free 1.0) but harder to implement, and I'm not sure it provides
much benefit.
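
To make the weighted tag idea a bit more concrete, here is a rough
Python sketch. Only the example "licence:free 1.0" comes from the
wishlist; reading it as "facet:value weight", the 0.5 threshold, the
function names and the package names are all assumptions made up for
illustration.

```python
def parse_weighted_tag(text):
    """Split 'facet:value weight' into (facet, value, weight)."""
    # The weight is assumed to be the last space-separated field.
    tag, _, weight = text.strip().rpartition(" ")
    facet, sep, value = tag.strip().partition(":")
    if not (sep and facet and value):
        raise ValueError(f"not a weighted tag: {text!r}")
    return facet, value, float(weight)


def packages_matching(tags_by_package, facet, value, min_weight=0.5):
    """Return packages carrying facet:value with at least min_weight."""
    matches = []
    for package, tags in tags_by_package.items():
        for tag in tags:
            f, v, w = parse_weighted_tag(tag)
            if (f, v) == (facet, value) and w >= min_weight:
                matches.append(package)
                break
    return sorted(matches)


# Hypothetical packages and weights, for illustration only.
tags_by_package = {
    "example-a": ["licence:free 1.0"],
    "example-b": ["licence:free 0.2"],
}

print(parse_weighted_tag("licence:free 1.0"))
print(packages_matching(tags_by_package, "licence", "free"))
```

Compared with a fixed category tree, a package can carry any number of
such tags, and the weight lets a browser rank matches instead of only
listing them. That is the extra flexibility, and also the extra
implementation work.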

Enjoy, and leave me alone for my exams ;)
I just did my Vordiplom (similar to a bachelor's) in mathematics
(subsidiary subject computer science) a few weeks ago; now I'm doing it
the other way round as well, computer science with subsidiary subject
mathematics. I'm not sure yet whether I will do my Diploma in maths or
in CS, or even both, nor whether I'll go for a PhD afterwards. Time will
tell. I guess I tend towards CS, while everybody tells me to stay with
mathematics... ;) I'll do an advanced seminar on Algebra (Galois theory
on inseparable extensions: higher derivations) as well as one on
Artificial Intelligence (statistical methods, Bayes and such) next term.
Well, exams first. Courses second. Debian later. Should be, at least.

Erich Schubert
      erich@(vitavonni.de|debian.org)    --    GPG Key ID: 4B3A135C     (o_
             There are only 10 types of people in the world:            //\
             Those who understand binary and those who don't            V_/_
   In our friends we seek what we lack. --- Thornton Wilder
