
Re: Bug#458939: allow search engines to index http://bugs.debian.org

On Wed, Jan 09, 2008 at 05:58:34PM +1000, Anthony Towns wrote:
> Getting smarturl.cgi properly done is still probably the real solution.

Okay, so I've made smarturl.cgi work again; it was broken by:

   - Debbugs::CGI not accepting params from ARGV (smarturl.cgi changed
     to set QUERY_STRING)

   - Debbugs::CGI, pkgreport.cgi and version.cgi assuming the CGIs are in
     the current HTTP path (added "/cgi-bin/")

I've made those changes on rietz directly; what's the procedure
for committing them? "sudo -u debbugs -H bzr commit"? There was also a
pre-existing change in pkgreport.cgi (adding a "^" to the "Go away"
regexp) that hadn't been committed, fwiw.

I think the best long-term solution for URL naming is as follows:

   bugs.debian.org/123456          (bug report)
   bugs.debian.org/123456/mbox     (full bug mbox format)
   bugs.debian.org/123456/10       (individual message)
   bugs.debian.org/123456/10/mbox  (individual message mbox format)
   bugs.debian.org/123456/10/att/3 (attachment to a message)

   bugs.debian.org/source/dpkg     (bugs against dpkg in unstable)

   bugs.debian.org/source/dpkg/1.14.14   (bugs against dpkg 1.14.14)
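
As a rough illustration of how the scheme above could be dispatched, here is a minimal Python sketch. It is not taken from smarturl.cgi (which is Perl); the route table, the view labels and the resolve() helper are all hypothetical names chosen for the example.

```python
import re

# Hypothetical route table for the proposed URL scheme. The patterns and
# view labels are illustrative assumptions, not smarturl.cgi's actual code.
ROUTES = [
    (re.compile(r"^/(?P<bug>\d+)$"), "bug_report"),
    (re.compile(r"^/(?P<bug>\d+)/mbox$"), "bug_mbox"),
    (re.compile(r"^/(?P<bug>\d+)/(?P<msg>\d+)$"), "message"),
    (re.compile(r"^/(?P<bug>\d+)/(?P<msg>\d+)/mbox$"), "message_mbox"),
    (re.compile(r"^/(?P<bug>\d+)/(?P<msg>\d+)/att/(?P<att>\d+)$"), "attachment"),
    (re.compile(r"^/source/(?P<src>[^/]+)$"), "source_bugs"),
    (re.compile(r"^/source/(?P<src>[^/]+)/(?P<version>[^/]+)$"),
     "source_version_bugs"),
]


def resolve(path):
    """Return (view, params) for a smart URL path, or None if nothing matches."""
    # Smart URLs take no query parameters, so anything with "?" is rejected
    # here and left to the /cgi-bin/ scripts.
    if "?" in path:
        return None
    for pattern, view in ROUTES:
        match = pattern.match(path)
        if match:
            return view, match.groupdict()
    return None
```

With this table, resolve("/123456/10/att/3") yields the "attachment" view with bug, msg and att captured, while a path carrying a query string resolves to nothing.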



These should all accept settings like boring=yes, reverse=yes,
repeatmerged=no from cookies, but _shouldn't_ accept any parameters on
the URL. That is, these should be the default views everyone gets and
per-user configuration should be done with cookies.
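
To make the cookie idea concrete, a minimal Python sketch of reading those settings from a Cookie header follows. The setting names come from the paragraph above; the default values and the view_settings() helper are assumptions made for the example, not debbugs behaviour.

```python
from http.cookies import SimpleCookie

# Setting names are from the proposal; these default values are assumed
# for illustration only.
DEFAULTS = {"boring": "no", "reverse": "no", "repeatmerged": "yes"}


def view_settings(cookie_header):
    """Merge per-user settings from a Cookie header over the defaults."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    settings = dict(DEFAULTS)
    for name in DEFAULTS:
        # Only known setting names are honoured; other cookies are ignored.
        if name in cookie:
            settings[name] = cookie[name].value
    return settings
```

A request with no cookies gets the default view, which is the same page every other visitor (and every search engine) sees.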

Only when you want to look at a customised version of a particular
page (like "show me this bugreport reversed") or more complicated
queries ("show me bugs with these three tags set") should you hit
/cgi-bin/pkgreport.cgi URLs.

As such, internal links from bug pages back to package pages and so on
should simply use the smarturl URLs above, and not worry about all the
parameter parsing.

At that point, we should make smarturl.cgi active, and only prevent bots
from indexing /cgi-bin afaics.
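
One way to sanity-check such a policy is with Python's standard robots.txt parser. The robots.txt content below is an assumed example of "only block /cgi-bin", not the file actually deployed on bugs.debian.org, and may_index() is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt implementing "only keep bots out of /cgi-bin".
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def may_index(url):
    """Return True if a generic crawler may fetch url under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("*", url)
```

Under this policy the smart URLs stay indexable while every /cgi-bin/ query page is excluded.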

Does that sound reasonable?

