
Bug#67637: robots.txt on www.debian.org



Hi,

Nicolas Lichtmaier wrote:
> > > Current /robots.txt prohibits indexing of many resources that should
> > > be indexed.
> > > 
> > > /Packages/ and /Lists-Archives/ are completely out of place here
> > > (perpetual URLs pointing to useful, indexable content).
> > 
> > Probably because search engines would overload www.debian.org otherwise.
>
> This can be easily checked. Is there any log analysis that has shown
> this?

A few minutes ago master suffered a DoS of sorts (the load was >80 and you
couldn't do anything) caused by googlebot, which was crawling all the bug
reports and related pages because the robots.txt file was missing on
klecker (it was forgotten during the move).

I've put the file back on klecker and removed the obsolete entries (i.e.
entries for files that don't exist), but I'm definitely leaving Bugs/ and
Packages/ in there so that this kind of thing doesn't happen again.
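
For illustration, here is a minimal sketch of how such rules are applied by
a well-behaved crawler, using Python's standard urllib.robotparser. The rule
text is an assumption modelled on the Bugs/ and Packages/ entries mentioned
above, not a copy of the actual file on klecker, and the checked paths are
only examples:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file on www.debian.org may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Bugs/
Disallow: /Packages/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Example paths only: two under the disallowed prefixes, one outside them.
for path in ("/Bugs/", "/Packages/", "/index.html"):
    allowed = parser.can_fetch("Googlebot", "http://www.debian.org" + path)
    print(path, "allowed" if allowed else "blocked")
```

Note that this only shows what a compliant robot would conclude from the
file; robots.txt does not enforce anything on the server side.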

-- 
Digital Electronic Being Intended for Assassination and Nullification


