Re: Debian in Google's Index
On Wed, Mar 19, 2003 at 08:00:00PM +0100, Josip Rodin wrote:
> On Wed, Mar 19, 2003 at 10:20:56AM -0800, Gabriela Valdes wrote:
> > As you may know, Google's mission is to deliver the best search
> > experience on the Internet by making the world's information universally
> > accessible and useful. Currently, our Google.com site is within the top
> > 10 sites in all major markets worldwide with over 200 million searches
> > per day.
> >
> > We believe that www.debian.org is a great site and have discovered that
> > Google is currently blocked from crawling your site by the robot.txt
> > that is on your site. I believe we can drive a lot of traffic and
> > awareness to your organization and would like to find a mutually
> > beneficial way to work together.
>
> How's that? http://www.debian.org/robots.txt says only:
>
> User-agent: *
> Disallow: /security/
> Disallow:
What's the purpose of the empty Disallow?
> I'm not sure why we ban /security/, but otherwise it should be perfectly
> possible to crawl the remaining 773 MB of www.debian.org...
I wanted to look at the CVS log to find a possible reason but this file
is not managed by CVS. :-|
I really don't see why we keep that Disallow. It's not a dynamic site
with infinite recursion or anything like that. Sure, it changes often, but
that's not a big problem imho...
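
For what it's worth, here is a quick way to check how a standard parser
reads the three rules quoted above. This is only a sketch: it uses
urllib.robotparser from Python's standard library, which may not behave
exactly like Google's crawler, and the helper name is_allowed and the
sample path /intro/about are made up for illustration. By the usual
robots.txt convention an empty Disallow means "nothing is disallowed",
so I'd expect only /security/ to be refused.

```python
from urllib.robotparser import RobotFileParser

# The three rules quoted in this thread, copied verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /security/
Disallow:
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the quoted rules let user_agent fetch url.

    Hypothetical helper for illustration; it parses the rules from the
    string above instead of downloading the live robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # /intro/about is an arbitrary sample path, not taken from the thread.
    for path in ("/", "/intro/about", "/security/"):
        url = "http://www.debian.org" + path
        print(path, is_allowed("Googlebot", url))
```

If that matches what the crawler does, the file as quoted blocks only
/security/ and nothing else.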
Cheers,
--
Raphaël Hertzog -+- http://www.ouaza.com
Linux and free software training: http://www.logidee.com