Bug#949173: packages.debian.org: robots.txt doesn't actually block anything
On Sat, 18 Jan 2020 01:14:02 +0000
Paul Wise <pabs@debian.org> wrote:
> On Fri, Jan 17, 2020 at 6:57 PM Adam D. Barratt wrote:
>
> > which is effectively the same as allowing everything. "Disallow: /"
> > might be more logical, unless there is a desire / requirement to allow
> > crawling and indexing of (parts of) the site.
>
> I expect we want to allow crawling the site, all of the pages are
> public and most of them are useful for search engines to index.
+1 This is something I've found very helpful and convenient. I could survive
without those pages being indexed by search engines, but I'd prefer not to.
Would it be helpful to disallow certain pages, such as */download?
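For what it's worth, a minimal sketch of what that might look like (the
/*/download pattern is my assumption about the URL layout; the `*` wildcard
in paths is supported by major crawlers per RFC 9309, though not by every
bot):

```
# Hypothetical robots.txt: allow crawling in general,
# but skip per-package download pages
User-agent: *
Disallow: /*/download
```

Everything not matched by a Disallow rule remains crawlable, so the rest of
the site would still be indexed.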
This is random, but I noticed that the "Tags" link on package pages links to
debtags.alioth.debian.org/edit.html, which no longer exists.