Fossil: Ticket Change Details

Overview

Artifact ID:	bd3cc20def4841629acf514fb00275889f98d632
Ticket:	03c9f4deef2d19f55945415cd1d048b4d730cf8c Disable built-in robots.txt file
User & Date:	anonymous 2010-01-15 17:21:45

Changes

comment changed to:

Per [http://thread.gmane.org/gmane.comp.version-control.fossil-scm.user/1489]:

<verbatim>
On Dec 2, 2009, at 10:52 AM, Daniel Clark wrote:

> For a while it's seemed odd to me that Google hadn't updated its
> indexes of a few sites, so I took a look and noticed the obvious:
> robots.txt was disallowing all crawlers:
>
> (from main.c)
>
>  /* Prevent robots from indexing this site.
>  */
>  if( strcmp(g.zPath, "robots.txt")==0 ){
>    cgi_set_content_type("text/plain");
>    @ User-agent: *
>    @ Disallow: /
>    cgi_reply();
>    exit(0);
>  }
>
> As far as I can tell there isn't a way to disable or tune this from  
> the fossil level; if there isn't interest in changing this I'm sure I
> can just redirect via apache to some actual file, but IMHO it would be  
> good to be able to easily make fossil projects searchable (perhaps even  
> have this be the default), esp. since at the moment the only reason that  
> say fossil-scm.org is searchable is because the robots.txt file happens
> to be at:
>
> http://fossil-scm.org/index.html/robots.txt
>
> (e.g. a random Apache configuration choice to have rather ugly URLs with
> "index.html" in all of them.)

Note that http://www.fossil-scm.org/ does not use Apache.  The
redirect occurs within fossil itself.

I suppose that since the existing "robots.txt" is essentially a no-op,  
we might as well remove it.
</verbatim>
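
For comparison with outright removal, the "tunable" behaviour requested in the thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the <code>RobotsPolicy</code> enum and the <code>robots_body()</code> helper are hypothetical names that do not exist in Fossil, and the sketch returns the response text rather than calling <code>cgi_set_content_type()</code> and <code>cgi_reply()</code> the way main.c does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical policy values, invented for this sketch (not part of Fossil). */
enum RobotsPolicy {
  ROBOTS_NONE,          /* no built-in robots.txt; request falls through */
  ROBOTS_DISALLOW_ALL,  /* the behaviour quoted from main.c above */
  ROBOTS_ALLOW_ALL      /* explicitly permit all crawlers */
};

/*
** Return the text of the robots.txt reply for the requested path and
** policy, or NULL if the request should be handled as an ordinary page.
** Hypothetical helper; the real code writes the reply directly.
*/
const char *robots_body(const char *zPath, enum RobotsPolicy policy){
  if( zPath==NULL || strcmp(zPath, "robots.txt")!=0 ){
    return NULL;  /* not a robots.txt request */
  }
  switch( policy ){
    case ROBOTS_DISALLOW_ALL:
      return "User-agent: *\nDisallow: /\n";
    case ROBOTS_ALLOW_ALL:
      /* An empty Disallow value permits crawling of everything. */
      return "User-agent: *\nDisallow:\n";
    default:
      return NULL;  /* ROBOTS_NONE: built-in file disabled */
  }
}
```

With <code>ROBOTS_NONE</code> the helper returns NULL for every path, which corresponds to the removal proposed at the end of the thread; the other two values would only matter if a per-repository setting were added to select them.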

foundin changed to: "20100112141013"
private_contact changed to: "ace2a438fb254e5d21ba2ed1295b2da8392d271c"
severity changed to: "Minor"
status changed to: "Open"
title changed to: "Disable built-in robots.txt file"
type changed to: "Code_Defect"