Overview
Artifact ID: | bd3cc20def4841629acf514fb00275889f98d632 |
---|---|
Ticket: | 03c9f4deef2d19f55945415cd1d048b4d730cf8c Disable built-in robots.txt file |
User & Date: | anonymous 2010-01-15 17:21:45 |
Changes
- comment changed to:
Per [http://thread.gmane.org/gmane.comp.version-control.fossil-scm.user/1489]:

<verbatim>
On Dec 2, 2009, at 10:52 AM, Daniel Clark wrote:

> For a while it's seemed odd to me that google hadn't updated its
> indexes of a few sites, so I took a look and noticed the obvious -
> robots.txt was disallowing all crawlers:
>
> (from main.c)
>
>   /* Prevent robots from indexing this site.
>   */
>   if( strcmp(g.zPath, "robots.txt")==0 ){
>     cgi_set_content_type("text/plain");
>     @ User-agent: *
>     @ Disallow: /
>     cgi_reply();
>     exit(0);
>   }
>
> As far as I can tell there isn't a way to disable or tune this from
> the fossil level; if there isn't interest in changing this I'm sure I
> can just redirect via apache to some actual file, but IMHO it would be
> good to be able to easily make fossil projects searchable (perhaps even
> have this be the default), esp. since at the moment the only reason that
> say fossil-scm.org is searchable is because the robots.txt file happens
> to be at:
>
> http://fossil-scm.org/index.html/robots.txt
>
> (eg a random apache configuration choice to have rather ugly URLs with
> "index.html" in all of them.)

Note that http://www.fossil-scm.org/ does not use apache. The redirect
occurs within fossil itself.

I suppose that since the existing "robots.txt" is essentially a no-op,
we might as well remove it.
</verbatim>
- foundin changed to: "20100112141013"
- private_contact changed to: "ace2a438fb254e5d21ba2ed1295b2da8392d271c"
- severity changed to: "Minor"
- status changed to: "Open"
- title changed to: "Disable built-in robots.txt file"
- type changed to: "Code_Defect"
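
To illustrate the two observations quoted in the comment (the `Disallow: /` rule blocks every path, yet a robots.txt served at `/index.html/robots.txt` is never consulted), here is a minimal sketch in C. It is not Fossil code: `path_is_disallowed` and `is_crawler_robots_path` are hypothetical helpers, and the matching is plain prefix comparison with no wildcard or `Allow:` handling.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not part of Fossil): return true when zUrlPath is
** covered by a single "Disallow:" value, using plain prefix matching.
** An empty Disallow value blocks nothing.
*/
bool path_is_disallowed(const char *zDisallow, const char *zUrlPath){
  size_t n = strlen(zDisallow);
  if( n==0 ) return false;
  return strncmp(zUrlPath, zDisallow, n)==0;
}

/* Hypothetical helper: crawlers request robots.txt only at the root of
** the host, so a copy served under any other path is ignored.
*/
bool is_crawler_robots_path(const char *zUrlPath){
  return strcmp(zUrlPath, "/robots.txt")==0;
}
```

Under these assumptions, `path_is_disallowed("/", "/timeline")` is true because every URL path begins with `/`, while `is_crawler_robots_path("/index.html/robots.txt")` is false, which is consistent with the thread's remark that the built-in file was effectively a no-op on fossil-scm.org.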