Date: Saturday, 31 March 2012 11:43 pm (UTC)
robots.txt is usually a very simple text file, which lists the parts of a site that search engines shouldn't go into. Google et al *should* honour it, mostly because it lists routes that would waste both parties' time and effort (e.g. the site's internal search results pages and "I don't have a page for that" pages). At one stage, the lack of one could cause the googlebot to get lost in an endless Borgesian library of generated pages.
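To make that concrete, here is a rough sketch of the kind of file I mean, checked with Python's standard-library parser. The paths `/search` and `/404` are made up for illustration; your site's routes will differ, and real crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the two paths are illustrative, not from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /404
"""

def allowed(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

print(allowed(ROBOTS_TXT, "/about"))         # an ordinary page
print(allowed(ROBOTS_TXT, "/search?q=foo"))  # an internal search results page
```

The first check should come back allowed and the second disallowed, which is the whole point: ordinary pages get indexed, generated ones get skipped.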

It shouldn't be tucked away anywhere; it should be at the root of the site. That's where it lives, if present. It isn't a way to hack a site in itself; it's just a text file.
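If it helps, this is a small sketch of how you'd work out where a crawler will look, given any page on the site. The `example.com` URL and the helper name `robots_url` are placeholders of mine.

```python
from urllib.parse import urljoin

def robots_url(page_url: str) -> str:
    """Return the robots.txt location for the site that serves `page_url`."""
    # The leading slash makes urljoin drop the page's own path
    # and resolve against the root of the site.
    return urljoin(page_url, "/robots.txt")

print(robots_url("https://example.com/blog/2012/03/some-post"))
```

However deep the page is, the answer is always the same file at the top level.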

Possibly the hits that you see are googlebots requesting robots.txt and finding nothing. On the last internet-facing site that I worked on, I noticed these requests quite soon, so I made a simple robots.txt. But then I had the luxury of erecting a "keep out" sign that didn't list any particular routes, just disallowed everything. You *probably* don't want that if search engines are how people find you. You could put down a simple one that allows everything to be indexed.
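For comparison, here are the two extremes as I understand them, again run through Python's standard-library parser as a sanity check. The constant names and the `can_crawl` helper are my own, and this is only a sketch, not a statement of how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# The "keep out" sign: every path is off limits to every crawler.
KEEP_OUT = "User-agent: *\nDisallow: /\n"

# The welcome mat: an empty Disallow line means nothing is off limits.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

def can_crawl(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

print(can_crawl(KEEP_OUT, "/"))           # blocked
print(can_crawl(ALLOW_ALL, "/any/page"))  # permitted
```

Either version is two lines of plain text saved as robots.txt, so it's cheap to try the allow-everything one and see whether the hits settle down.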

I'm not convinced that it gives much away in terms of things to look for - the fact that you're running WordPress gives much more away IMHO. It may be a red herring.