Tuesday’s Tool: Robots.txt Generator
Yes, I know it’s Thursday. I thought of the idea yesterday, and Thursday’s Tool didn’t sound as good as Tuesday’s Tool.
Google announced that they are adding a robots.txt generator to their Webmaster Tools lineup.
While I do like that Google is raising awareness of certain elements of a website the average user may not know about (like robots.txt files and XML sitemaps), I am also wary about just how much Google KNOWS about me.
I am uneasy that I have to log in and register a site before I can create a file for it. A post for another day, however.
A robots.txt file is a little file that tells crawlers who can go where on your site (any site, not just a blog).
It can be as simple as allowing everyone in, or it can put up stop signs at certain pages or areas you don’t want followed and indexed.
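To make the "let everyone in" versus "stop sign" idea concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The two rule sets, the /private/ folder and the example.com URLs are made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them from a site."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# An empty Disallow line means nothing is blocked: everyone is let in.
ALLOW_EVERYONE = """\
User-agent: *
Disallow:
"""

# One stop sign: no crawler should enter the (hypothetical) /private/ folder.
BLOCK_PRIVATE = """\
User-agent: *
Disallow: /private/
"""

open_site = build_parser(ALLOW_EVERYONE)
gated_site = build_parser(BLOCK_PRIVATE)

# Ask whether a crawler calling itself "Googlebot" may fetch each URL.
print(open_site.can_fetch("Googlebot", "http://example.com/private/notes.html"))
print(gated_site.can_fetch("Googlebot", "http://example.com/private/notes.html"))
print(gated_site.can_fetch("Googlebot", "http://example.com/about/"))
```

The first and third checks should come back allowed and the second blocked, since only the second rule set puts a stop sign on /private/.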
Do I Need a Robots.txt File On My Server?
No. Your site will be crawled and indexed without one, and all the pages will be accessible. A robots.txt file lets you shape the way search engines crawl your site and BLOCK specific content.
Why Would You Want to Stop Search Engines From Crawling A Site?
The nature of a blog is to publish content. So when you create a post, it will be slotted into several different spots: the content will be sorted into archives and categories and by those zillions of little tags you have created. The result is duplicate content. Duplicate content is to be avoided: some of the content will be dropped from the index, dumped into the supplemental results, or steal page rank.
I will also block or disallow sections of my site: there is no need for Google to index my admin section, JavaScript or CSS. Those sections might be taking a piece of the page rank pie that you want to share.
Google’s Robots.txt Generator
You can create a robots.txt file in Google’s Webmaster Tools. It is actually a bit tricky to find: you need to log in and click on your verified site (and frankly, creating your own robots.txt file would be faster). It lets you decide basic allows/disallows. I find it just as easy and less intrusive to make your own robots.txt file. Basically, create it in Notepad and upload it to your WordPress root directory.
This is a good robots.txt to use in a WordPress blog; it is my generic robots.txt file that I tweak as necessary.
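For readers who want to see what a file along these lines might look like and check what it blocks, here is a rough sketch, again using Python's standard urllib.robotparser. The rules are an assumption for illustration, not the exact generic file described above: the /wp-admin/ and /wp-includes/ paths are typical WordPress folders, and the trackback and feed rules follow what readers mention in the comments. The is_allowed helper and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical WordPress-style rules (an illustrative guess, not the author's file).
WORDPRESS_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /trackback/
Disallow: /feed/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given rules let `agent` crawl `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Try a few made-up URLs against the rules before uploading the file.
for url in (
    "http://example.com/wp-admin/options.php",
    "http://example.com/feed/rss/",
    "http://example.com/2008/03/hello-world/",
):
    verdict = "allowed" if is_allowed(WORDPRESS_ROBOTS_TXT, url) else "blocked"
    print(url, "->", verdict)
```

"User-agent: *" means the rules apply to every crawler, and each Disallow line blocks any address that starts with that path. Running a check like this before uploading is a cheap way to make sure you have not blocked your actual posts by accident.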
I will say that to avoid duplicate content I do prefer to use dofollow and nofollow plugins.
Other Robots.txt Generators:
http://www.seochat.com/seo-tools/robots-generator/
http://www.mcanerin.com/EN/search-engine/robots-txt.asp
http://www.yellowpipe.com/yis/tools/robots.txt/
download http://www.bigfootwebmarketing.com/downloads/robot.zip

I just uploaded your suggested robots.txt file to my blog root folder…I’ll let my readers know how it works out as I check stats over the next couple of weeks.
I’m thinking it’ll be at least a couple of weeks before noticing results from this.
Caleb’s last blog post..My Top Secret Posts
Just curious, any chance you can explain what the different lines of code mean in your robots.txt sample? For example, User-agent: *
Is this something I want to change? And all the disallows…how is it advantageous to disallow trackbacks and /feed/rss?
Thanks for your help, great blog!
Dr. K’s last blog post..Why the need for human connection?