Sitemap generator for search engines

5 May 2009

Sitemaps are useful if you want search engines to look in specific directories of your website. The standard robots.txt notation only covers exclusions (where not to look) and the crawl frequency.

For instance, a really basic robots.txt file looks like this:

User-agent: *
Crawl-delay: 3
Disallow: /cgi-bin/
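
If you want to check how a crawler might read those rules, here's a minimal sketch using Python's standard urllib.robotparser module. The bot name "ExampleBot" and the example.com URLs are made up for illustration, and real crawlers may interpret Crawl-delay differently or ignore it.

```python
from urllib.robotparser import RobotFileParser

# The same three rules as in the robots.txt file above.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 3
Disallow: /cgi-bin/
"""


def load_rules(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler name; it falls under "User-agent: *".
print("Crawl delay:", rules.crawl_delay("ExampleBot"))
print("cgi-bin allowed:", rules.can_fetch("ExampleBot", "http://example.com/cgi-bin/search"))
print("index allowed:", rules.can_fetch("ExampleBot", "http://example.com/index.html"))
```

Notice that the file can only say where not to go. Nothing in it tells the crawler which pages you actually want visited.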

As a general rule, I set Crawl-delay to 3 to keep crawlers from consuming all of the web server's bandwidth. Yahoo's crawlers tend to be the most aggressive on your site, while Google averages about 13 seconds per request. Anyway, a sitemap gives the crawler a better idea of where to look, rather than leaving it to discover pages on its own starting from the root of the site.
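
To make that concrete, here's a rough sketch of what a sitemap file contains, built by hand with Python's standard xml.etree.ElementTree module. The element names and namespace come from the sitemaps.org protocol. The example.com URLs, dates, and the build_sitemap helper are placeholders I made up, and this is not how Google's generator works internally.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for an iterable of (loc, lastmod, changefreq) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc                # full page URL
        ET.SubElement(url, "lastmod").text = lastmod        # YYYY-MM-DD
        ET.SubElement(url, "changefreq").text = changefreq  # e.g. daily, weekly
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages; swap in the directories you want crawled.
pages = [
    ("http://example.com/", "2009-05-05", "weekly"),
    ("http://example.com/articles/", "2009-05-01", "daily"),
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Save the output as sitemap.xml in your web root. You can also point crawlers at it by adding a "Sitemap:" line with the full URL to robots.txt.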

Here’s a great resource for generating sitemaps:
http://code.google.com/p/sitemap-generators/wiki/SitemapGenerators

About Google Sitemap Generator

Our new open-source Google Sitemap Generator finds new and modified URLs based on your webserver’s traffic, its log files, or the files found on the server. By combining these methods, Google Sitemap Generator can be very fast in finding these URLs and calculating relevant metadata, thereby making your Sitemap files as effective as possible. Once Google Sitemap Generator has collected the URLs, it can create the following Sitemap files for you:

2 Comments »

  • gyrolistic said:

    Hey Al, what are your thoughts about using Google’s sitemap.xml service/technique? https://www.google.com/webmasters/tools/docs/en/protocol.html

  • admin (author) said:

    I really like Google’s sitemap generator. I think it’s the way to go, considering Google is the primary source of traffic on my sites. Using the sitemap generator will allow spiders to do a more thorough job of going through your pages. What’s good for SEO is good for me.
