Breaking Down Google Sitemaps XML
Since Danny Sullivan already covered the overview of Google Sitemaps, I’m going to take some time to explain the Sitemap protocol.
What is the Google Sitemap Protocol?
The Google Sitemap Protocol allows you to tell Google which URLs on your web site are ready to be crawled. The Sitemap contains a list of URLs and some metadata about each URL, such as when it was last modified, how frequently its content changes, and the priority of the page relative to other pages.
The Google Sitemap is in an XML format using some very simple XML tags. So if you know how to alter HTML files, you will be fine with Google Sitemaps. XML is a bit more strict than HTML, so you will need to remember to encode all your data values (fix those &’s!).
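As a minimal sketch of that escaping step, assuming Python and its standard library (the example URL is made up for illustration), a raw ampersand can be turned into its XML entity before it goes into the file:

```python
from xml.sax.saxutils import escape

# Hypothetical URL with a raw ampersand in its query string.
raw_url = "http://www.example.com/view?cat=1&page=2"

# escape() always replaces &, < and >; the extra mapping also covers quotes.
safe_url = escape(raw_url, {'"': "&quot;", "'": "&apos;"})

loc_line = "<loc>" + safe_url + "</loc>"
print(loc_line)  # <loc>http://www.example.com/view?cat=1&amp;page=2</loc>
```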
What does a Google Sitemap look like?
A Google Sitemap uses 6 XML tags:
- changefreq
- lastmod
- loc
- priority
- url
- urlset
This is how part of my Google Sitemap looks:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url>
<loc>http://www.socialpatterns.com</loc>
<lastmod>2005-06-03T04:20:36Z</lastmod>
<changefreq>always</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.socialpatterns.com/new-post/</loc>
<lastmod>2005-06-02T20:20:36Z</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
</urlset>
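If you would rather generate a file like this than type it, here is one possible sketch using Python's standard xml.etree.ElementTree module. The function name build_sitemap and the tuple layout are assumptions made for this example, not part of Google's tools.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"

def build_sitemap(entries):
    """Return sitemap XML as bytes.

    entries is a list of (loc, lastmod, changefreq, priority) string tuples;
    this layout is an assumption made for the sketch.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in text when it serializes.
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = priority
    # xml_declaration=True needs Python 3.8 or newer.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

entries = [
    ("http://www.socialpatterns.com", "2005-06-03T04:20:36Z", "always", "1.0"),
    ("http://www.socialpatterns.com/new-post/", "2005-06-02T20:20:36Z", "daily", "0.8"),
]
print(build_sitemap(entries).decode("utf-8"))
```

The output comes out on a single line rather than one tag per line, which is still valid XML.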
What does that all mean?
I’ll explain each section so you can get a better idea of what each line means.
<?xml version="1.0" encoding="UTF-8"?>
This declares the file as XML and states that it is encoded as UTF-8.
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
This line opens the urlset element that wraps the whole file and declares the XML namespace of the Sitemap 0.84 schema, which the rest of the file uses. It is roughly the equivalent of the <html> tag in an HTML page.
<url>
This starts a URL entry. For each URL that you will include in your sitemap, you will need this tag and its closing tag.
<loc>http://www.socialpatterns.com</loc>
This tells Google what URL to crawl. One thing to note here: the URL must be less than 2049 characters long (that is, 2048 characters at most).
<lastmod>2005-06-03T04:20:36Z</lastmod>
This tells Google when this URL’s content was last modified. This helps Google determine how recent this URL is. The time needs to follow ISO 8601 format.
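For illustration, a timestamp in that shape can be produced from a Python datetime. This is only a sketch; the helper name to_lastmod is made up for the example.

```python
from datetime import datetime, timezone

def to_lastmod(dt):
    """Format a timezone-aware datetime as a UTC ISO 8601 timestamp."""
    # Convert to UTC first so the trailing "Z" is accurate.
    # Note: a naive datetime would be treated as local time by astimezone().
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

modified = datetime(2005, 6, 3, 4, 20, 36, tzinfo=timezone.utc)
print(to_lastmod(modified))  # 2005-06-03T04:20:36Z
```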
<changefreq>always</changefreq>
This tells Google how often this URL is updated. You can define this as “always”, “hourly”, “daily”, “weekly”, “monthly”, “yearly”, or “never”.
<priority>1.0</priority>
This shows Google how important this URL is compared to the rest of the URLs. This can range from 0.0-1.0. The higher the number, the more priority you are assigning.
</url>
This is the closing tag for one URL entry. If you have more URLs you want to include, you would repeat the above for as many URLs as you want.
</urlset>
This closes off your URL set and finishes up your Google Sitemap.
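Before submitting, it can help to run a quick sanity check over the finished file. The sketch below is a rough checker built on the rules described above, not an official validator; the function name check_sitemap and the wording of its messages are made up for the example.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.google.com/schemas/sitemap/0.84}"
# "never" is also accepted by the Sitemap protocol.
CHANGEFREQS = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}
MAX_LOC_LENGTH = 2048  # "less than 2049 characters"

def check_sitemap(xml_data):
    """Return a list of problems found in a sitemap (empty if none)."""
    problems = []
    root = ET.fromstring(xml_data)  # raises ParseError if not well-formed
    if root.tag != NS + "urlset":
        problems.append("root element is not <urlset> in the 0.84 namespace")
    for url in root.findall(NS + "url"):
        loc = (url.findtext(NS + "loc") or "").strip()
        if not loc:
            problems.append("<url> entry without a <loc>")
        elif len(loc) > MAX_LOC_LENGTH:
            problems.append("loc longer than 2048 characters: " + loc[:40])
        freq = url.findtext(NS + "changefreq")
        if freq is not None and freq.strip() not in CHANGEFREQS:
            problems.append(f"{loc}: unknown changefreq {freq!r}")
        prio = url.findtext(NS + "priority")
        if prio is not None:
            try:
                in_range = 0.0 <= float(prio) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"{loc}: priority {prio!r} is not in 0.0-1.0")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that Google will accept the file.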
Ok so how do I submit the darn thing?
After you are done creating your Google Sitemap, you will need to submit it. First, head over to the Google Sitemaps homepage and sign in (you’ll need a Google Account). After logging in, you will be taken to your Sitemaps screen.
Click on Add a Sitemap.
Enter in the URL to your Google Sitemap and click Submit URL. And that’s it!
Update: Added line about UTF-8 encoding. Also, if you need example code for one, head over to my Google Sitemaps with WordPress article.
Michael Nguyen | Search Engine Optimization, Technology | Comments (23)




June 5th, 2005 at 9:41 am
1- Does it have to be in “.php” format when I save the page?
2- Are there other formats to use?
Thanks
RH
June 5th, 2005 at 9:50 am
If you made a mistake and had to resubmit your sitemap, will Google still download your page again?
RH
June 5th, 2005 at 9:52 am
I used the “.htm” extension, and I deleted the HTML format and used the XML format instead.
June 9th, 2005 at 5:19 am
How can I make a webpage that actually shows the generated sitemap to my web visitors?
Is it possible to title and categorize the links on that webpage?
Thanks
June 9th, 2005 at 6:59 am
Richard:
No, the page does not need to be saved as .php. If you resubmit, Google will download your newly submitted page. From time to time, Google will also revisit your submitted page and download it again.
Eric:
Are you asking about Google Sitemaps? Or just a sitemap in general? Here’s a thread over at the Wordpress forums about generating a normal sitemap.
June 9th, 2005 at 5:59 pm
Hi. My web server doesn’t support Python, so I can’t use Google site map generator. Is there another generator you’d recommend? If not, I’ve read the text about creating a simple text file site map, but it says the files must use UTF-8 encoding. What does this mean? BTW, thanks heaps for posting such a helpful page.
June 9th, 2005 at 7:48 pm
Glenn:
You might just want to create your Google Sitemap by hand if your site doesn’t have too many pages.
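Start off your Google Sitemap file with this line:
<?xml version="1.0" encoding="UTF-8"?>
Then save the file as UTF-8 in your text editor. Plain ASCII text is already valid UTF-8, so if your URLs contain only ordinary English characters, you won’t need to worry about the encoding.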
June 26th, 2005 at 1:28 pm
Thanks for the postings. I created an HTML file and will remove the HTML tag and place it with .
My question: is this how I should code a URL ending with .html? Does it need to end with “/”? Thx.
<url>
<loc>http://www.aventura-realestate.info/Sell-Aventura-Real-Estate/page_668213.html</loc>
<lastmod>2005-06-02T20:20:36Z</lastmod>
<changefreq>monthly</changefreq>
<priority>0.1</priority>
</url>
June 26th, 2005 at 1:42 pm
To complete my question: Here is the URL for the file I created:
http://miamicondo.info/site-map-xml.htm
I cannot even see it in a browser. Am I on the right track? Thank you. Andrew
June 26th, 2005 at 5:07 pm
Sorry, after reading more, I renamed it to:
http://miamicondo.info/sitemap.xml
June 26th, 2005 at 10:44 pm
Andrew:
It looks like the first line in the file is incorrect. Check to see that there are quotes around your version number and encoding.
July 4th, 2005 at 1:52 am
Will Google punish a web site if it has a link to a Google sitemap on it?
I added a ’sitemap’ link to my sitemap.xml file from my homepage. This did wonders for other search engine rankings but my site disappeared from Google. Was this just coincidence?
Thanks, Chris
September 26th, 2005 at 11:45 am
[...] Well, I think everyone already knows what Google Sitemap is; if not, search Google or Technorati, since half the blogosphere is talking about it. Thanks to Michael of Social Patterns, I was able to make my own Google sitemap for WordPress. It is not very difficult, and here I leave you the code. To understand the code a little, read here. You upload sitemap.php to your server and then add it to your Sitemaps account, and that's it: wait for Google to visit you. [via Social Patterns] Regards, The Ghost [...]
November 20th, 2005 at 1:37 pm
Hi There
I submitted a Google Sitemap three days ago, but the Google results still have not refreshed; I mean my site is not indexed properly. What can I do? I think only my home page is indexed, because when I search Google for site:www.katarey.com I only get my home page listed.
December 12th, 2005 at 4:46 pm
[...] Breaking Down Google Sitemaps XML from Social Patterns [...]
January 8th, 2006 at 11:05 am
[...] You can refer back to my previous post to read more about the Google Sitemaps Protocol and how you can submit your Google Sitemap. [...]
February 21st, 2006 at 2:59 pm
[...] Breaking Down Google Sitemaps XML [...]
April 20th, 2006 at 8:40 pm
[...] taken from : http://www.socialpatterns.com [...]
April 29th, 2006 at 10:55 am
Virtual Hand-Holding: Aids in Making Sitemaps
In virtual hand-holding, it is an important tip to remember that people visit the site in order to look for some information. Internet surfers are an unforgiving lot. They only visit sites that are useful to them. With thousands of websites out there that offer the same features as your own site, how do you stand out? This is where sitemaps come in.
May 6th, 2006 at 9:45 pm
Is this a regular process, or do I have to do it over and over again?
May 26th, 2006 at 1:14 pm
Keep up the good work!
July 10th, 2006 at 10:45 pm
Hi, I need software to create my sitemap! Thanks.
February 12th, 2007 at 12:25 am
You can use free tools to generate Sitemap XML. This one is a really good tool: http://www.sitemapbuilder.net