This tutorial covers a basic Sitemap example and the handling of special characters in the Sitemap file.
Basic Sitemap example
Here is a very basic example of a Sitemap file that lists a single URL:
<?xml version="1.0" encoding="UTF-8"?>
You can check the next section of this tutorial for a more complicated Sitemap example.
Below we will go through the lines of the Sitemap file one by one:
- Every Sitemap XML file must begin with an opening <urlset> tag and must end with a closing </urlset> tag.
- Every "parent" entry should begin with a <url> tag and end with a </url> tag.
- In a similar way, every "child" entry is placed inside its parent <url> entry. The address of the page goes between the <loc> and </loc> tags.
- The value of the <loc> tag is a URL which must start with the protocol, for example http://. The URL must be shorter than 2,048 characters.
- The <lastmod> tag expects the date on which the document was last modified, in the format YYYY-MM-DD. You do not have to update this tag each time you modify the document, because the search engines read the dates of the documents when they crawl them.
- The <changefreq> tag is a hint for the crawlers that indicates how often the page is modified and therefore how often it should be revisited. Whether this value affects the behavior of the crawl bot depends solely on the search engine. The <changefreq> tag expects one of the following values: always, hourly, daily, weekly, monthly, yearly, never. Use always for pages which are dynamically generated or change upon every access. Use never for archived pages, but keep in mind that crawlers may still revisit a page marked with never from time to time.
- The <priority> value can vary from 0.0 to 1.0. It indicates only your personal preference for the order in which you would like the pages of your website to be crawled. The default value of a page that is not prioritized is 0.5. A page with a higher value may be crawled before a page with priority 0.5, and a page with a lower value may be crawled after it. The priority is relative to the other pages of your own website only. It is not used to compare different websites, so setting a high priority on all of your pages does not mean that they will be indexed more often.
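The rules in the list above can be sketched in code. The Python example below is only an illustration and is not part of the original example file: it builds a single-URL Sitemap with the standard xml.etree.ElementTree module and checks each value against the rules before adding it. The URL https://www.example.com/, the date, and the helper names build_url_entry and build_sitemap are assumptions chosen for this sketch. The xmlns value is the namespace defined by the sitemaps.org protocol.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for the <urlset> element.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def build_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a <url> element (hypothetical helper) after checking the rules."""
    if not loc.startswith(("http://", "https://")):
        raise ValueError("loc must start with the protocol, e.g. http://")
    if len(loc) >= 2048:
        raise ValueError("loc must be shorter than 2,048 characters")
    # Only the short YYYY-MM-DD form described in the list is checked here.
    if lastmod is not None and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", lastmod):
        raise ValueError("lastmod must look like YYYY-MM-DD")
    if changefreq is not None and changefreq not in ALLOWED_CHANGEFREQ:
        raise ValueError(f"unsupported changefreq: {changefreq}")
    if priority is not None and not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")

    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        ET.SubElement(url, "lastmod").text = lastmod
    if changefreq is not None:
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


def build_sitemap(entries):
    """Wrap the given entries (a list of dicts) in a <urlset> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        urlset.append(build_url_entry(**entry))
    return ET.tostring(urlset, encoding="unicode")


# Usage with placeholder values (assumptions, not taken from the tutorial).
sitemap_xml = build_sitemap([
    {
        "loc": "https://www.example.com/",
        "lastmod": "2024-01-15",
        "changefreq": "monthly",
        "priority": 0.8,
    }
])
print(sitemap_xml)
```

Only loc is required in this sketch. The other three values are optional hints and are left out of the output when they are not supplied.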
Special characters in the Sitemap file
As we have mentioned before, your Sitemap should be UTF-8 encoded. You can select this encoding when you save your Sitemap file. Almost all text editors support saving in UTF-8 format.
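If you generate the Sitemap with a script instead of a text editor, you can set the encoding explicitly when writing the file. The following Python lines are a minimal sketch under that assumption. The function name save_sitemap and the file name sitemap.xml are hypothetical.

```python
from pathlib import Path


def save_sitemap(xml_text, path):
    """Write the Sitemap text as UTF-8, prefixed with the XML declaration."""
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    Path(path).write_text(declaration + xml_text, encoding="utf-8")


# Example call with a hypothetical file name:
# save_sitemap("<urlset>...</urlset>", "sitemap.xml")
```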
Be advised that all data in the Sitemap should use entity escape codes for the characters listed below:
Character - Symbol - Escape Code
Ampersand - & - &amp;
Single Quote - ' - &apos;
Double Quote - " - &quot;
Greater Than - > - &gt;
Less Than - < - &lt;
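The escape codes in the table can also be applied programmatically. The Python sketch below uses the escape function from the standard xml.sax.saxutils module, which handles the ampersand, greater-than, and less-than characters by default. The two quote characters are passed as extra entities. The helper name escape_for_sitemap and the sample URL are assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# escape() covers &, > and < on its own; the quotes are added explicitly.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_for_sitemap(value):
    """Replace the five special characters with their entity escape codes."""
    return escape(value, EXTRA_ENTITIES)


# Placeholder URL containing an ampersand in its query string.
print(escape_for_sitemap("http://www.example.com/view?id=1&lang=en"))
```

Escape each value only once. A library that builds the XML for you will typically escape the text on its own, and escaping it a second time would turn &amp; into &amp;amp;.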
Don't forget that your Sitemap file should be no larger than 50MB (52,428,800 bytes) uncompressed and should contain no more than 50,000 URLs.
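The size limit is easy to check before you upload the file. The following Python sketch measures the Sitemap in bytes after UTF-8 encoding, since characters outside the ASCII range take more than one byte. The function name sitemap_size_ok and the limit parameter are hypothetical, and the check covers only the file size.

```python
MAX_SITEMAP_BYTES = 50 * 1024 * 1024  # 50MB


def sitemap_size_ok(xml_text, limit=MAX_SITEMAP_BYTES):
    """Return True if the UTF-8 encoded Sitemap fits within the limit."""
    return len(xml_text.encode("utf-8")) <= limit


print(sitemap_size_ok("<urlset></urlset>"))
```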