LesPerras.com

How to make a Sitemap with google-sitemap_gen on Nginx

I wanted to make a sitemap for my site to submit to Google, but I didn’t want to pay for a sitemap generator, as my site is not a commercial site.

google-sitemap_gen is one of the first resources I found.

The Problem is…

The problem is that its documentation describes use with Apache, but I am using the Nginx server. A bit more searching led me to this page and Wing Tang Wong’s answer.

He Said

He said he made a URL list by spidering the site with wget and writing the output to wget-log, then used that list of URLs to make the sitemap.xml with google-sitemap_gen. I downloaded the sitemap generator package and unzipped it.

Next I modified the config file like he says on the linked page above.

<?xml version="1.0" encoding="UTF-8"?>
<site
base_url="http://YOURDOMAIN.com/"
store_into="/var/www/sitemap_gen-1.4/sitemap.xml"
verbose="1"
>
<url href="http://YOURDOMAIN.com/stats?q=name" />
<url
href="http://YOURDOMAIN.com/stats?q=age"
lastmod="2004-11-14T01:00:00-07:00"
changefreq="yearly"
priority="0.3"
/>
<urllist path="urllist.txt" encoding="UTF-8" />
<!-- Exclude URLs that end with a '~' (IE: emacs backup files) -->
<filter action="drop" type="wildcard" pattern="*~" />
<!-- Exclude URLs within UNIX-style hidden files or directories -->
<filter action="drop" type="regexp" pattern="/\.[^/]*" />
</site>

He spidered with

wget -mk --spider -r -l2 http://YOURDOMAIN.COM/

but I found it worked if I did it like this:

 wget -mk --spider -r -l2 http://YOURDOMAIN.COM/ -o wget-log

Then his next command to make the urllist worked nicely:

cat wget-log | tr ' ' '\012' | grep "^http" | egrep -vi "[?]|[.]jpg$" | sort -u > urllist.txt
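For anyone who would rather not depend on the shell pipeline, the same filtering can be sketched in Python 3. This is only a rough equivalent under my own assumptions, not part of Wing Tang Wong's answer: the function names `extract_urls` and `write_urllist` are made up for illustration, and the file names `wget-log` and `urllist.txt` are the ones used above.

```python
import re


def extract_urls(log_text):
    """Return sorted, de-duplicated URLs found in wget log text."""
    # Split on spaces and newlines, like `tr ' ' '\012'` feeding line-based grep.
    tokens = re.split(r"[ \n]+", log_text)
    urls = set()
    for token in tokens:
        # Same idea as: grep "^http"
        if not token.startswith("http"):
            continue
        # Same idea as: egrep -vi "[?]|[.]jpg$"
        if "?" in token or token.lower().endswith(".jpg"):
            continue
        urls.add(token)
    # Roughly `sort -u`; the exact ordering of the shell sort depends on locale.
    return sorted(urls)


def write_urllist(log_path="wget-log", out_path="urllist.txt"):
    """Read the wget log and write one URL per line (hypothetical helper)."""
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        urls = extract_urls(log_file.read())
    with open(out_path, "w", encoding="utf-8") as out_file:
        out_file.write("\n".join(urls) + "\n")
    return len(urls)
```

Calling `write_urllist()` from the directory that holds wget-log should produce a urllist.txt with the same kind of content as the one-liner.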

No problems once I made sure the wget-log file was written and in place. Then all that remained was to run google-sitemap_gen with the command:

python sitemap_gen.py --config=example_config.xml 

And presto! You have a nice sitemap. Just submit it to Google on their Webmaster Tools page.
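Before submitting, it may be worth checking that the generated file parses and actually lists your pages. Here is a minimal Python 3 sketch, assuming the sitemap stores each page address in a `loc` element; the function name `sitemap_locations` is hypothetical, and the XML namespace is deliberately ignored so that the check does not depend on which sitemap schema the generator writes.

```python
import xml.etree.ElementTree as ET


def sitemap_locations(xml_bytes):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_bytes)
    locations = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}loc"; keep only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            locations.append(element.text.strip())
    return locations
```

Reading sitemap.xml in binary mode and passing the bytes to `sitemap_locations` gives a list whose length you can compare with the number of lines in urllist.txt.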
