Regular automated backup of WordPress blog
Monday, January 14th, 2008
For my old uboot blog I had a very simple backup strategy.
The whole blog was available as an RSS feed. (I.e. not just the latest entries: the feed included every article right back to the beginning.) I set up a crontab job on a friendly Linux machine to download that with “wget”, then commit the resulting file to my private Subversion repository.
A similar technique is available with WordPress. It has an “export” feature which allows you to download the entire blog in an “extended” RSS format, including comments etc.
You have to be logged in to the WordPress admin system to use the export. However, “wget” has a “--load-cookies” option which takes a “cookies.txt” file. The documentation assumes you’ll be running your browser on the same machine, so it expects you’ll have a “cookies.txt” file available. I didn’t, but it was a small matter to take the “cookies.txt” file from my Windows machine - where Mozilla correctly stores the file under the “Documents and Settings” Windows directory - find the two lines for the cookies “wordpressuser_nnn” and “wordpresspass_nnn” (the “nnn” is a long hex string) and strip out the rest. I copied the result across to the Linux machine with “scp”, and “wget” accepted the file fine.
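The cookie-stripping step can also be done with a one-line filter rather than by hand. This is only a sketch: it assumes the browser file is in the usual Netscape cookie format (tab-separated, with the cookie name in the sixth field), and the function name "filter_wp_cookies" and the file name "full-cookies.txt" are made up for illustration.

```shell
#!/bin/bash
# Keep only the two WordPress login cookies from a Netscape-format
# cookies file read on stdin. Assumes tab-separated fields with the
# cookie name in field 6. (filter_wp_cookies is a hypothetical name.)
filter_wp_cookies() {
    awk -F '\t' '$6 ~ /^wordpress(user|pass)_/'
}

# Hypothetical usage, where full-cookies.txt is the file copied
# from the browser profile:
#   filter_wp_cookies < full-cookies.txt > cookies.txt
```

The resulting “cookies.txt” then contains just the “wordpressuser_nnn” and “wordpresspass_nnn” lines, which is what gets handed to “wget”.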
A small change to my script now downloads this WordPress export, and commits it into my Subversion repository as before.
There was one slight inelegance. Subversion only creates a new revision when the files being committed actually have different content, and I like this feature. This worked fine with the Uboot RSS feed, but the WordPress export includes a comment “generated on <date/time to minute accuracy>”, so every night when the script ran a new revision would be created. No matter: I added a line using “sed” to strip that out of the comment, committed it, then did an “svn diff” to check that really only that line had changed.
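To see the effect of the “sed” line in isolation, here is a small sketch. The two sample comment lines are invented for illustration (the real export's comment may look slightly different), and the function name "strip_export_timestamp" is hypothetical. The point is that two downloads differing only in the timestamp come out identical after filtering.

```shell
#!/bin/bash
# Remove the created="YYYY-MM-DD HH:MM" attribute that sits directly
# before the closing --> of the export's header comment.
# (strip_export_timestamp is a hypothetical name.)
strip_export_timestamp() {
    sed 's/created="....-..-.. ..:.."-->/-->/'
}

# Two illustrative header comments that differ only in the timestamp.
monday=$(printf '<!-- generator="wordpress/2.3" created="2008-01-14 04:00"-->\n' | strip_export_timestamp)
tuesday=$(printf '<!-- generator="wordpress/2.3" created="2008-01-15 04:00"-->\n' | strip_export_timestamp)

if [ "$monday" = "$tuesday" ]; then
    echo "identical after filtering"
fi
```

With the timestamp gone, Subversion sees no difference between the two nights unless the blog itself has changed.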
So now my script looks like:
#!/usr/local/bin/bash

# in crontab:
# 0 4 * * * ~/private-svn/blog/backup/download-backup.sh

cd ~/private-svn/blog/backup

export SITE=http://www.databasesandlife.com/
export FILE=databasesandlife.wordpress.xml

wget --quiet \
    --output-document $FILE.tmp --load-cookies cookies.txt \
    "$SITE/wp-admin/export.php?author=all&download=true&submit=xx"

# the created="xx" attribute in a comment causes each download
# to be a new commit; yet i only want actual changes to show
# up in the subversion revision history
sed 's/created="....-..-.. ..:.."-->/-->/' < $FILE.tmp > $FILE

svn commit -m '* Automatic blog download from crontab' $FILE