December 23, 2007
Ever since I started running this blog on Media Temple’s grid-server, I have been spoiled by how little effort it took to back up my web stuff online, thanks to the Data Backup utility available through the grid-server admin panel. Unfortunately, this utility was pulled over a month ago, and I always want a backup to revert to in case an upgrade of WordPress or some other installation goes wrong.
Even before it was pulled, the backup utility had started to break: my automated backups were no longer occurring, and initiating a manual backup sometimes would not work. I wanted to upgrade my WordPress install urgently, so I investigated other methods of backing up: rsync and mysqldump. After getting these processes to work, I joined them up with the best way to automate them: cron jobs.
“rsync + cron = rule”
As I was twittering my investigations of automating my backups, I got a reply saying that rsync is “the cat’s ass”… and it really is. Since setting it up, I always have a current copy of what is on my host (even from the root level!) backed up on my local hard drive. Even Media Temple’s Data Backup utility couldn’t touch how awesome this is.
rsync is really the way to go when cloning your stuff on the web, as it doesn’t back up from scratch. Instead, rsync synchronizes the destination with the source and does so in a smart manner, copying only the files that are new or different, which saves a lot of time.
How to set up rsync + cron
Note: the following tutorial applies to Mac OS X 10.5 and is for advanced users. It also requires some knowledge of the terminal. If you don’t understand any of the steps, feel free to leave a question in the comments and I’ll try my best to answer it.
- The first step you’ll need to take is enabling ssh access on your grid-server. You can find out how to do this on Media Temple’s Knowledgebase.
- Once you can successfully ssh into your grid-server, open up your favorite text editor and type in the following lines:
rsync -aze ssh email@example.com:/home/99999/ [insert destination path]
Obviously you need to do some editing so that it works with your account…
- Replace the domain.com with your primary grid-server domain
- Replace the two instances of 99999 with your gridserver account number
- Insert a destination path in place of [insert destination path]. An example of a destination path is:
- Save this file as something to the effect of: name.command
This will save it as a script which should automatically open in the terminal and execute the rsync command.
- Try running this file. If you get prompted to enter your account password, it means you’ve entered the first part correctly. Enter your password and wait. If you get the message [Process Completed] with no major errors, it probably means everything worked out great. Navigate to the destination path you set, and if it looks like a mirror of your grid-server, everything did work.
At this point, we have rsync working, so now we will get the other part working: automating it through a cron job.
- To automate this through a cron job, we will need to create a file which will be our crontab (short for cron table). This file is essentially the schedule that the cron process consults to decide which tasks to run and when (feel free to refer to this article if you’d like more info).
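The script steps above can also be done from the terminal. This is only a sketch: the login and account number are the placeholders from the command above, the destination folder is hypothetical, and the script is written to a temporary folder so nothing in your home folder gets overwritten. It also marks the file as executable, which a .command file generally needs before Terminal will run it.

```shell
#!/bin/sh
# Placeholder values (assumptions); replace them with your own details.
REMOTE="email@example.com:/home/99999/"
DEST="$HOME/Backups/gridserver/"        # hypothetical destination path
SCRIPT="${TMPDIR:-/tmp}/mt.command"     # move it wherever you like afterwards

# Write the script. The unquoted EOF lets the shell fill in REMOTE and DEST now.
cat > "$SCRIPT" <<EOF
#!/bin/sh
# -a archive mode, -z compress during transfer, -e ssh use ssh as the transport
rsync -aze ssh "$REMOTE" "$DEST"
EOF

# Make the script executable so it can be run directly.
chmod +x "$SCRIPT"
echo "Wrote $SCRIPT"
```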
To make setting up this file easier, download this file which will act as your template for your crontab.
- After downloading the sample crontab in the previous step, open the file (sample-crontab) in your favorite text editor. When it’s open, you should see the following lines:
#min hour mday month wday command
00 15 * * * open "/Volumes/Macintosh HD/Users/samlu/Backups/mt.command"
This file is set up like a table. The first line is a comment that labels the columns: minute, hour, day of the month, month, weekday, and command. The second line is the job itself. Currently, the job I have scheduled runs at 3:00pm every day and opens my script mt.command.
For simplicity’s sake, what you want to change here is the path to your script (remember, it is an absolute path so you will need to start from /Volumes like I did). If you’d like to change the frequency and/or time this job runs, please refer to the article I linked to earlier.
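If the column layout is hard to picture, this small sketch splits a crontab line into the same six fields the header names. It only reads the line into shell variables for illustration; it is not how cron itself parses the file. The path is quoted here so the space in "Macintosh HD" stays part of the command.

```shell
#!/bin/sh
# The sample job line: run at minute 00 of hour 15 (3:00pm), every day.
line='00 15 * * * open "/Volumes/Macintosh HD/Users/samlu/Backups/mt.command"'

# read splits on whitespace; the last variable receives the rest of the line.
read -r min hour mday month wday command <<EOF
$line
EOF

printf 'minute=%s hour=%s mday=%s month=%s wday=%s\n' \
    "$min" "$hour" "$mday" "$month" "$wday"
printf 'command=%s\n' "$command"
```

Changing the second field from 15 to 03, for example, would move the job to 3:00am.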
- If you haven’t already, move this sample-crontab file into your home folder (the same location your terminal defaults to when you open it).
- Now we will “register” this schedule to the cron process. Open up a new terminal window and since this file should be in your home folder, just type the following command:
Now the job should be scheduled, and at 3:00pm every day a terminal will open up asking you for your grid-server account password so that you can establish the SSH connection.
If all went without a hitch, all the files on your grid-server will now be automatically backed up! There is one thing to note here, though: if you have any databases set up, they are not yet being backed up by what we set up here. Stay tuned for the next part of this tutorial, where I will outline how to automate MySQL database backups on the server side so that they are included as part of your rsync backups.
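For reference, a hedged sketch of registering and checking the schedule, assuming the file is named sample-crontab and sits in your home folder. Running crontab with a file name replaces whatever schedule you already had, so the sketch first saves the current one to a backup file (the backup file name is made up).

```shell
#!/bin/sh
CRONTAB_FILE="$HOME/sample-crontab"   # assumed name and location

if [ -f "$CRONTAB_FILE" ]; then
    # Keep a copy of any existing schedule; crontab -l fails quietly if none exists.
    crontab -l > "$HOME/crontab.backup" 2>/dev/null
    # Install the new schedule (this replaces the old one).
    crontab "$CRONTAB_FILE"
    # List the installed schedule to confirm the job is there.
    crontab -l
else
    echo "No crontab file at $CRONTAB_FILE; nothing installed."
fi
```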