Backup is an essential part of IT infrastructure management. Having HA solutions, RAIDs, etc. doesn’t free you from the need for backup. In case of human error, none of those techniques will save you; only a backup will.
However, as the saying goes, “Your backups are only as good as your restores”, so we also have to check our backups regularly for consistency.
In OpenStack, it’s highly recommended to use Cinder as the main storage provider. Cinder lets you create block volumes and attach them to your virtual machines. The best practice is to keep all your application data on volumes and not on the instance disk. The instance disk should hold only the operating system files that come from the OS image (packages installed from repositories will of course also go there). In this article, we will show you more reasons to do so.
What you would typically want from a good backup solution is online backups, easy restores, consistency, easy management, and as little space used as possible. Although it’s possible to install a traditional backup solution on every virtual machine, OpenStack offers us other options to back up our data using snapshotting. The downside is that you can’t have an “incremental” snapshot copy yet; you have to store the full size of your snapshots every time you back up. However, backups and, more importantly, restores are far simpler this way than with an “in-VM” backup solution that supports incremental backups.
The OpenStack Kilo environment for which we established this backup strategy is a two-hypervisor setup with NFS as the Cinder backend. Don’t ask me why we chose NFS, that was the situation back in the day. 🙂
OpenStack offers the command “cinder backup-create” precisely for the purpose of backups. This command makes a copy of the volume and sends it to the object store (Swift). Unfortunately, we have to wait for the Liberty release to get the force option, which allows online backups; currently, backup only works on volumes that are not attached.
cinder backup-create bata-mysqlsflr02_app
ERROR: Invalid volume: Volume to be backed up must be available (HTTP 400) (Request-ID: req-6aeb81f1-c415-45ed-8d58-20f466055f03)
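Since the call is rejected for attached volumes, a backup script could check the volume status before trying. The following is only a minimal sketch: it assumes `cinder show` prints its usual two-column Property/Value table, and the function names `volume_status` and `volume_is_backupable` are our own helpers, not part of the cinder client.

```bash
# Print the status of a Cinder volume (for example "available" or "in-use").
# Assumption: `cinder show` prints a "| Property | Value |" table.
volume_status() {
    cinder show "$1" | awk -F'|' '
        { key = $2; gsub(/[ \t]/, "", key) }
        key == "status" { val = $3; gsub(/[ \t]/, "", val); print val }
    '
}

# Succeed only if the volume is in the state backup-create accepts on Kilo.
volume_is_backupable() {
    [ "$(volume_status "$1")" = "available" ]
}

# Possible usage (commented out so the sketch has no side effects):
# if volume_is_backupable bata-mysqlsflr02_app; then
#     cinder backup-create bata-mysqlsflr02_app
# fi
```

An attached volume reports "in-use", so the check fails and the script can fall back to another method instead of hitting the HTTP 400 error.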
In Liberty, the command will look like this. It will support incremental backups, and the force option will also allow online backups.
cinder backup-create [--incremental] [--force] VOLUME
Awesome, right!! However, we will have to wait.
Until then we’ll be using another approach. There is the cinder snapshot function, but unfortunately it doesn’t support the NFS storage back-end.
When I thought all hope was lost, the cinder upload-to-image option came to my mind. We can create a glance image from a cinder volume and then download that image to an external file stored in a safe location. The sweet part is that we can do all this online and with consistent data. Also, with the qcow2 format the image takes up only the used space of the volume.
To check consistency, we ran the following test: we create one file in the volume before we start the image upload, then two more files while the image is being created, one right at the start of image creation and one near the end.
On the source machine, we create the test file.
# touch beforeimage.tst
We trigger the image creation.
cinder upload-to-image svnsfphp03_app SVN_app-volume-backup --disk-format qcow2 --container-format bare --force True
glance image-list
..
| 57e460f3-4674-435d-b92a-b18f126850a3 | SVN_app-volume-backup | qcow2 | bare | | queued |
Then we create two more files on the source machine, the first immediately after the trigger and the second a little before the end of the operation.
touch afterimagestart.tst
touch afterimagestart-2.tst
To come back to the point that your backups are only as good as your restores, we test the restore by creating a new volume from the image. The result shows only the file we created before we started the image upload, so the snapshotting works: we get the state as of the moment we started the upload.
cinder create --display-name SVN_app-volume-backup-restore-test --availability-zone nova --image-id 57e460f3-4674-435d-b92a-b18f126850a3 5
When mounted we see only the file made before the trigger of the image upload.
# mount /dev/vdr /mnt/
# ls -la /mnt/
total 32
drwxr-xr-x. 4 root root 4096 Jul 27 09:31 .
drwxr-xr-x. 18 root root 4096 Jun 29 07:19 ..
-rw-r--r--. 1 root root 0 Jul 27 09:31 beforeimage.tst
drwxr-xr-x. 7 48 48 4096 Jul 27 09:30 bis
drwx------. 2 root root 16384 Apr 20 11:01 lost+found
-rw-r--r--. 1 root root 10 Jul 6 11:58 test.txt
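The manual look at the listing can also be turned into a small check. This is a sketch only: the function name `check_point_in_time` is made up for illustration, and it simply tests for the three marker files used in the test above under whatever mount point you pass in.

```bash
# Succeed only if the mounted restore holds the file created before the upload
# and none of the files created after the upload started.
check_point_in_time() {
    local mnt="$1" f
    if [ ! -e "$mnt/beforeimage.tst" ]; then
        echo "missing beforeimage.tst" >&2
        return 1
    fi
    for f in afterimagestart.tst afterimagestart-2.tst; do
        if [ -e "$mnt/$f" ]; then
            echo "unexpected $f: image is not a point-in-time copy" >&2
            return 1
        fi
    done
    return 0
}

# Possible usage: check_point_in_time /mnt && echo "restore looks consistent"
```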
To complete the full backup procedure, we should export the image from Glance to a file stored in a safe location, ideally outside your data center. The tricky part is that the upload-to-image command is asynchronous: it exits before the image has actually been uploaded to Glance. So you either have to poll the API in your backup script to see when the image is ready for download, or split your script into two phases, first creating all the images and then downloading them.
glance image-download SVN_app-volume-backup --file /mnt/remote-nfs/SVN-volume-backup-${DATE}.qcow2
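The polling variant could look roughly like the sketch below. It assumes `glance image-show` prints a Property/Value table containing a `status` row, and that the image ends up in the `active` state when the upload is done. The function names and the default timeout and interval are our own choices for illustration, not something the glance client provides.

```bash
# Print the status of a Glance image ("queued", "saving", "active", ...).
# Assumption: `glance image-show` prints a "| Property | Value |" table.
image_status() {
    glance image-show "$1" | awk -F'|' '
        { key = $2; gsub(/[ \t]/, "", key) }
        key == "status" { val = $3; gsub(/[ \t]/, "", val); print val }
    '
}

# Poll until the image is active.
# Arguments: image, timeout in seconds, poll interval in seconds.
# Returns 1 on timeout or if the image ends up killed or deleted.
wait_for_image() {
    local image="$1" timeout="${2:-3600}" interval="${3:-30}"
    local waited=0 status
    while [ "$waited" -le "$timeout" ]; do
        status=$(image_status "$image")
        case "$status" in
            active) return 0 ;;
            killed|deleted)
                echo "image $image is $status" >&2
                return 1 ;;
        esac
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "timed out waiting for image $image" >&2
    return 1
}

# Possible usage with the image ID from the listing above:
# wait_for_image 57e460f3-4674-435d-b92a-b18f126850a3 && glance image-download ...
```

Depending on the glance client version, `image-show` may accept only an image ID rather than a name, so passing the ID is the safer choice.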
Then, if a restore is needed, we can create an image in Glance from that file and then create a volume from that image.
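As a rough sketch of that restore path, the two steps could be wrapped like this. The helper names `table_value` and `restore_volume` are hypothetical, the sketch assumes `glance image-create` prints a Property/Value table with an `id` row, and the volume size you pass must be at least the virtual size of the backed-up volume.

```bash
# Read one value out of a "| Property | Value |" table on stdin.
table_value() {
    awk -F'|' -v want="$1" '
        { key = $2; gsub(/[ \t]/, "", key) }
        key == want { val = $3; gsub(/[ \t]/, "", val); print val }
    '
}

# Arguments: backup file, name for the new image and volume, volume size in GB.
restore_volume() {
    local file="$1" name="$2" size_gb="$3" image_id
    # Step 1: upload the qcow2 file back into Glance and capture the image ID.
    image_id=$(glance image-create --name "$name" --disk-format qcow2 \
        --container-format bare --file "$file" | table_value id)
    if [ -z "$image_id" ]; then
        echo "image upload failed" >&2
        return 1
    fi
    # Step 2: create a new volume from that image.
    cinder create --display-name "$name" --image-id "$image_id" "$size_gb"
}

# Possible usage (file name is an example only):
# restore_volume /mnt/remote-nfs/SVN-volume-backup-150727.qcow2 SVN_app-restore 5
```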
In our setup, we also wanted to back up the OS images. Unfortunately, it seems there is still no online solution for this. This is the main reason the best practice is to keep your precious data on cinder volumes, as stated at the beginning of the article. The VM being snapshotted is always frozen while the snapshot of the machine is made. The way nova works is to make a copy of the disk in the /var/lib/nova/instances/snapshots/ directory during the freeze, then it unfreezes the machine and starts the upload to Glance, which at least happens while the machine is fully operational. I hope we’ll have live snapshotting soon. The good thing is that in our case the machines we back up this way are not business-critical, so a pause of about a minute during the night is acceptable.
Here is the script that we use. It typically takes a while to complete when there are many or large machines, and the run time also depends on your storage speed. If you use it in your setup, time a run first so you know how long it takes and can schedule it appropriately.
#!/bin/bash
# Back up the OS disk and the application volume of every VM listed in VMS.
DATE=$(date +%y%m%d)
VOLSUFFIX="_app"
BACKUPDIR=/mnt/usbdisk/backup
declare -a VMS=(
vm1
vm2
vm3
)
source /root/keystonerc_admin
echo "$DATE $(date) Starting Backup process of images"
for vmname in "${VMS[@]}"
do
    volume="${vmname}${VOLSUFFIX}"
    echo "Backing up $vmname"
    # Asynchronous: returns at once while the upload to glance keeps running.
    echo cinder upload-to-image "$volume" "${volume}-volume-backup" --disk-format qcow2 --container-format bare --force True
    cinder upload-to-image "$volume" "${volume}-volume-backup" --disk-format qcow2 --container-format bare --force True
    # Snapshot the OS disk; --poll blocks until the snapshot image is ready.
    echo nova image-create --poll "$vmname" "${vmname}-backup"
    nova image-create --poll "$vmname" "${vmname}-backup"
    echo NO support for NFS volume snapshot yet
    date
    # Download the OS image to the backup directory, then remove it from glance.
    echo glance image-download "${vmname}-backup" --file "$BACKUPDIR/${vmname}-backup-${DATE}.qcow2"
    glance image-download "${vmname}-backup" --file "$BACKUPDIR/${vmname}-backup-${DATE}.qcow2"
    echo nova image-delete "${vmname}-backup"
    nova image-delete "${vmname}-backup"
    date
    # Nothing here waits for the volume image: it is assumed to have finished uploading during the steps above.
    echo glance image-download "${volume}-volume-backup" --file "$BACKUPDIR/${volume}-volume-backup-${DATE}.qcow2"
    glance image-download "${volume}-volume-backup" --file "$BACKUPDIR/${volume}-volume-backup-${DATE}.qcow2"
    echo glance image-delete "${volume}-volume-backup"
    glance image-delete "${volume}-volume-backup"
done
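Since backups are only as good as their restores, it may be worth checking each downloaded file before trusting it. The sketch below is an assumption on our side, not part of the script above: it presumes `qemu-img` is installed on the backup host and uses `qemu-img check` to validate the qcow2 file. `QCOW_CHECK` and `verify_backup` are names made up for this example.

```bash
# Command used to check a qcow2 file. Assumes qemu-img is installed;
# set QCOW_CHECK to something else to use a different checker.
QCOW_CHECK=${QCOW_CHECK:-"qemu-img check"}

# Succeed only if the backup file exists, is non-empty and passes the check.
verify_backup() {
    local file="$1"
    if [ ! -s "$file" ]; then
        echo "$file is missing or empty" >&2
        return 1
    fi
    # Word splitting of $QCOW_CHECK is intentional (command plus arguments).
    if ! $QCOW_CHECK "$file" >/dev/null; then
        echo "$file failed the image check" >&2
        return 1
    fi
    return 0
}

# Possible usage inside the loop, after each glance image-download:
# verify_backup "$BACKUPDIR/${vmname}-backup-${DATE}.qcow2" || echo "BAD BACKUP for $vmname"
```

This only confirms that the file is a structurally sound qcow2 image; a periodic test restore is still the real proof.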
Please send us your comments and questions. I would really appreciate it if someone shares a way to do real online snapshots of the main OS disk images without freeze time.
In our next blog on this subject, I will show you how we back up the OpenStack components themselves.