Backing up your virtual machines in OpenStack

Backup is an essential part of IT infrastructure management. Having HA solutions, RAID etc. doesn't free you from the need for backups. In case of human error none of those techniques will save you; only a backup will. However, as the saying goes, "Your backups are only as good as your restores", so we also have to check our backups for consistency regularly.

In OpenStack it's highly recommended to use Cinder as the main storage provider. Cinder lets you create block volumes and attach them to your virtual machines. The best practice is to keep all your application data on volumes and not on the instance disk. The instance disk should hold only the operating system files that come from the OS image (packages installed from repositories will of course also go there). In this article we will show you more reasons to do so.
What you would typically want from a good backup solution is: online backups, easy restores, consistency, easy management, and as little space used as possible.
Although it's possible to install a traditional backup solution on every virtual machine, OpenStack offers us other options to back up our data using snapshotting. The downside is that you can't have an "incremental" snapshot copy yet; you have to store the full size of your snapshots every time you back up. However, backups and, more importantly, restores are far simpler this way than with an "in-VM" backup solution that supports incremental backups.

The OpenStack Kilo environment for which we established this backup strategy is a two-hypervisor setup with NFS as the Cinder back-end. Don't ask me why we chose NFS, that was the situation back in the day. :)

OpenStack offers the command "cinder backup-create" precisely for the purpose of backups. This command makes a copy of the volume and sends it to the object store (Swift). Unfortunately we have to wait until the Liberty release to get the force option, which allows online backups; currently backup only works on volumes that are not attached.

 cinder backup-create bata-mysqlsflr02_app
 ERROR: Invalid volume: Volume to be backed up must be available (HTTP 400) (Request-ID: req-6aeb81f1-c415-45ed-8d58-20f466055f03)
 
In Liberty the command will look like this, with support for incremental backups and a force option that also allows online backups.
  cinder backup-create [--incremental] [--force] VOLUME
  
Awesome, right!! However we will have to wait.

Until then we'll use another approach. Cinder has a snapshot function, but unfortunately it doesn't support the NFS storage back-end.

http://docs.openstack.org/admin-guide/blockstorage_volume_backed_image.html

When I thought all hope was lost, the cinder upload-to-image option came to mind. We can create a Glance image from a Cinder volume, and then download that image to an external file stored in a safe location. The sweet part is that we can do all this online and with consistent data. Also, with the qcow2 format the image size will be only the used space of the volume.

To check the consistency we made the following test: we create one file on the volume before we start the image upload, then two files while the image is being created, one right at the start of image creation and one near the end.

On the source machine we create the test file

 # touch beforeimage.tst

We trigger the image creation
 cinder upload-to-image svnsfphp03_app SVN_app-volume-backup --disk-format qcow2 --container-format bare --force True
 glance image-list
 ..
 | 57e460f3-4674-435d-b92a-b18f126850a3 | SVN_app-volume-backup | qcow2 | bare | | queued |
 
Then we create two more files on the source machine, the first immediately after we trigger the upload, the second a little before the end of the operation.
 touch afterimagestart.tst
 touch afterimagestart-2.tst


To come back to the point that your backups are only as good as your restores, we test the restore by creating a new volume from the image. The result shows only the file we created before we started the upload, so the snapshotting works: we get the state of the volume at the moment we started the image upload.

  cinder create --display-name SVN_app-volume-backup-restore-test --availability-zone nova  --image-id 57e460f3-4674-435d-b92a-b18f126850a3  5
  
When mounted we see only the file made before the trigger of the image upload.
 # mount /dev/vdr /mnt/
 # ls -la /mnt/
 total 32
 drwxr-xr-x. 4 root root 4096 Jul 27 09:31 .
 drwxr-xr-x. 18 root root 4096 Jun 29 07:19 ..
 -rw-r--r--. 1 root root 0 Jul 27 09:31 beforeimage.tst
 drwxr-xr-x. 7 48 48 4096 Jul 27 09:30 bis
 drwx------. 2 root root 16384 Apr 20 11:01 lost+found
 -rw-r--r--. 1 root root 10 Jul 6 11:58 test.txt

To complete the backup procedure we should export the image from Glance to a file stored in a safe location, ideally outside your data center. The tricky part is that the upload-to-image command is asynchronous: it exits before the image is actually uploaded to Glance. So you either have to poll the API in your backup script until the image is ready for download, or split your script into two phases: first create all the images, then download them.
  glance image-download SVN_app-volume-backup --file /mnt/remote-nfs/SVN-volume-backup-${DATE}.qcow2
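
The polling option can be sketched as a small shell helper. This is only a sketch under assumptions: the function names image_status and wait_for_image are made up for illustration, and it assumes that "glance image-show" prints its usual table with a "| status | ... |" row, so check the output of your client version first (and pass the image ID if your client does not resolve names).

```shell
# Hypothetical helpers, not part of any OpenStack client.

# Print the value of the "status" row from the "glance image-show" table.
# Assumes rows that look like "| status | active |".
image_status() {
    glance image-show "$1" |
        awk -F'|' '$2 ~ /^ *status *$/ { gsub(/ /, "", $3); print $3 }'
}

# Wait until the image is "active".
# Arguments: image, timeout in seconds (default 3600), poll interval (default 30).
# Returns 0 when active, 1 when the image failed, 2 on timeout.
wait_for_image() {
    local image=$1 timeout=${2:-3600} interval=${3:-30} waited=0 status
    while [ "$waited" -lt "$timeout" ]; do
        status=$(image_status "$image")
        case "$status" in
            active) return 0 ;;
            killed|deleted)
                echo "image $image ended in status $status" >&2
                return 1 ;;
        esac
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "timed out waiting for image $image" >&2
    return 2
}
```

In a backup script the download would then only run once the wait succeeds, for example "wait_for_image SVN_app-volume-backup && glance image-download ...".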

  
Then, if a restore is needed, we can create an image in Glance from that file and then again create a volume from that image.
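
A restore along those lines might look like the sketch below. The function name restore_volume and its arguments are hypothetical, and it assumes that "glance image-create" prints a table with an "| id | ... |" row; the "cinder create" call uses the same options as in the restore test above.

```shell
# Hypothetical helper: restore a volume from a downloaded qcow2 backup file.
# Arguments: backup file, name for the new volume, volume size in GB.
restore_volume() {
    local file=$1 name=$2 size_gb=$3 image_id
    # Register the backup file as a new Glance image and read its ID
    # from the "| id | <uuid> |" row of the table the client prints.
    image_id=$(glance image-create --name "${name}-restore" \
        --disk-format qcow2 --container-format bare --file "$file" |
        awk -F'|' '$2 ~ /^ *id *$/ { gsub(/ /, "", $3); print $3 }')
    if [ -z "$image_id" ]; then
        echo "could not register $file in glance" >&2
        return 1
    fi
    # Create a new volume from that image.
    cinder create --display-name "$name" --image-id "$image_id" "$size_gb"
}
```

The new volume can then be attached to a machine and mounted to check its contents, as in the test earlier in this article.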

In our setup we also wanted to back up the OS images; unfortunately it seems there is still no online solution for this. This is the main reason for the best practice, stated at the beginning of the article, of keeping your precious data on Cinder volumes. The VM is always frozen while the snapshot of the machine is made. Nova makes a copy of the disk in the /var/lib/nova/instances/snapshots/ directory during the freeze, then unfreezes the machine and starts the upload to Glance, which at least happens while the machine is fully operational. I hope we'll have live snapshotting soon. The good thing is that in our case the machines we back up this way are not business critical, so a pause of about a minute during the night is acceptable.

Here is the script that we use. It typically takes a while to complete when there are many or large machines. It also depends on your storage speed, so if you use it in your setup, time it to know how long it takes and schedule it accordingly.

 #!/bin/bash
 # Backs up the OS disk and the "<vm>_app" volume of every VM listed in VMS.
 DATE=$(date +%y%m%d)
 VOLSUFFIX="_app"
 BACKUPDIR=/mnt/usbdisk/backup
 
 declare -a VMS=(
 vm1
 vm2
 vm3
 )
 
 source /root/keystonerc_admin
 
 # Print a command, then run it.
 run() { echo "$@"; "$@"; }
 
 echo "$DATE $(date) Starting Backup process of images"
 
 for vmname in "${VMS[@]}"
 do
  volume=${vmname}${VOLSUFFIX}
  echo "Backing up $vmname"
  # Asynchronous: returns before the volume image is fully uploaded to glance.
  run cinder upload-to-image "$volume" "${volume}-volume-backup" --disk-format qcow2 --container-format bare --force True
  # Freezes the VM while its disk is copied; --poll waits until the image is ready.
  run nova image-create --poll "$vmname" "${vmname}-backup"
  echo NO support for NFS volume snapshot yet
  date
  run glance image-download "${vmname}-backup" --file "$BACKUPDIR/${vmname}-backup-${DATE}.qcow2"
  run nova image-delete "${vmname}-backup"
  date
  # The volume image is fetched last, which gives the upload above time to finish.
  run glance image-download "${volume}-volume-backup" --file "$BACKUPDIR/${volume}-volume-backup-${DATE}.qcow2"
  run glance image-delete "${volume}-volume-backup"
 done
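
Since backups are only as good as their restores, a cheap first check on the downloaded files is to run "qemu-img check" on them. The sketch below assumes qemu-img is installed on the backup host, and the function name verify_backups is made up for illustration. It only validates the qcow2 structure of each file, so it does not replace a real restore test like the one shown earlier.

```shell
# Hypothetical helper: run "qemu-img check" on every qcow2 file in a directory.
# Returns non-zero if at least one file does not pass the check.
verify_backups() {
    local dir=$1 failed=0 file
    for file in "$dir"/*.qcow2; do
        [ -e "$file" ] || continue   # the directory has no qcow2 files
        if qemu-img check "$file" >/dev/null 2>&1; then
            echo "OK     $file"
        else
            echo "FAILED $file"
            failed=1
        fi
    done
    return "$failed"
}
```

It could be called at the end of the backup run, for example as "verify_backups /mnt/usbdisk/backup".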


Please send us your comments and questions. I would really appreciate it if someone shared a way to do real online snapshots of the main OS disk images without freeze time.

In our next blog post on this subject I will show you how we back up the OpenStack components themselves.