VM cold migrations/resizing in OpenStack

Cold migrations are an integral piece of any QEMU/KVM deployment. A migration is cold, or “non-live”, because you have to power down the VM, move it to the new host, and power it back up. OpenStack follows the same procedure when it comes to migrating VMs.

Cold migrations in OpenStack are performed by the user that runs the openstack-nova-compute process, which is “nova” in most cases. For cold migrations to work, the “nova” user has to be able to ssh, without a password, to the other compute hosts in the environment. This is why setting up SSH keys for that user is a required part of the OpenStack configuration.


The basic steps for a cold migration are as follows:

  • An admin user initiates the migration using either “openstack server migrate”, “nova migrate”, or the dashboard
  • nova-scheduler tries to identify a target host based on the scheduler configuration and the VM specs
  • openstack-nova-compute uses the “nova” user to ssh to the target host, creates the needed directories under /var/lib/nova/instances, and copies the VM definition and the disk over to the target host
  • openstack-nova-compute on the target host then starts the VM on the new host
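The first step can also be scripted. The following is only a rough sketch: wait_for_migration and its arguments are names made up for this post, and it assumes the “openstack” client is installed and admin credentials are already loaded in the shell. It starts the migration and polls the server status until it reaches VERIFY_RESIZE (the VM has been moved and is waiting to be confirmed) or ERROR.

```shell
#!/bin/sh
# Sketch: start a cold migration and wait until nova reports the outcome.
# "wait_for_migration" is an illustrative name, not an OpenStack command.
wait_for_migration() {
    server="$1"
    interval="${2:-5}"   # seconds between status polls

    openstack server migrate "$server" || return 1

    # Note: no timeout here; interrupt it manually if the status never settles.
    while :; do
        status=$(openstack server show "$server" -f value -c status) || return 1
        case "$status" in
            VERIFY_RESIZE) echo "$status"; return 0 ;;  # moved, awaiting confirmation
            ERROR)         echo "$status"; return 1 ;;  # check the nova-compute logs
        esac
        sleep "$interval"
    done
}
```

Usage would look like “wait_for_migration my-vm 10”. How you confirm the migration afterwards depends on the client version, so that step is left out here.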

If the ssh keys are misconfigured, a common error you might see is some variation of the following:

2017-10-17 16:38:27.149 8053 ERROR oslo_messaging.rpc.server ResizeError: Resize error: not able to execute ssh command: Unexpected error while running command.
2017-10-17 16:38:27.149 8053 ERROR oslo_messaging.rpc.server Command: ssh -o BatchMode=yes IP mkdir -p /var/lib/nova/instances/ID
2017-10-17 16:38:27.149 8053 ERROR oslo_messaging.rpc.server Exit code: 255
2017-10-17 16:38:27.149 8053 ERROR oslo_messaging.rpc.server Stdout: u''
2017-10-17 16:38:27.149 8053 ERROR oslo_messaging.rpc.server Stderr: u'Host key verification failed.\r\n'
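The Stderr line is usually the quickest clue to what went wrong. As a minimal sketch (classify_ssh_error and check_target are hypothetical helpers written for this post, not part of nova), the message can be mapped to a likely cause by running the same kind of non-interactive ssh that nova runs:

```shell
#!/bin/sh
# Hypothetical helper, not part of nova: map ssh's stderr text to a likely cause.
classify_ssh_error() {
    case "$1" in
        *"Host key verification failed"*) echo "host-key" ;;  # target's key not trusted in known_hosts
        *"Permission denied"*)            echo "auth" ;;      # key not authorized on the target
        *)                                echo "other" ;;
    esac
}

# Run a non-interactive ssh like the one in the log above.
# Prints "ok" on success, otherwise the likely cause.
check_target() {
    if err=$(ssh -o BatchMode=yes "$1" true 2>&1 >/dev/null); then
        echo "ok"
    else
        classify_ssh_error "$err"
        return 1
    fi
}
```

Run it as the nova user on the source compute host, for example “check_target target-host”, using the same address nova would use.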

To test the ssh setup, switch to the nova user on the source compute host using “su - nova” and ssh to the target host:

su - nova
ssh target-host

If you get a message similar to this:

Warning: Permanently added the RSA host key for IP address 'IP' to the list of known hosts.

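This indicates that, although you have set up password-less ssh properly, the host key of the target compute host was not yet trusted on the source compute host. Nova runs ssh with BatchMode=yes, so it cannot answer the host key prompt the way you can in an interactive session, and the command fails with “Host key verification failed”. This mostly happens if you copied the keys manually and did not use the ssh-copy-id command.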
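One way to make the source host trust the target's key ahead of time is ssh-keyscan, which ships with OpenSSH. The sketch below is just one way you might script it, not something nova does for you. trust_host and the KEYSCAN override are hypothetical names, and a key fetched this way should be checked against the target's real fingerprint before you rely on it.

```shell
#!/bin/sh
# Sketch: pre-trust a target compute host's key for the current user (run as nova).
# KEYSCAN is a hypothetical override so the function can be dry-run;
# by default it is the real ssh-keyscan from OpenSSH.
KEYSCAN="${KEYSCAN:-ssh-keyscan}"

trust_host() {
    host="$1"
    known_hosts="${2:-$HOME/.ssh/known_hosts}"

    # Make sure the directory for known_hosts exists.
    mkdir -p "$(dirname "$known_hosts")"

    # Fetch the host's public keys and append them to known_hosts.
    $KEYSCAN "$host" >> "$known_hosts"
}
```

Run it as the nova user on the source host, for example “trust_host target-host”, then repeat the ssh test. The log above shows nova connecting by IP, so scan the same address that nova will actually use.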