Glance and CEPH backend
Using CEPH as a backend for glance images has slowly become the default approach in many production deployments. It is usually as easy as creating a new pool in CEPH (a glance pool) and creating a user to be associated with glance. The glance CEPH user will normally authenticate using cephx and store images and snapshots in CEPH.
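For reference, creating the pool is typically a one-liner on a CEPH admin/monitor node. The pool name and placement-group count below are placeholders, and rbd pool init only exists on Luminous and newer releases:

# create the pool glance will use (pick a pg_num that fits your cluster)
ceph osd pool create POOLNAME 128
# tag/initialise the pool for RBD use (Luminous and newer)
rbd pool init POOLNAME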
The configuration in glance-api.conf looks something like this on the controller(s):
[glance_store]
stores = glance.store.rbd.Store
default_store = rbd
rbd_store_pool = POOLNAME
rbd_store_user = USERNAME
rbd_store_ceph_conf = /etc/ceph/ceph.conf
Then, in the default location of the CEPH keyrings, /etc/ceph, you will need to add the keyring for the CEPH user associated with glance.
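As a rough sketch, assuming the glance user already exists in CEPH (its creation is covered below) and that the names match the placeholders used in glance-api.conf, you can export the keyring on a CEPH node, copy it over and restart glance-api; on RDO/RHEL-based installs the service is typically called openstack-glance-api:

# on a CEPH monitor/admin node: dump the keyring for the glance user
ceph auth get client.USERNAME -o /etc/ceph/ceph.client.USERNAME.keyring
# copy it to the controller(s) running glance-api
scp /etc/ceph/ceph.client.USERNAME.keyring controller:/etc/ceph/
# on the controller: pick up the new configuration and keyring
systemctl restart openstack-glance-api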
On the CEPH nodes, don’t forget to grant the glance user permissions on the glance pool. The permissions need to be read/write so that glance can create new images and read existing ones. The default command to create a new user and grant read/write permissions on the pool is:
ceph auth get-or-create client.USERNAME mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=POOLNAME' -o /etc/ceph/ceph.client.USERNAME.keyring
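Before going back to glance, it is worth verifying that the new user can actually reach the pool from the controller. A quick sanity check, using the same placeholder names as above, is to list the pool with the glance credentials:

# should return an empty listing (or the existing images) rather than a permission error
rbd ls -p POOLNAME --id USERNAME --keyring /etc/ceph/ceph.client.USERNAME.keyring

If this fails with a permission error, re-check the cephx caps on the user before digging into glance itself.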
If, after you create the pool and configure glance-api and the keyring properly on the controller node, you get something like this in nova-conductor.log when provisioning new VMs:
WARNING nova.scheduler.utils [req-a3c6f93e-484a-43e0-9e73-5bbdc451b2c6 - - -] Failed to compute_task_build_instances: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance ID. Last exception: HTTPInternalServerError (HTTP 500)
ERROR nova.scheduler.utils [req-a3c6f93e-484a-43e0-9e73-5bbdc451b2c6 - - -] [instance: ID] Error from last host: HOST (node HOST): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1780, in _do_build_and_run_instance\n filter_properties)\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2016, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance ID was re-scheduled: HTTPInternalServerError (HTTP 500)\n']
check the glance logs as well; you will most likely find a 500 error in the glance api.log:
INFO eventlet.wsgi.server [req-a3c6f93e-484a-43e0-9e73-5bbdc451b2c6 ] IP "GET /v2/images/ID/file HTTP/1.1" 500 139 0.182237
This error is not very informative by itself, but it usually means you should check the permissions on the keyring file for the glance user in /etc/ceph and make sure that the user running glance (by default “glance”) has at least read permissions on it.
For example:
ls -l /etc/ceph/ceph.client.glancepool.keyring
-r--r----- 1 glance glance 64 Oct 27 11:12 /etc/ceph/ceph.client.glancepool.keyring
This is the minimum permission needed for glance to read its keyring and access the glance pool in CEPH.
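If the keyring ended up owned by root or otherwise unreadable by the glance service user, something along these lines brings it back to the state shown above (the keyring name is the one from the example; adjust it to your deployment), followed by a restart of glance-api:

chown glance:glance /etc/ceph/ceph.client.glancepool.keyring
chmod 440 /etc/ceph/ceph.client.glancepool.keyring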
Have fun!