Cephadm: Good bye ceph-deploy

As you probably may know, ceph-deploy, the beloved deployment utility for CEPH, is no longer maintained. Cephadm is the new tool/package to deploy CEPH clusters.

CERN has a pretty good intro PDF to it.  Cephadm includes many nice features including the ability to adopt running CEPH clusters.

Two quick notes that’ll save you some time

  • While adding hosts using the
ceph orch host add hostname

You need to specify the IP of the host as follows

ceph orch host add hostname IP.IP.IP.IP

if you get the following error, despite injecting the ssh keys correctly

Error ENOENT: Failed to connect to hostname (hostname).  Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to run: 
> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get mgr/cephadm/ssh_identity_key) root@hostname
  • When adding OSDs

If you are deploying a cluster with a “relatively” moderate number of OSDs per host, you may run into the following error scenario while using:

 ceph orch apply osd --all-available-devices

The command basically adds the available hdds/ssds to be part of your cluster. Under the hood, this is done by running a docker container that’s in charge of that OSD. Basically the following command is run

/bin/bash /var/lib/ceph/{FSID}/osd.{NUM}/unit.run

It does that for every available OSD in your hosts. You may find that some of the OSDs don’t start and are stuck in error start despite your efforts to use

ceph orch daemon restart osd.xx

If you dig deeper (by executing the docker shell directly or looking into the logs) , you will find the following self-explanatory error

 /var/lib/ceph/osd/ceph-xx/block) _aio_start io_setup(2) failed with EAGAIN; try increasing /proc/sys/fs/aio-max-nr

The solution is simply to set the asynchronous non-blocking IO into a higher value using

sudo sysctl -w fs.aio-max-nr=1048576

If that solves your issue, apply it to sysctl.conf to persist

Happy cephadmining 🙂




Leave a Reply

Your email address will not be published. Required fields are marked *