Red Hat Ceph Storage 5 introduces cephadm, a new integrated control plane that is part of the storage system itself and therefore has a complete understanding of the cluster’s current state, something external tools could never fully achieve because they sat outside the cluster. Among its many advantages, cephadm’s unified control of the state of a storage cluster significantly simplifies operations.
Replacing failed drives made easy
For example, the older process for replacing drives with ceph-ansible required multiple steps and ran processes that enforced configuration on all nodes, when all that was needed was an update to a single node’s configuration. Working around drive encryption at times added further complexity.
New ways: replacing a failed drive with cephadm
When a drive eventually fails, the OSD of that drive needs to be removed from the cluster. This command removes the OSD from a cephadm-managed cluster:
ceph orch osd rm <svc_id(s)> --replace
This command evacuates the remaining placement groups from the OSD and marks the OSD as scheduled for replacement while keeping it in the CRUSH hierarchy.
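As a rough illustration of the step above, the following Python sketch builds the removal command and a follow-up status query as argument lists. The helper names `osd_replace_commands` and `run`, and the OSD id `1`, are hypothetical and chosen only for this example. By default the commands are printed rather than executed.

```python
import shlex
import subprocess


def osd_replace_commands(osd_ids):
    """Build the ceph CLI calls that drain OSDs and flag them for replacement."""
    if not osd_ids:
        raise ValueError("at least one OSD id is required")
    ids = [str(int(i)) for i in osd_ids]  # int() rejects anything that is not a numeric id
    return [
        # Drain the OSDs and mark them for replacement, keeping them in the CRUSH hierarchy.
        ["ceph", "orch", "osd", "rm", *ids, "--replace"],
        # Ask the orchestrator how far the pending removal has progressed.
        ["ceph", "orch", "osd", "rm", "status"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(osd_replace_commands([1]))  # hypothetical OSD id, printed only
```

Keeping the dry run as the default makes it possible to review the exact command line before anything is drained.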
On supported hardware enclosures, the system can also blink the drive’s LED to help the administrator locate the specific disk that failed:
ceph device light on|off <devid>
Where <devid> is a device ID that can be obtained with the command:
ceph device ls
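The lookup and the LED command can be combined in a small script. The sketch below assumes that `ceph device ls --format json` returns a list of objects with `devid` and `daemons` fields; that output shape is an assumption to verify against your cluster. The helper names and the sample device IDs are made up for illustration.

```python
import json


def devices_for_daemon(device_ls_json, daemon):
    """Return the device ids listed for a daemon in the (assumed) JSON device listing."""
    devices = json.loads(device_ls_json)
    return [d["devid"] for d in devices if daemon in d.get("daemons", [])]


def light_command(devid, state):
    """Build the `ceph device light` call that switches a drive LED on or off."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return ["ceph", "device", "light", state, devid]


if __name__ == "__main__":
    # Made-up sample in the assumed output shape; real device ids will differ.
    sample = json.dumps([
        {"devid": "VENDOR_MODEL_SERIAL1", "daemons": ["osd.1"]},
        {"devid": "VENDOR_MODEL_SERIAL2", "daemons": ["osd.2"]},
    ])
    for devid in devices_for_daemon(sample, "osd.1"):
        print(" ".join(light_command(devid, "on")))
```

Remember to switch the LED off again with the same command and `off` once the drive has been swapped.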
If the OSD was created by cephadm, the OSD is recreated automatically as soon as a new drive is inserted. cephadm is aware of the at-rest disk encryption setup, if one is present, and will transparently negotiate with the monitors to use the appropriate keys when encrypting the new drive. That’s it. The replacement process is complete.
If the OSD was created manually or by ceph-ansible, cephadm needs to be told how to recreate that OSD by applying an OSD specification like the following:
service_type: osd
service_id: osd
placement:
  hosts:
  - myhost
data_devices:
  paths:
  - /path/to/the/device
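For anyone who generates such specifications per host, here is a minimal Python sketch that renders the same fields as YAML text and builds the matching `ceph orch apply -i` call. The helper names `render_osd_spec` and `apply_command` and the file name `osd_spec.yml` are hypothetical. The sketch adds `--dry-run` by default so the orchestrator only previews the change.

```python
def render_osd_spec(host, device_paths, service_id="osd"):
    """Render a single-host OSD service specification as YAML text."""
    if not device_paths:
        raise ValueError("at least one device path is required")
    lines = [
        "service_type: osd",
        f"service_id: {service_id}",
        "placement:",
        "  hosts:",
        f"  - {host}",
        "data_devices:",
        "  paths:",
    ]
    lines += [f"  - {path}" for path in device_paths]
    return "\n".join(lines) + "\n"


def apply_command(spec_file, dry_run=True):
    """Build the `ceph orch apply` call; --dry-run previews without applying."""
    argv = ["ceph", "orch", "apply", "-i", spec_file]
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    # Placeholder host and device path, as in the specification above.
    print(render_osd_spec("myhost", ["/path/to/the/device"]), end="")
    print(" ".join(apply_command("osd_spec.yml")))  # hypothetical file name
```

Save the rendered text to the spec file, review the preview, and then run the command again without `--dry-run`.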
But that is not the entire story. The same process can also be managed from the management UI in an interactive, step-by-step fashion.
Replacing a Failed OSD from the Dashboard
Failed OSDs in a Ceph Storage cluster can also be replaced by a junior administrator with appropriate role-based access control (RBAC) permissions on the Dashboard. OSD IDs can be preserved while replacing the failed OSDs, which is both operationally easier to manage (each host keeps a fixed set of IDs) and better for memory usage (gaps in the OSD ID sequence are undesirable).
The cluster administrator can thus use the Dashboard’s RBAC capabilities to delegate drive replacement to a trainee, without granting additional permissions that the junior administrator is not yet qualified to use, as detailed in the following short video.
About the authors
Federico Lucifredi is the Product Management Director for Ceph Storage at Red Hat and a co-author of O'Reilly's "Peccary Book" on AWS System Administration.
Ernesto Puerta Treceno is a Principal Software Engineer at Red Hat.
Paul Cuzner is a Principal Software Engineer working within Red Hat's Cloud Storage and Data Services team. He has more than 25 years of experience within the IT industry, encompassing most major hardware platforms from IBM mainframe to commodity x86 servers. Since joining Red Hat in 2013, Cuzner's focus has been on applying his customer and solutions-oriented approach to improving the usability and customer experience of Red Hat's storage portfolio.
Cuzner lives with his wife and son in New Zealand, where he can be found hacking on Ceph during the week and avoiding DIY jobs around the family home on weekends.