At first glance, the idea of a self-healing IT infrastructure may seem intimidating. Can any infrastructure be simple and straightforward enough to fit into a cookie-cutter format? When you have a complex infrastructure, how can you think about software fixing other software by introducing self-healing?

I'll break down the self-healing concept into a simple form and try to decrease some of the intimidation. Then I'll build upon that example to show how it scales.

The four key components of a self-healing infrastructure are:

  • Current state: How do you monitor or stream the environment's current state?
  • Baseline: How do you compare your current state against a certain set of standards (such as risk, vulnerability, or compliance)?
  • Remedy: If the current state does not match the baseline standards, what remediates the drift or vulnerable state?
  • Automate: How do you automate the remediation process? It wouldn't be self-healing if there weren't an automated remediation, now would it?

A simple self-healing infrastructure

With those pieces defined, I'll provide a simple example of a self-healing infrastructure using a Red Hat architecture.

A simple self-healing infrastructure

I'll walk through this simplified example architecture, starting at the Red Hat platforms component, which includes a Red Hat Enterprise Linux (RHEL) client system. These systems are active, so I can monitor their current state using Red Hat Smart Management (which you may know as Red Hat Satellite). It gathers running data using the Red Hat Insights client on each system.

After anonymizing the data, it's sent to the hosted Red Hat Insights software, which compares the data against the baselines. These baselines are designed around rules established using Red Hat's knowledgebase and a set of user-defined policies. Comparing the baselines to the current state can determine if there are any red flags, such as active vulnerabilities, compliance gaps, or incomplete patches.

If Insights finds any concerns, the hosted services determine a remediation strategy and design solutions for the discovered issues. It returns the solutions as Ansible playbooks, enabling an automated recovery to equalize the current state with the set baselines.

[ Learn how to modernize your IT with managed cloud services in this free eBook. ]

Can it scale?

OK, not so bad up to this point, right? But that scenario is simple, generic, and static. It's not very realistic, as systems can be hosted in different locations and may change dynamically.

For example, most organizations today spread development, quality engineering, and production environments across a hybrid cloud environment, with some systems on-premises while others are in the cloud. Not just that, but business fluctuates, and peak seasons require large deployments fast. Once a peak season ends, infrastructure needs to scale down quickly to the typical level to manage costs and resources.

Fortunately, the same architecture I described above also manages scaling environments across a hybrid cloud estate. Whether you have systems on-premises, geographically far away (maybe in the public or private cloud), or any combination, you can achieve this elasticity.

This self-healing architecture can handle the demands of a complex, scaling environment like it did in the simple example above, as everything flows through that same channel. You can also leverage Insights' Drift service, which runs a system comparison to keep systems in sync, even if they're running in different environments.

Wrap up

I hope this simple, generalizable example makes the idea of a self-healing infrastructure less intimidating. If you can achieve all four components of a self-healing infrastructure in a simple environment, there's no reason you can't scale the solution to meet your needs, regardless of the infrastructure's complexity.

If you would like to learn more, check out Red Hat's Portfolio Architecture Center, where you can find our architectures on self-healing architectures and other use cases. Our team's goal is to take complex customer deployments, simplify them so they are easily digestible, and share them back out with the community to continue the innovation chain. If you'd like to learn more or have some ideas of your own, reach out to us; we'd love your feedback!


关于作者

Camry Fedei joined Red Hat in 2015, starting in Red Hat's support organization as a Support Engineer before transitioning to the Customer Success team as a Technical Account Manager. He then joined the Management Business Unit in Technical Marketing to help deliver a number of direct solutions most relevant to Red Hat's customers.

UI_Icon-Red_Hat-Close-A-Black-RGB

按频道浏览

automation icon

自动化

有关技术、团队和环境 IT 自动化的最新信息

AI icon

人工智能

平台更新使客户可以在任何地方运行人工智能工作负载

open hybrid cloud icon

开放混合云

了解我们如何利用混合云构建更灵活的未来

security icon

安全防护

有关我们如何跨环境和技术减少风险的最新信息

edge icon

边缘计算

简化边缘运维的平台更新

Infrastructure icon

基础架构

全球领先企业 Linux 平台的最新动态

application development icon

应用领域

我们针对最严峻的应用挑战的解决方案

Virtualization icon

虚拟化

适用于您的本地或跨云工作负载的企业虚拟化的未来