Red Hat’s approach to site reliability engineering (SRE)

Helping customers save time, move faster, and reduce complexity

Organizations can move more efficiently with cloud services supported by site reliability engineering practices that make IT work at scale.


Why you need an SRE team working for you

A site reliability engineering (SRE) team uses software as a tool to manage systems, solve problems, and automate tasks at scale. When you choose Red Hat® cloud services for app development, you are supported by an SRE team that manages the security, observability, performance, scalability, and cost optimization of your IT systems.


Dedicated platform management

The Red Hat SRE team takes responsibility for the ongoing management and security of your application platform. This means building new services and features, automating wherever possible to support scale, and focusing on the ongoing observability and reliability of Red Hat clusters.

The Red Hat SRE team is dedicated to supporting your core platforms by providing automated insights, improving workload performance, and helping you follow best practices for cloud-native development.

As digital spaces evolve and technologies advance, your resources are better spent on your core competencies and innovation, rather than managing infrastructure. With the backing of the Red Hat SRE team, you can focus on what you do best—the work that represents your true advantages.


How Red Hat’s SRE team helps

The Red Hat SRE team works with many teams behind the scenes. We often think of it as the "hub of the wheel," connecting everyone and orchestrating events, both proactively and reactively, to make your experience as seamless as possible.

The SRE team’s responsibilities include:

Building services

  • Manage and monitor Red Hat OpenShift® hosted environments.
  • Develop new features.
  • Handle "Day 1" operations: build and deploy managed clusters.

Automating for scale

  • Automate everything, including upgrades, certificate management, and capacity scaling.
  • Use repeatable processes to reduce risk, improve the user experience, and enable faster delivery.

Observability and reliability

  • Handle "Day 2" operations such as lifecycle management, monitoring, and patching.
  • Respond proactively and reactively to customers, partners, cloud providers, and the upstream community.

A day in the life of an SRE

Working behind the scenes, a Red Hat OpenShift cloud services SRE handles a range of essential responsibilities.

Work with leading cloud providers

Our goal is to integrate tools that make your entire system more manageable and efficient without complicating your IT portfolio. That’s why our SRE team works in tandem with leading cloud providers, including Amazon Web Services (AWS), Google Cloud, and Microsoft.

Solutions for multi-cloud environments

Because of our strategic relationships with the product and engineering teams at AWS, Google, and Microsoft, we have a unique ability to serve our customers’ multi-cloud needs across the major cloud providers. Our customers benefit from the cross-cloud knowledge the Red Hat SRE team gains from managing OpenShift across multiple cloud providers.

The Red Hat SRE team manages, scales, and automates Red Hat application and data services, as well as OpenShift clusters, as part of the Red Hat Cloud Services portfolio. This lets you focus on developing applications quickly while reducing the time and cost of managing them.

The team’s proactive monitoring and support can deliver clusters that are more secure and stable than those you would get by managing the application platform on your own. We can align OpenShift clusters to specific compliance certifications as well as your own security policies.

With the Red Hat SRE team, you can avoid the costs, learning curve, and ongoing resource constraints of doing it all yourself.


Explore related resources

  • Red Hat SRE services
  • A RedMonk Conversation: SRE at Red Hat
  • How SRE teams boost DevOps
  • 5 ways site reliability engineers can help you

Explore Red Hat cloud services

Red Hat Cloud Services

Red Hat Cloud Services include hosted and managed platform, application, and data services.

Red Hat OpenShift Service on AWS

Red Hat OpenShift Service on AWS is a fully managed and jointly supported offering on the AWS public cloud.

Microsoft Azure Red Hat OpenShift

Azure Red Hat OpenShift is a fully managed service that is jointly engineered, managed, and supported by Microsoft and Red Hat.