Recent advances in machine learning (ML), including in image processing and analysis, have made the technology extremely useful for enterprise architects. In my previous article, I shared three ways architects can use machine learning.

In this article, I will describe an intelligent application ecosystem that uses convolutional neural networks (CNN) to process labeled X-ray images for pneumonia risk detection. This project uses ML workloads on Red Hat OpenShift to help healthcare professionals triage patients with a risk of pneumonia based on the ML model's prediction.

[ Learn how to accelerate machine learning operations (MLOps) with Red Hat OpenShift. ]

Introduction

The following diagram depicts the different entities interacting in the end-to-end process and the various software components required to implement the showcase.

Components of the ML solution

The image shows the project's vision for a collaboration between X-ray technicians at several hospitals and a research center. The research center uses the hospitals' data to build an ML model that can predict a patient's pneumonia risk. Doctors then use the risk assessment to perform patient triage.

Create an ML model to satisfy disparate users

My Git repo describes the steps to implement a pneumonia risk detection ecosystem. When creating the showcase, I focused on the following user stories:

  • As a data scientist, I want to develop an image classification model for chest X-ray images using Jupyter Hub (lab and notebooks) as my preferred research environment.
  • As a data scientist, I want my model to be deployed quickly so that other applications may use it.
  • As a full-stack developer, I want quick access to resources that support the business logic of my applications, including databases, storage, and messaging.
  • As a full-stack developer, I want an automated build process to support new releases and code updates as soon as they are available in a Git repository.
  • As an operations engineer, I want an integrated monitoring dashboard for new applications available on the production infrastructure.

I'll break down these roles in more detail below.

[ Download Kubernetes Patterns: Reusable elements for designing cloud-native applications. ]

Data scientists

Data scientists are the domain experts for designing and building machine learning models. These models are created for a specific purpose by providing the available data as input for the modeling process. In this case, the data is the X-ray images taken by the hospital technicians.

Data scientists aren't (usually) medical doctors, and they don't (usually) know how to distinguish between a healthy X-ray image and one that indicates pneumonia. Therefore, they rely on medical personnel to provide one set of already examined and labeled X-rays from healthy patients and another set showing pneumonia. The data scientists then apply their expertise in building image classification models (in this case, convolutional neural networks) that can predict with high accuracy whether an image shows signs of pneumonia.
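
To make the modeling step concrete, here is a minimal sketch of the kind of notebook code involved, using TensorFlow/Keras. The directory layout, image size, and network architecture are illustrative assumptions, not the exact code from the repository.

```python
# Minimal sketch (assumptions, not the repository's notebook): a small CNN
# for binary chest X-ray classification with TensorFlow/Keras.
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (180, 180)

# Assumes images are organized as chest_xray/train/{NORMAL,PNEUMONIA}, etc.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "chest_xray/train", image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "chest_xray/val", image_size=IMG_SIZE, batch_size=32)

model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # pneumonia probability
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)
model.save("pneumonia_cnn.h5")
```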

Once data scientists build a satisfactory model, they next want to deploy that model into a production system so that it may be used. They also want this process to be as simple as possible since data scientists aren't usually knowledgeable about application deployment processes.
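
As a rough illustration of what "deploying the model" can look like, the following sketch wraps the saved model from the previous example in a simple Flask REST endpoint. This is an assumption for illustration; the repository relies on OpenShift tooling for the actual deployment.

```python
# Minimal model-serving sketch (illustrative only): a Flask endpoint that
# loads the saved Keras model and returns a pneumonia risk score.
import io

import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)
model = tf.keras.models.load_model("pneumonia_cnn.h5")

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a raw image in the request body
    img = Image.open(io.BytesIO(request.data)).convert("RGB").resize((180, 180))
    batch = np.expand_dims(np.asarray(img, dtype=np.float32), axis=0)
    risk = float(model.predict(batch)[0][0])  # sigmoid output: pneumonia risk
    return jsonify({"pneumonia_risk": risk})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```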

[ What is edge machine learning? ]

Full-stack developers

Full-stack developers are "generalist" software developers. They create the application that exposes the ML model's predictions to a medical doctor, who reviews the results and prescribes treatment for the patient.

To create an application with a frontend and backend, the developers must integrate several software components, including:

  • Messaging for passing along information such as the X-ray image metadata from the hospital storage systems to the machine learning model
  • Databases for storing the prediction results and any other information required by the frontend application to display predictions to doctors and any user information for accessing the frontend application
  • Storage for storing processed images

This is a complex ecosystem, and successful software development depends on quick access to these base resources.
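
To illustrate the kind of glue code this involves, here is a minimal sketch of a backend worker that consumes X-ray image metadata from a Kafka topic, calls the prediction service, and stores the result in a database. The topic name, table schema, connection details, and service URL are assumptions, not the repository's actual implementation.

```python
# Minimal integration sketch (assumed names throughout): consume image
# metadata from Kafka, call the prediction service, persist the result.
import json

import psycopg2
import requests
from kafka import KafkaConsumer  # kafka-python

consumer = KafkaConsumer(
    "xray-images",                              # assumed topic name
    bootstrap_servers="kafka:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
db = psycopg2.connect(host="postgresql", dbname="triage",
                      user="app", password="secret")

for message in consumer:
    meta = message.value                        # e.g. {"patient_id": ..., "image_url": ...}
    image_bytes = requests.get(meta["image_url"]).content
    prediction = requests.post("http://model-server:8080/predict",
                               data=image_bytes).json()
    with db, db.cursor() as cur:                # commits on successful exit
        cur.execute(
            "INSERT INTO predictions (patient_id, image_url, risk) VALUES (%s, %s, %s)",
            (meta["patient_id"], meta["image_url"], prediction["pneumonia_risk"]),
        )
```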

[ Try OpenShift Data Science in our Developer sandbox or in your own cluster. ]

Operations engineers

Operations engineers want to know how the application ecosystem behaves once it is live in a production environment. They want to know when there is a problem, or at least have data they can pass along to other specialists to determine whether there is a problem with the software system. For this purpose, they need centralized instrumentation of the software components through monitoring tools and dashboards.
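
As a sketch of what that instrumentation can look like on the application side, the following example exposes Prometheus metrics that the OpenShift monitoring stack can scrape and display on dashboards. The metric names, port, and simulated workload are illustrative assumptions.

```python
# Minimal instrumentation sketch (assumed metric names): expose Prometheus
# metrics from the prediction service so monitoring dashboards can scrape them.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

PREDICTIONS = Counter("pneumonia_predictions_total", "Number of predictions served")
LATENCY = Histogram("prediction_latency_seconds", "Prediction request latency")

def handle_prediction():
    with LATENCY.time():                        # records request duration
        time.sleep(random.uniform(0.05, 0.2))   # stand-in for real inference work
    PREDICTIONS.inc()

if __name__ == "__main__":
    start_http_server(8000)                     # metrics exposed at :8000/metrics
    while True:
        handle_prediction()
```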

The architect user story

When designing a solution, an architect needs to be aware of the software's end users and the different personas involved in the development and maintenance processes. As part of the software development process, an architect considers the technologies required to build an optimal solution and how to make these technologies available to software developers, testers, and other stakeholders (including data scientists and operations engineers).

Therefore, selecting the right platform to develop and deploy the different software components is critical. As a Kubernetes platform, Red Hat OpenShift can host and provide seamless access to the various resource types typically required to build a complex software solution. It also simplifies the integration of these resources with the other software components. Finally, it answers the demands of the different stakeholders involved in the software solution's development, deployment, and maintenance.

For a full step-by-step guide on building this model on OpenShift, please check out my Git repo.


About the author

Arthur is a senior data scientist specialist solution architect at Red Hat Canada. With the help of open source software, he is helping organizations develop intelligent application ecosystems and bring them into production using MLOps best practices.

He has over 15 years of experience in the design, development, integration, and testing of large-scale service enablement applications.

Arthur is pursuing his PhD in computer science at Concordia University, and he is a research assistant in the Software Performance Analysis and Reliability (SPEAR) Lab. His research interests are related to AIOps, with a focus on performance and scalability optimization.
