Recent advances in machine learning (ML), including in image processing and analysis, have made the technology extremely useful for enterprise architects. In my previous article, I shared three ways architects can use machine learning.
In this article, I will describe an intelligent application ecosystem that uses convolutional neural networks (CNNs) to process labeled X-ray images for pneumonia risk detection. This project runs ML workloads on Red Hat OpenShift to help healthcare professionals triage patients at risk of pneumonia based on the ML model's predictions.
[ Learn how to accelerate machine learning operations (MLOps) with Red Hat OpenShift. ]
Introduction
The following diagram depicts the different entities interacting in the end-to-end process and the various software components required to implement the showcase.
The image shows the project's vision for a collaboration between X-ray technicians at several hospitals and a research center. The research center uses the hospitals' data to build an ML model that can predict a patient's pneumonia risk. Doctors then use the risk assessment to perform patient triage.
Create an ML model to satisfy disparate users
My Git repo describes the steps to implement a pneumonia risk detection ecosystem. When creating the showcase, I focused on the following user stories:
- As a data scientist, I want to develop an image classification model for chest X-ray images using JupyterHub (with JupyterLab and notebooks) as my preferred research environment.
- As a data scientist, I want my model to be deployed quickly so that other applications may use it.
- As a full-stack developer, I want quick access to resources that support the business logic of my applications, including databases, storage, and messaging.
- As a full-stack developer, I want an automated build process to support new releases and code updates as soon as they are available in a Git repository.
- As an operations engineer, I want an integrated monitoring dashboard for new applications deployed on the production infrastructure.
I'll break down these roles in more detail below.
[ Download Kubernetes Patterns: Reusable elements for designing cloud-native applications. ]
Data scientists
Data scientists are the domain experts for designing and building machine learning models. Each model is built for a specific purpose from the available data; in this case, the data is the X-ray images taken by the hospital technicians.
Data scientists aren't (usually) medical doctors, and they don't (usually) know how to distinguish between a clean X-ray image and one that indicates pneumonia. Therefore, they rely on the medical personnel's expertise to provide one set of already examined and categorized images of healthy X-rays and another set of X-rays showing pneumonia. The data scientists then apply their expertise to build image classification models (in this case, using convolutional neural networks) that predict with high accuracy whether an image shows signs of pneumonia.
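As a rough illustration of what such a model might look like, here is a minimal PyTorch sketch of a binary chest X-ray classifier. The layer sizes, input resolution, and class names are illustrative assumptions, not the architecture used in the actual project:

```python
import torch
import torch.nn as nn

class PneumoniaCNN(nn.Module):
    """Toy CNN for binary X-ray classification (normal vs. pneumonia).
    Architecture and sizes are illustrative only."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # 1 channel: grayscale
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 2),  # two classes: normal, pneumonia
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = PneumoniaCNN()
batch = torch.randn(4, 1, 64, 64)  # four fake 64x64 grayscale X-rays
logits = model(batch)              # one logit pair per image
```

A real model would be trained on the labeled image sets described above; this sketch only shows the forward pass and the data shapes involved.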
Once data scientists build a satisfactory model, they next want to deploy that model into a production system so that it may be used. They also want this process to be as simple as possible since data scientists aren't usually knowledgeable about application deployment processes.
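One common way to make a trained model usable by other applications is to wrap it in a small REST service. The sketch below stubs out the model call so it stays self-contained; the endpoint path, payload fields, and fixed score are assumptions, not the project's actual API:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # In a real deployment the model would be loaded once at startup
    # and invoked here; a constant keeps this sketch self-contained.
    payload = request.get_json()
    risk = 0.5  # placeholder for model(payload["image"])
    return jsonify({"image_id": payload.get("image_id"),
                    "pneumonia_risk": risk})
```

On OpenShift, a service like this can be containerized and exposed behind a route so that the frontend and messaging components can call it.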
[ What is edge machine learning? ]
Full-stack developers
Full-stack developers are "generalist" software developers. They create an application that exposes the data scientist's prediction results from an ML model to a medical doctor who can review the prediction results and prescribe treatment for the patient.
To create an application with a frontend and backend, the developers must integrate several software components, including:
- Messaging for passing along information such as the X-ray image metadata from the hospital storage systems to the machine learning model
- Databases for storing the prediction results, any other information the frontend application needs to display predictions to doctors, and user information for accessing the frontend application
- Storage for storing processed images
This is a complex ecosystem, and successful software development depends on quick access to these base resources.
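For example, the message a hospital system passes to the model service might carry only the X-ray image's metadata, with the image itself remaining in storage. The field names below are hypothetical, chosen just to illustrate the pattern:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class XrayMetadata:
    """Hypothetical metadata record for one X-ray image."""
    image_id: str
    patient_id: str
    storage_url: str   # where the processed image lives
    taken_at: str      # ISO 8601 timestamp

def to_message(meta: XrayMetadata) -> bytes:
    """Serialize metadata so it can be published on a message broker."""
    return json.dumps(asdict(meta)).encode("utf-8")

msg = to_message(XrayMetadata("img-001", "pat-42",
                              "s3://xrays/img-001.png",
                              "2023-01-15T10:00:00Z"))
```

A producer would publish these bytes to a broker topic, and the model service would consume them, fetch the image from storage, and run the prediction.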
[ Try OpenShift Data Science in our Developer sandbox or in your own cluster. ]
Operations engineers
Operations engineers want to know how the application ecosystem behaves once it is live in a production environment. They want to know when there is a problem, or at least have access to data they can pass along to other specialists who can diagnose it. For this purpose, they need centralized instrumentation across the software components, with monitoring tools and dashboards.
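As an illustration, a Python model service could expose such instrumentation with the prometheus_client library, which OpenShift's monitoring stack can scrape. The metric names here are made up for the sketch:

```python
from prometheus_client import Counter, Histogram, generate_latest

# Hypothetical metrics for the prediction service.
PREDICTIONS = Counter("pneumonia_predictions_total",
                      "Number of predictions served")
LATENCY = Histogram("pneumonia_prediction_seconds",
                    "Time spent producing a prediction")

@LATENCY.time()
def serve_prediction():
    """Stand-in for the real prediction handler."""
    PREDICTIONS.inc()
    return 0.5

serve_prediction()
metrics = generate_latest().decode()  # Prometheus exposition text
```

Exposing these metrics on a `/metrics` endpoint lets a dashboard show request volume and latency without touching the application's business logic.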
The architect user story
When designing a solution, an architect needs to be aware of the software's end users and of the different personas involved in the development and maintenance processes. As part of the software development process, an architect considers the technologies required to build an optimal solution and how those technologies are made available to software developers, testers, and other stakeholders (including data scientists and operations engineers).
Therefore, selecting the right platform to develop and deploy the different software components is critical. As a Kubernetes platform, Red Hat OpenShift can host and provide seamless access to the various resource types typically required to build a complex software solution. It also simplifies the integration of these resources with the other software components. Finally, it answers the demands of the different stakeholders involved in the software solution's development, deployment, and maintenance.
For a full step-by-step guide on building this model on OpenShift, please check out my Git repo.
About the author
Arthur is a senior data scientist specialist solution architect at Red Hat Canada. With the help of open source software, he is helping organizations develop intelligent application ecosystems and bring them into production using MLOps best practices.
He has over 15 years of experience in the design, development, integration, and testing of large-scale service enablement applications.
Arthur is pursuing his PhD in computer science at Concordia University, and he is a research assistant in the Software Performance Analysis and Reliability (SPEAR) Lab. His research interests are related to AIOps, with a focus on performance and scalability optimization.