Red Hat is continually innovating, and part of that innovation includes researching and striving to solve the problems our customers face. That innovation is driven through the Office of the CTO and includes Red Hat OpenShift, Red Hat OpenShift Container Storage and innovative projects such as the Open Data Hub. We recently interviewed Juana Nakfour, Senior Software Engineer in the AI Center of Excellence for the Office of the CTO at Red Hat, about this very topic.
Hi Juana. For starters, why don’t you tell us about your project?
So, our project is an open source community project called Open Data Hub. It’s an end-to-end AI/ML platform.
What’s “end-to-end” mean, in this case?
“End-to-end” means we provide all the tools for all the users of an AI/ML platform, from the data engineer to the data scientists to the DevOps and business intelligence users.
Gotcha. What’s it based on?
It's based entirely on open source projects such as JupyterHub, Apache Spark, Seldon, Prometheus, Grafana and Argo. So there's a lot of flexibility in getting the community to innovate and contribute. It is easy to use, because it's very easy to install and those tools are straightforward to use. You can download it today, in a Red Hat OpenShift installation, from the catalog.
What do you think led to these projects happening in the first place?
I think behind a lot of challenges today, with regard to the data scientists and developers in AI/ML software engineering, is the fact that there are a lot of different concepts they have to be aware of. Open Data Hub makes it easier for them to just jump in and start writing their AI/ML-specific code, and not worry about where it is running, or what containers it is running on. These projects help the community more than anything in AI/ML workflow development.
In your own words, can you explain the importance of being open source?
I like the word “open,” meaning you can see every single line of code. And not only can you see it, but you can make improvements. You have the freedom to make improvements. That means you have this large group of people coming together, and that's where innovation happens. They all come together, discuss things, and try to fix things. That's where you see innovation happening for us.
For Red Hat and for Open Data Hub, that's our number one.
What are some other benefits that will come with being open source?
I think the open source part also plays into how flexible the platform is. From an Open Data Hub perspective, or any open source project, most of the time it's modular, so you can take components, and put components in according to your specifications. You pick tools and install them based on your individual requirements and needs.
Operators are the focus, I suppose.
Open Data Hub is a meta-Operator that packages a lot of tools together, so it can easily install an end-to-end AI/ML platform at once. Just the fact that they're modular and all connected together means you can use module A together with module P together with module E, which makes it easier for data scientists and engineers to develop faster.
These are very complex systems, though.
AI/ML is a complicated system. And there are a lot of modules that need to work together, from data ingestion at the beginning, to data science and data analysis in the middle, to model serving, DevOps and monitoring at the end. It’s a complicated system, and we're putting it all together and providing it for users.
So like you just said, it sounds quite complex and complicated. How do you make sure that you avoid any glitches and make sure it's all running smoothly?
That was going to be my next question.
First, it's open source. This is community driven. So any issues you have, you can come to the community.
But we also have a monitoring system at the end of the deployment that I mentioned, which actually monitors how your models are performing and gives you feedback to see if there are any issues. And OpenShift, as a platform itself, has a lot of monitoring tools that help DevOps operators figure out where the problem is, where the issues are.
It has tools to restrict resource usage, to avoid issues with the platform. There are always balances. But the tools we have to resolve bugs are the tools everyone is using to track them, and every member of the team is in the trenches, writing code, solving bugs.
Final question: Personally, what is the most exciting part of this work?
I like to invent things. Especially when I can invent something that provides something new to the open source community. I guess just contributing back to the open source community makes me feel like I've added something to the open source world, and to users' experience. And to the whole AI/ML platform.
Thanks Juana!
Thank you.
Get Started with Open Data Hub Today
You can get started with Open Data Hub with the documentation and quick installation guide. The Open Data Hub Operator can be installed from the community operators in the Catalog provided with an OpenShift 4.x installation. You can also join our open source community and contribute issues and components using our GitLab repo. For the latest tutorials and videos, please visit our OpenShift AI/ML YouTube channel and our OpenShift AI/ML information page.