Artificial intelligence (AI) and machine learning (ML) drive much of the world around us, from the apps on our phones to electric cars on the highway. Making such things run as accurately as possible requires collecting and understanding huge amounts of data. At the helm of that critical information are data scientists. So, what does a day on the job look like for data scientists at Red Hat?
Don Chesworth, Principal Data Scientist, gives you a glimpse into his day-to-day in a short video (aptly named “A Day in the Life of a Red Hat Data Scientist”) that’s now available on our website. Isabel Zimmerman, Data Science Intern, provides a look at some of the tools she uses on the job in “Using Open Data Hub as a Red Hat Data Scientist.” We’ll cover some of the highlights in this post.
Data scientists turn data into business insights
It’s been nearly a decade since Harvard Business Review identified data science as one of the hottest jobs of the 21st century, and the technology supporting people in this role has come a long way. Back then, data scientists not only had to come to the table with an innate curiosity, but they also had to “fashion their own tools” to analyze data and visualize it for stakeholders.
Today, tools available in the Open Data Hub and Red Hat OpenShift help data experts focus on understanding and analyzing data instead of managing infrastructure.
Zimmerman explains that a data scientist isn't just someone who trains models; they also turn data into business insights. “Businesses don't have a one-size-fits-all method for machine learning systems,” she says.
“A well-architected model may be useful for getting insights into data, but oftentimes in order to gain business value, models have to be deployed as part of a larger intelligent application that's constantly learning from data and making inferences on dynamic data streams.”
Data scientists can find a one-stop, end-to-end platform with Open Data Hub
Open Data Hub is an AI/ML platform that brings together different open source AI tools into a one-stop install. The click of a button starts Red Hat OpenShift with the Open Data Hub Operator already installed.
Within the platform, data scientists can create models using Jupyter Notebooks and select from popular tools like Apache Spark for developing them. While the data science workflow normally ends when the model is built and validated, it's still important to monitor the model to make sure that it stays healthy. Prometheus, another tool available in Open Data Hub, collects metrics that Grafana can read, so data scientists can build dashboards to keep an eye on the model’s health and performance.
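To make the monitoring step more concrete, here is a minimal Python sketch of how a served model might publish health metrics in the Prometheus text exposition format, the plain-text format Prometheus scrapes over HTTP. The ModelMetrics class, the metric names, and the "recommender" label are assumptions made for illustration and do not come from Zimmerman's video; in practice a Prometheus client library would likely do this work.

```python
from dataclasses import dataclass


@dataclass
class ModelMetrics:
    """Tracks a few hypothetical health metrics for a served model."""

    model_name: str
    predictions_total: int = 0
    latency_seconds_sum: float = 0.0

    def observe(self, latency_seconds: float) -> None:
        """Record one prediction and how long it took."""
        self.predictions_total += 1
        self.latency_seconds_sum += latency_seconds

    def render(self) -> str:
        """Render the metrics in the Prometheus text exposition format."""
        label = f'{{model="{self.model_name}"}}'
        lines = [
            "# HELP model_predictions_total Predictions served so far.",
            "# TYPE model_predictions_total counter",
            f"model_predictions_total{label} {self.predictions_total}",
            "# HELP model_latency_seconds_sum Total time spent predicting.",
            "# TYPE model_latency_seconds_sum counter",
            f"model_latency_seconds_sum{label} {self.latency_seconds_sum}",
        ]
        return "\n".join(lines) + "\n"


# Hypothetical usage: record two predictions, then print what a scrape would see.
metrics = ModelMetrics(model_name="recommender")
metrics.observe(0.25)
metrics.observe(0.5)
print(metrics.render())
```

A dashboard could then chart the prediction count over time, or divide the latency sum by the count to approximate average latency.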
In her video, Zimmerman demonstrates how to build, deploy and monitor ML models using Open Data Hub. Open Data Hub can also host the model outside of the Jupyter Notebook, giving easy access to both the data scientist and the rest of the team, which may include software engineers or front end developers.
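As a rough illustration of what hosting a model outside the notebook means for teammates, the Python sketch below wraps a stand-in model in a small HTTP endpoint that accepts JSON and returns a prediction. Everything here is hypothetical: the weights, the /predict path, and the request shape are invented for the example, and it uses only the Python standard library rather than the serving tools that ship with Open Data Hub.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

WEIGHTS = [0.5, 0.25]  # hypothetical model parameters, not a real trained model


def handle_predict(raw_body: bytes) -> bytes:
    """Decode a JSON request, score it, and encode a JSON response."""
    features = json.loads(raw_body)["features"]
    prediction = sum(w * x for w, x in zip(WEIGHTS, features))
    return json.dumps({"prediction": prediction}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    """Answers every POST with a prediction for the posted features."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = handle_predict(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def main():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # A teammate's client only needs the URL and the JSON contract.
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    payload = json.dumps({"features": [2.0, 4.0]}).encode("utf-8")
    conn.request("POST", "/predict", body=payload,
                 headers={"Content-Type": "application/json"})
    print(json.loads(conn.getresponse().read()))
    conn.close()

    server.shutdown()
    server.server_close()


if __name__ == "__main__":
    main()
```

Keeping the scoring logic in a plain function, separate from the HTTP handler, means the same code can be exercised from a notebook cell and from the hosted endpoint.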
The tools available on Open Data Hub help data scientists like Zimmerman deploy models through the Seldon operator without having to be a front end developer. From data ingestion to model creation, testing, and visualization, Open Data Hub makes it easier for data scientists to do their jobs.
Open Data Hub also provides data scientists an opportunity to contribute upstream
Since the platform is open source, anybody can contribute code. Chesworth notes that what’s exciting about being a data scientist at Red Hat is that “things like contributing code upstream and focusing on the hybrid and containerized in your code is highly encouraged.”
He has a recommender system and has containerized its code. It's portable and can run on his local machine, on a bare metal server, in the cloud, and on Red Hat OpenShift. He also runs it with Open Data Hub.
His code is set up so that it can use a CPU, a GPU or multiple GPUs. While containerizing and distributing ML workloads, Chesworth noticed a trade-off: containers are built to be nimble, and because of that, a container has very little shared memory space. “You have to jump through quite a few hoops to increase that shared memory size,” he says.
Working with the Open Data Hub team, he submitted improvements for changing Red Hat OpenShift shared memory size across multiple GPUs. Chesworth explains, “I worked with the Open Data Hub team, and they contributed upstream to CRI-O and made a change to make it a lot easier to change your shared memory size. That change went into CRI-O 1.20, which then went into Kubernetes 1.20.”
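For readers who want to see the shared memory constraint for themselves, the following Python sketch checks how much shared memory a container exposes and picks a run mode based on how many GPUs are visible. It is an illustration only, not Chesworth's code: the function names and the run mode labels are hypothetical, and it assumes a Linux container where shared memory is mounted at /dev/shm.

```python
import os
import shutil
from typing import Optional


def shared_memory_bytes(path: str = "/dev/shm") -> Optional[int]:
    """Return the total size of the shared memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).total


def shm_is_sufficient(required_bytes: int, path: str = "/dev/shm") -> bool:
    """Report whether the shared memory mount is at least the requested size."""
    total = shared_memory_bytes(path)
    return total is not None and total >= required_bytes


def pick_strategy(gpu_count: int) -> str:
    """Map the number of visible GPUs to a hypothetical run mode."""
    if gpu_count < 0:
        raise ValueError("gpu_count cannot be negative")
    if gpu_count == 0:
        return "cpu"
    if gpu_count == 1:
        return "single-gpu"
    return "multi-gpu"


# Hypothetical usage: warn before starting a job that needs 1 GiB of shared memory.
needed = 1024 ** 3
if not shm_is_sufficient(needed):
    print("Shared memory is smaller than 1 GiB; the job may need a larger size.")
print("Run mode:", pick_strategy(gpu_count=2))
```

A check like this can run at container start-up, so a job fails early with a clear message instead of partway through training.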
Red Hat is an open source company, and many Red Hatters work to support and contribute to community projects like the Open Data Hub, which lays the foundation for our internal data science and AI platform.
A day in the life, and more
Time is valuable for data scientists. Tools available through the Open Data Hub help them do data science without also balancing the role of cloud architect or front end developer. This can free up more time to solve critical business needs.
“The Open Data Hub simplifies the end-to-end machine learning workflow, and gives me the tools I need to put my model into production,” says Zimmerman.
To learn more about what a Red Hat data scientist does, we invite you to check out these two recently released videos. From AI/ML to containers, there’s even more to discover from our subject matter experts. Just stop by the Red Hat video library and have a look, and be sure to subscribe to the Red Hat channel on YouTube for more!
About the author
As the Managing Editor of the Red Hat Blog, Thanh Wong works with technical subject matter experts to develop and edit content for publication. She is fascinated with learning about new technologies and processes, and she's vested in sharing how they can help solve problems for enterprise environments. Outside of Red Hat, Wong hears a lot about the command line from her system administrator husband. Together, they're raising a young daughter and live in Maryland.