Red Hat and Intel are responding to the industry need for an open source, cloud-based platform optimized for data science operations. Because the platform is built with open source components, it can track the innovation occurring in the AI community. Intel’s AI strategy centers on openness and on giving customers choice in the hardware that best meets their application requirements. Intel also recognizes, however, that AI is primarily a software problem that requires a software solution. That is why oneAPI-powered Intel AI solutions (i.e., the Intel® AI Analytics and OpenVINO™ toolkits) are integrated into Red Hat OpenShift Data Science, with plans to integrate cnvrg.io and Habana Gaudi in the future.
Data scientists and developers are always on the lookout for new tools and technologies that make their jobs easier. Red Hat and Intel’s joint offering does just that – Red Hat OpenShift Data Science and Intel® AI tools leverage Red Hat OpenShift to help organizations make the most out of their data—curating and ingesting it, creating models, and deploying them into production—utilizing business processes for data governance, quality assessment, and integration.
What is AI/ML on Intel?
Intel's approach to artificial intelligence and machine learning (AI/ML) is driven by three principles that support the future of AI:
- Developing intelligent systems
- Optimizing hardware and software resources
- Collaborating with industry-leading partners to offer development platforms
Intel teamed up with Red Hat to integrate its AI portfolio with Red Hat OpenShift Data Science. The service is available on AWS, runs on Intel hardware and software, and includes integration with the Intel AI Analytics Toolkit powered by oneAPI. The toolkit provides essential tools for analyzing, visualizing, and optimizing data sets for machine learning and deep learning workloads.
Red Hat OpenShift Data Science combines, in one common platform, the self-service experience data scientists and developers want with the confidence enterprise IT demands. It provides a set of widely used open source data science tools for building intelligent applications, enabling developers to take advantage of the latest Intel technologies.
Optimizing Red Hat OpenShift Data Science for Intel technologies
The platform is built on widely used open source AI frameworks—JupyterLab, PyTorch, TensorFlow, and more—and integrates with a core set of Intel technologies such as the aforementioned AI Analytics Toolkit, plus the OpenVINO toolkit, cnvrg.io, and Habana Gaudi using Amazon EC2 DL1 instances (cnvrg.io and Habana to be available later).
Collectively, this makes it easier for data scientists to quickly get started without having to worry about managing the underlying infrastructure. Red Hat OpenShift Data Science eliminates complex Kubernetes setup tasks. The platform includes support for a full-featured, managed or self-managed Red Hat OpenShift environment and is ready for rapid development, training, and testing.
Let’s look at the key details of each Intel technology and how they work with Red Hat OpenShift Data Science.
Intel AI Analytics Toolkit accelerates end-to-end machine learning and data analytics pipelines with frameworks and libraries optimized for Intel architectures, including:
- Intel® Distribution for Python, an optimized build of the popular Python language that provides drop-in performance enhancements for existing code with minimal changes
- Intel® Optimizations for TensorFlow and PyTorch to accelerate DL training and inference
- Model compression for DL inference with the Intel® Neural Compressor
- Model Zoo for Intel® Architecture, offering popular pre-trained DL models tuned to run on Intel® Xeon® Scalable processors, plus DL reference models on the Habana GitHub
- Optimizations for CPU- and multi-core-intensive workloads through Intel-optimized versions of scikit-learn and XGBoost, plus distributed DataFrame processing with the Intel® Distribution of Modin, a drop-in replacement for pandas
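A distinguishing feature of these libraries is that they are drop-in accelerations: existing code keeps working whether or not the Intel packages are installed. The sketch below, a minimal illustration rather than official setup code, shows the typical pattern, assuming the Intel Extension for Scikit-learn (`scikit-learn-intelex`) and Modin packages from the AI Analytics Toolkit, and falling back to stock libraries when they are absent.

```python
# Illustrative sketch: enabling Intel's drop-in optimizations when present.
# Package and function names (sklearnex.patch_sklearn, modin.pandas) are
# from the Intel AI Analytics Toolkit; the fallback logic is ours.

def enable_intel_sklearn():
    """Patch scikit-learn with oneAPI-accelerated estimators, if available."""
    try:
        from sklearnex import patch_sklearn  # from scikit-learn-intelex
        patch_sklearn()  # later `from sklearn ...` imports use Intel kernels
        return True
    except ImportError:
        return False  # stock scikit-learn remains in effect


def import_dataframe_library():
    """Prefer Modin's distributed, pandas-compatible DataFrame when present."""
    try:
        import modin.pandas as pd  # same API as pandas, distributed execution
    except ImportError:
        try:
            import pandas as pd  # unchanged pandas as a fallback
        except ImportError:
            pd = None  # neither library installed
    return pd
```

Because the APIs match, downstream model and DataFrame code is identical on either path; only the execution backend changes.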
OpenVINO toolkit accelerates edge-to-cloud, high-performance model inference including:
- Support for multiple deep learning frameworks—TensorFlow, Caffe, PyTorch, MXNet, Keras, ONNX, and more
- Applicability across several DL tasks such as computer vision, speech recognition, and natural language processing
- Easy deployment of the model server at scale on Red Hat OpenShift
- Support for multiple storage options (S3, Azure Blob, GCS, local)
- Configurable resource restrictions and security context with Red Hat OpenShift resource requirements
- Quantization, filter pruning, and binarization to compress models
- Configurable service options depending on infrastructure requirements
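To make the storage and serving options above concrete, here is a minimal, hypothetical configuration fragment in the OpenVINO Model Server's JSON format. The model name and S3 path are illustrative placeholders, not values from this announcement.

```json
{
  "model_config_list": [
    {
      "config": {
        "name": "resnet",
        "base_path": "s3://my-bucket/models/resnet",
        "batch_size": "auto",
        "target_device": "CPU"
      }
    }
  ]
}
```

On Red Hat OpenShift, a configuration like this would typically be mounted into the model server container, with resource limits and security context set through the usual OpenShift resource requirements.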
Cnvrg.io will extend Red Hat OpenShift Data Science’s enterprise-grade MLOps capabilities with out-of-the-box, end-to-end MLOps tooling, including:
- Advanced MLOps platform to automate the continuous training and deployment of AI and ML models
- Management of the entire lifecycle: data preprocessing, experimentation, training, testing, versioning, deployment, monitoring, and automatic retraining
- Enablement to train and deploy on any infrastructure at scale
- Managed Kubernetes deployment on any cloud or on-premises environment
- Open and flexible data science platform, which integrates any open source tool
Habana Gaudi DL1 instances for DL workloads will be available through the Red Hat OpenShift Data Science platform. Gaudi is designed to accelerate model delivery, reduce time-to-train and cost-to-train, and facilitate building new or migrating existing models to Gaudi solutions, as well as deploying them in production environments. Gaudi benefits include:
- Easy access to Gaudi-based Amazon EC2 DL1 training instances from Red Hat OpenShift Data Science
- Reduced total cost of ownership (TCO): Gaudi hardware accelerators aim to deliver a competitive price/performance ratio
- Streamlined training and deployment for data scientists and developers with Habana GitHub and Habana SynapseAI software stack featuring integrated TensorFlow and PyTorch frameworks, documentation, tools, support, reference models, and developer forum
Red Hat OpenShift Data Science Benefits for Data Scientists & Development Teams
Red Hat OpenShift Data Science provides the ability to build, train, deploy, and monitor models on-premises, in the public cloud, or at the edge, giving organizations maximum flexibility in how they build intelligent applications.
Red Hat maintains updates for the platform and its integrated AI tooling, such as Jupyter notebooks and the PyTorch and TensorFlow libraries. In Red Hat OpenShift Data Science, Kubernetes operators validate security provisions and automate management of components in the container stack, helping to avoid downtime and minimize manual maintenance tasks.
Using the model serving and monitoring tools built into Red Hat OpenShift Data Science, models are container-ready, which makes it easier to integrate them into an intelligent app. Models can be rebuilt and redeployed as part of a continuous integration/continuous delivery (CI/CD) process based on changes to the source notebook.
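Once a model is served this way, an application typically reaches it over a REST route. As a rough sketch, the snippet below builds a request for the TensorFlow Serving-style JSON API that the OpenVINO model server exposes; the model name, input row, and route host are hypothetical examples, not part of this announcement.

```python
# Illustrative sketch of preparing a REST inference call to a served model.
# The "iris" model name and feature values are made up for the example.
import json

def build_predict_request(model_name, instances):
    """Return the URL path and JSON body for a :predict call."""
    path = f"/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return path, body

# One 4-feature input row for a hypothetical "iris" model.
path, body = build_predict_request("iris", [[5.1, 3.5, 1.4, 0.2]])
# A client would POST `body` to https://<route-host>{path},
# adding whatever auth the OpenShift route requires.
```

Because the endpoint contract stays stable, the app does not change when the model behind it is rebuilt and redeployed through CI/CD.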
Data scientists and developers can also harness the power of hardware acceleration for high-performance AI workloads. Intel AI tools and solutions unlock high-performance training and inference via optimized frameworks such as TensorFlow and PyTorch, backed by Intel's low-level performance libraries. Using the OpenVINO toolkit, developers can deploy performant inference solutions for Intel XPUs, including various types of CPUs, GPUs, and dedicated DL inference accelerators. With these AI tools, developers can improve productivity and deployment portability, and can scale code across multiple Intel architectures—all powered by oneAPI—without code changes.
Red Hat OpenShift Data Science is an AI platform provided on Red Hat OpenShift that is integrated with the latest Intel technologies to allow data scientists and application developers to quickly build and deploy intelligent applications across the hybrid cloud.
To learn more about developer resources from Intel and Red Hat, visit this webpage and Intel’s Booth #864 at Red Hat Summit 2023!
About the author
Dr. Deb Bharadwaj is the Director of AI/ML Product Management at Intel. He has over nineteen years of experience in software product development, scale and growth management, and artificial intelligence/machine learning. He is a seasoned expert in the field of artificial intelligence/machine learning, with a passion for driving innovation through accessible platforms and growth in the AI/ML industry.