Machine Learning is a tool that is quickly becoming more and more available to enterprises. Although this new tool is very powerful, it is still often not well understood. This blog post intends to demystify the concepts around Machine Learning, define much of the vernacular common to the practice, and inform how Red Hat teams can help today. This post extends the information provided in our whiteboarding video.
Disciplines of Artificial Intelligence
Machine Learning is a general name given to an extensive collection of algorithms that are used to allow a computer to identify patterns in past data and use those patterns to predict future results. Computer Vision and Natural Language Processing both make heavy use of Machine Learning.
The field of Machine Learning is considered to be a subset of a field called Artificial Intelligence. Often, the terms concatenated together as AI/ML, but it is important to understand the difference. Artificial Intelligence describes a computer making decisions just as an educated human might. In Artificial Intelligence, the computer is given instructions on how to make a decision rather than finding the rules from data. Business Rules and Process Automation are great examples of Artificial Intelligence that are not considered to be Machine Learning.
Within the past decade, the industry has seen a boom in the discipline of Deep Learning. This is a subset of Machine Learning that employs the use of large Artificial Neural Networks (ANN) to identify patterns in data. ANNs are algorithms designed to work similarly to the human brain. Deep Learning has the advantage of requiring less data preparation than traditional machine learning algorithms, but it also requires substantially more compute resources in return. Often, Machine Learning is assumed to be primarily Deep Learning. However, many algorithms exist that do not make use of ANNs and are not considered to be Deep Learning. Choosing the best algorithm for the problem at hand is the chief responsibility of the data scientist.
Styles of Machine Learning
Machine learning is typically classified by three learning styles: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.
Supervised Learning is a style of learning where the computer is trained to map known inputs to known outputs based on known examples. For this style of learning to work, there must be good examples to give to the computer for it to learn. For this example set of data, called a training set, the answer must already be known. The process of attaching answers to the learning data set is called labeling the data. This learning can be used to perform two types of analysis: Regression and Classification. A regression problem is one where the output is a continuous number, like a home’s market value. A classification problem is one where the answer is one or many categories, like identifying objects in pictures. Supervised Learning is a fairly well-understood discipline, and many pre-trained algorithms exist. However, Supervised Learning does require clean and labeled data.
Unsupervised Learning is a style of learning where the computer is trained to map known inputs to unknown outputs simply by recognizing patterns in the data. A training set of data is given to the algorithm, but the answers are not already known. Common analyses performed with Unsupervised Learning are Clustering and Anomaly Detection (although others exist). Clustering involves looking for data points that are close to each other and are easily grouped. Anomaly Detection involves recognizing common traits so that rare data points are identified. Unsupervised learning does not require training data to be labeled and does not require the data scientist to have deep domain knowledge in the data. However, these algorithms are more difficult in practice and require additional analysis to ensure the results are meaningful.
Reinforcement Learning is a style of learning where a computer is trained to take a series of actions to maximize a reward. This style of learning is often used when training a computer to play a game such as Chess or Go. With these algorithms, the computer can play the game and recognize which actions lead to a win or a loss. These algorithms provide the benefit that they can produce very finely tuned results. However, data scientists can only use these algorithms when the problem has some notion of a reward, and scientists can simulate the problem space. The computer will need to run through many simulations of the problem to learn realistic outcomes.
Machine Learning at Red Hat
Red Hat has employed data scientists for quite some time. One product of Red Hat’s Data Science Team is Red Hat Insights. This tool monitors Red Hat Enterprise Linux systems and identifies hardware or configuration issues before they manifest as outages. This tool is publicly available now and provided with every RHEL subscription.
Internally, Red Hat’s data scientists work primarily on OpenShift. Data science environments typically depend on a specific permutation of software dependencies and software versions that are easily standardized and versioned with containers. OpenShift is perfectly suited to provide compute resources to data scientists in a flexible and on-demand way that has not previously been possible in compute-intensive environments. Additionally, OpenShift allows data scientists to move their trained models into production with ease. The infrastructure used internally at Red Hat for data science has been open-sourced as the Open Data Hub project. This is publically available as a reference architecture to any enterprise looking to bootstrap their data science platform.
Sull'autore
Ricerca per canale
Automazione
Novità sull'automazione IT di tecnologie, team e ambienti
Intelligenza artificiale
Aggiornamenti sulle piattaforme che consentono alle aziende di eseguire carichi di lavoro IA ovunque
Hybrid cloud open source
Scopri come affrontare il futuro in modo più agile grazie al cloud ibrido
Sicurezza
Le ultime novità sulle nostre soluzioni per ridurre i rischi nelle tecnologie e negli ambienti
Edge computing
Aggiornamenti sulle piattaforme che semplificano l'operatività edge
Infrastruttura
Le ultime novità sulla piattaforma Linux aziendale leader a livello mondiale
Applicazioni
Approfondimenti sulle nostre soluzioni alle sfide applicative più difficili
Serie originali
Raccontiamo le interessanti storie di leader e creatori di tecnologie pensate per le aziende
Prodotti
- Red Hat Enterprise Linux
- Red Hat OpenShift
- Red Hat Ansible Automation Platform
- Servizi cloud
- Scopri tutti i prodotti
Strumenti
- Formazione e certificazioni
- Il mio account
- Supporto clienti
- Risorse per sviluppatori
- Trova un partner
- Red Hat Ecosystem Catalog
- Calcola il valore delle soluzioni Red Hat
- Documentazione
Prova, acquista, vendi
Comunica
- Contatta l'ufficio vendite
- Contatta l'assistenza clienti
- Contatta un esperto della formazione
- Social media
Informazioni su Red Hat
Red Hat è leader mondiale nella fornitura di soluzioni open source per le aziende, tra cui Linux, Kubernetes, container e soluzioni cloud. Le nostre soluzioni open source, rese sicure per un uso aziendale, consentono di operare su più piattaforme e ambienti, dal datacenter centrale all'edge della rete.
Seleziona la tua lingua
Red Hat legal and privacy links
- Informazioni su Red Hat
- Opportunità di lavoro
- Eventi
- Sedi
- Contattaci
- Blog di Red Hat
- Diversità, equità e inclusione
- Cool Stuff Store
- Red Hat Summit