Machine Learning is quickly becoming more available to enterprises. Although it is very powerful, it is still often poorly understood. This blog post aims to demystify the concepts around Machine Learning, define much of the vernacular common to the practice, and explain how Red Hat teams can help today. This post extends the information provided in our whiteboarding video.

Disciplines of Artificial Intelligence

Machine Learning is a general name for an extensive collection of algorithms that allow a computer to identify patterns in past data and use those patterns to predict future results. Computer Vision and Natural Language Processing both make heavy use of Machine Learning.

The field of Machine Learning is considered a subset of a broader field called Artificial Intelligence. The terms are often combined as AI/ML, but it is important to understand the difference. Artificial Intelligence describes a computer making decisions just as an educated human might. In the forms of Artificial Intelligence that fall outside Machine Learning, the computer is given explicit instructions on how to make a decision rather than finding the rules from data. Business Rules and Process Automation are great examples of Artificial Intelligence that are not considered to be Machine Learning.
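The difference between being given a rule and finding one from data can be sketched in a few lines of Python. This is a minimal illustration, not a real system: the loan-approval scenario, the function names, and every number below are hypothetical and are not taken from the whiteboarding video.

```python
# Hypothetical loan-approval example; all names and numbers are illustrative.

def rule_based_approve(income: float) -> bool:
    """Rule-based AI: a person wrote the threshold down explicitly."""
    return income >= 50_000


def learn_threshold(examples: list[tuple[float, bool]]) -> float:
    """Machine Learning: choose the threshold that best fits labeled history."""
    candidates = sorted(income for income, _ in examples)

    def errors(threshold: float) -> int:
        # Count past decisions that this threshold would get wrong.
        return sum((income >= threshold) != approved
                   for income, approved in examples)

    return min(candidates, key=errors)


# Past decisions: (income, was the loan approved?)
history = [(20_000, False), (35_000, False), (42_000, True),
           (60_000, True), (80_000, True)]

threshold = learn_threshold(history)
print("hand-written rule approves 42,000:", rule_based_approve(42_000))
print("threshold learned from data:", threshold)
```

In the first function the decision logic never changes unless a person edits it. In the second, giving the function different history would produce a different threshold, which is the essential property of Machine Learning.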

Within the past decade, the industry has seen a boom in the discipline of Deep Learning. This is a subset of Machine Learning that uses large Artificial Neural Networks (ANNs) to identify patterns in data. ANNs are algorithms loosely modeled on the way the human brain works. Deep Learning has the advantage of requiring less data preparation than traditional Machine Learning algorithms, but it requires substantially more compute resources in return. Often, Machine Learning is assumed to be primarily Deep Learning. However, many algorithms exist that do not make use of ANNs and are not considered to be Deep Learning. Choosing the best algorithm for the problem at hand is the chief responsibility of the data scientist.

Styles of Machine Learning

Machine Learning is typically classified into three learning styles: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.

Supervised Learning is a style of learning where the computer is trained to map known inputs to known outputs based on known examples. For this style of learning to work, there must be good examples for the computer to learn from. For this example set of data, called a training set, the answer must already be known. The process of attaching answers to the training set is called labeling the data. This learning can be used to perform two types of analysis: Regression and Classification. A regression problem is one where the output is a continuous number, like a home's market value. A classification problem is one where the answer is one or more categories, like identifying objects in pictures. Supervised Learning is a fairly well-understood discipline, and many pre-trained models exist. However, Supervised Learning does require clean and labeled data.
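The two kinds of supervised analysis can be shown with a small, standard-library-only Python sketch. It assumes a least-squares line for the regression case and a nearest-neighbor lookup for the classification case; these are only two of many possible algorithms, and the data, feature meanings, and labels are made up for illustration.

```python
# Toy labeled data; every number and label here is invented for illustration.

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_neighbor(training_set, point):
    """Classification: return the label of the closest labeled example."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(training_set,
                   key=lambda example: squared_distance(example[0], point))
    return label


# Regression: home size (square meters) -> market value (thousands).
sizes = [50, 80, 100, 120]
values = [150, 240, 300, 360]
slope, intercept = fit_line(sizes, values)
predicted_value = slope * 90 + intercept  # estimate for an unseen 90 m2 home

# Classification: two hypothetical image features -> object category.
labeled_images = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"),
                  ((5.0, 5.1), "dog"), ((4.8, 5.3), "dog")]
predicted_label = nearest_neighbor(labeled_images, (4.5, 5.0))

print("predicted value:", predicted_value)
print("predicted label:", predicted_label)
```

In both cases the labels (the known home values and the known categories) are what make the learning supervised: without them, neither function has anything to fit against.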

Unsupervised Learning is a style of learning where the computer is trained to map known inputs to unknown outputs simply by recognizing patterns in the data. A training set of data is given to the algorithm, but the answers are not already known. Common analyses performed with Unsupervised Learning are Clustering and Anomaly Detection (although others exist). Clustering involves looking for data points that are close to each other and are easily grouped. Anomaly Detection involves recognizing common traits so that rare data points can be identified. Unsupervised Learning does not require training data to be labeled and does not require the data scientist to have deep domain knowledge of the data. However, these algorithms are more difficult to apply in practice and require additional analysis to ensure the results are meaningful.
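Clustering and Anomaly Detection can also be sketched without any labels. The example below assumes a one-dimensional k-means loop for clustering and a simple z-score cutoff for anomaly detection; both are common textbook choices rather than the only options, and the data, starting centers, and threshold are hypothetical.

```python
from statistics import mean, pstdev


def kmeans_1d(values, centers, iterations=10):
    """Clustering: alternate between assigning points and moving centers."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [mean(group) if group else center
                   for group, center in zip(groups, centers)]
    return centers, groups


def find_anomalies(values, z_threshold=2.0):
    """Anomaly detection: flag points far from the mean, in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []
    return [value for value in values if abs(value - mu) / sigma > z_threshold]


# Unlabeled measurements that happen to form two groups.
measurements = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, groups = kmeans_1d(measurements, centers=[0.0, 5.0])

# Unlabeled sensor readings with one unusual value.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = find_anomalies(readings)

print("cluster centers:", centers)
print("anomalies:", anomalies)
```

Note that the algorithm finds two groups only because it was told to look for two centers, and flags one reading only because of the chosen threshold. Deciding whether those groups or outliers mean anything is the additional analysis that this style of learning requires.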

Reinforcement Learning is a style of learning where a computer is trained to take a series of actions to maximize a reward. This style of learning is often used when training a computer to play a game such as Chess or Go. With these algorithms, the computer can play the game and recognize which actions lead to a win or a loss. These algorithms have the benefit of producing very finely tuned results. However, data scientists can only use them when the problem has some notion of a reward and the problem space can be simulated. The computer will need to run through many simulations of the problem to learn realistic outcomes.
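The reward-and-simulation loop can be illustrated with tabular Q-learning, one common Reinforcement Learning algorithm. The sketch below is far simpler than Chess or Go: it assumes a hypothetical five-position corridor where the only reward is for reaching the last position, and the learning rate, discount, and exploration values are arbitrary illustrative choices.

```python
import random

N_STATES = 5        # positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)  # index 0 steps left, index 1 steps right


def step(state, action):
    """Simulated environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: estimate the value of each action in each state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random sometimes, and whenever the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[action])
            future = 0.0 if done else gamma * max(q[next_state])
            # Nudge the estimate toward reward plus discounted future value.
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


q_table = train()
# Best action per non-goal state according to the learned estimates.
policy = [("left", "right")[row.index(max(row))] for row in q_table[:-1]]
print("learned policy:", policy)
```

Nobody tells the program that moving right is correct. It only receives a reward at the goal and needs hundreds of simulated episodes to work backward from that reward, which is why a simulatable problem space is a prerequisite.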

Machine Learning at Red Hat

Red Hat has employed data scientists for quite some time. One product of Red Hat’s Data Science Team is Red Hat Insights. This tool monitors Red Hat Enterprise Linux systems and identifies hardware or configuration issues before they manifest as outages. This tool is publicly available now and provided with every RHEL subscription.

Internally, Red Hat's data scientists work primarily on OpenShift. Data science environments typically depend on a specific combination of software dependencies and versions, which is easily standardized and versioned with containers. OpenShift is well suited to provide compute resources to data scientists in a flexible, on-demand way that has not previously been possible in compute-intensive environments. Additionally, OpenShift allows data scientists to move their trained models into production with ease. The infrastructure used internally at Red Hat for data science has been open-sourced as the Open Data Hub project. This is publicly available as a reference architecture to any enterprise looking to bootstrap its data science platform.

