Types of Machine Learning Problems

Machine learning is one of the many subfields of artificial intelligence. This page lists the main categories of problems it addresses.

Supervised learning

The goal of supervised learning is to learn how to make predictions from a collection of labeled examples (i.e. examples that are accompanied by the value to be predicted). The labels supervise the algorithm's learning by acting as "teachers" and providing feedback.

Definition of Supervised Learning

Supervised learning is the branch of machine learning concerned with problems that can be formalized as follows: given n observations described in a space X and their labels described in a space Y, we assume that there is a function, unknown to us, that maps the observations in X to their labels in Y.

The goal is to approximate this function using the data. The starting point is a dataset, usually a large one, from which the algorithm learns. What characterizes supervised learning is that the expected answers are already known for this data: the algorithm works with labeled data.

Take the example of an application designed to recognize spam automatically. To train it, we present it with e-mails labeled as "desirable" or "spam". Using techniques derived from statistics and probability, the algorithm then learns the characteristics that allow it to assign e-mails to each of these categories.

When it is presented with new e-mails, it can classify them by assigning a probability score, for example: "This e-mail has a 95% chance of being spam". Its first answers are corrected by hand so that it can improve as it goes along.
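The spam example above can be sketched with a very small naive Bayes classifier. This is only an illustration under stated assumptions: the training messages are invented, naive Bayes with add-one smoothing is one possible technique among many (the text does not name a specific one), and the function names `train` and `spam_probability` are hypothetical.

```python
import math
from collections import Counter

# Tiny hand-made training set (invented data, for illustration only).
TRAIN = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("win a free prize now", "spam"),
    ("meeting agenda for monday", "desirable"),
    ("lunch on monday", "desirable"),
    ("project meeting notes", "desirable"),
]

def train(examples):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "desirable": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["desirable"])
    return word_counts, class_counts, vocabulary

def spam_probability(text, model):
    """Return P(spam | text) under naive Bayes with add-one smoothing."""
    word_counts, class_counts, vocabulary = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in ("spam", "desirable"):
        # Start from the log prior of the class.
        log_score = math.log(class_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            count = word_counts[label][word]
            log_score += math.log((count + 1) / (n_words + len(vocabulary)))
        log_scores[label] = log_score
    # Turn the two log scores into a probability for the "spam" class.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["desirable"])

model = train(TRAIN)
p_spam = spam_probability("win free money", model)
p_ham = spam_probability("monday meeting", model)
print(f"'win free money' -> spam probability {p_spam:.2f}")
print(f"'monday meeting' -> spam probability {p_ham:.2f}")
```

The labeled e-mails play the role of the "teacher": the word counts are the only thing the model learns, and the probability score for a new message is computed from them.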

Binary classification

When the labels are binary, they indicate whether or not an observation belongs to a class. This is called binary classification.

A binary classification problem is a supervised learning problem in which the label space is binary: Y = {0, 1}.

Example

Examples of binary classification problems are shown below:

- Determining whether or not an email is spam;

- Determining whether or not a painting was created by Picasso;

- Determining whether or not a giraffe is present in a photograph;

- Determining whether or not a molecule can be used to treat depression;

- Determining whether or not a financial transaction is fraudulent.
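To make the label space Y = {0, 1} concrete, here is a minimal sketch of a binary classifier. It uses the nearest-neighbor rule, chosen here only because it is short; the data points and the function name `predict` are invented for the illustration.

```python
import math

# Invented labeled observations: (features, label) with label in {0, 1}.
TRAIN = [
    ((1.0, 1.0), 0),
    ((1.5, 2.0), 0),
    ((2.0, 1.0), 0),
    ((6.0, 5.0), 1),
    ((7.0, 7.0), 1),
    ((6.5, 6.0), 1),
]

def predict(x, examples):
    """Return the label (0 or 1) of the training point closest to x."""
    nearest = min(examples, key=lambda example: math.dist(x, example[0]))
    return nearest[1]

label_a = predict((1.2, 1.4), TRAIN)
label_b = predict((6.8, 6.1), TRAIN)
print(label_a, label_b)
```

Whatever the method, the output of a binary classifier is always one of the two values of Y.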

Regression

When the labels are real-valued, the problem is called regression.

A regression problem is a supervised learning problem in which the label space is Y = R, the set of real numbers.

Example

Examples of regression problems include the following:

- Predicting how many times a link will be clicked;

- Predicting how many people will be using an online service at a certain moment;

- Predicting the price of a stock on the stock market;

- Predicting the binding affinity between two molecules;

- Predicting the yield of a corn plant.
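As an illustration of a real-valued prediction, the sketch below fits a straight line by ordinary least squares. The figures (hour of the day against number of users of a service) are invented, and a straight line is only the simplest possible model, not one prescribed by the text.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented observations: hour of the day and number of connected users.
hours = [1, 2, 3, 4, 5]
users = [12, 22, 31, 43, 52]

slope, intercept = fit_line(hours, users)
prediction = slope * 6 + intercept  # real-valued prediction for hour 6
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.1f}")
```

Unlike a classifier, the model returns a number that can take any real value.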

Structured regression

When the label space is more complex than those described above, we speak of structured regression. For instance, the task may be to predict vectors, images, graphs, or sequences. Many problems can be formalized as structured regression, speech recognition and machine translation being only two examples.

Unsupervised learning

Unsupervised learning differs from supervised learning in that the answers we are trying to predict are not present in the dataset: the algorithm works with unlabeled data. The goal is to model the observations so that we can understand them better.

The machine must then come up with its own answers. It proposes them based on the analysis and grouping of the data. The main tasks that can be carried out with this approach are described below.

Unsupervised learning is the branch of machine learning concerned with problems that can be formalized as follows: given n observations {x_i}, i = 1, ..., n, described in a space X, the aim is to learn a function on X that satisfies certain properties.

Clustering

The computer is asked to divide the items into groups that are as homogeneous as possible. This may look similar to classification in supervised learning, but here the classes are not given in advance: the computer "invents" its own, sometimes with a subtlety that is not obvious to a person.

Clustering, or partitioning, consists of identifying groups within the data. This makes it possible to understand the general characteristics of these groups and possibly to infer the properties of an observation from the group to which it belongs.

Clustering can therefore be described as the unsupervised learning task of finding a partition of the data. This partition must be relevant with respect to one or more criteria that have to be specified.

Example

A few examples of clustering problems are as follows:

- Market segmentation consists of identifying groups of users or customers who behave similarly. This gives a better understanding of their profiles and makes it possible to target a specific group with an advertising campaign, content, or action.

- Identifying groups of documents that share a common subject, without first labeling them by subject. This makes it possible to organize large collections of texts.

- Image compression can be formulated as a clustering problem in which similar pixels are grouped together so that they can be represented more efficiently.

- Image segmentation consists of identifying the pixels of an image that belong to the same region.

- Identifying groups of patients who share the same symptoms makes it possible to find subtypes of a disease, which can then be treated differently.
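The idea of finding groups without labels can be sketched with the k-means algorithm (Lloyd's iteration), which is one common clustering method among many and is not named in the text. The customer data and the choice of initial centroids below are invented for the illustration.

```python
import math

def k_means(points, initial_centroids, n_iter=10):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = list(initial_centroids)
    assignment = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its closest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids

# Invented customers: (visits per month, average basket size).
CUSTOMERS = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (8.0, 9.0), (9.0, 8.0), (8.5, 8.5)]

assignment, centroids = k_means(CUSTOMERS, [CUSTOMERS[0], CUSTOMERS[3]])
print(assignment)
print(centroids)
```

No label is given to the algorithm: the two groups emerge only from the distances between observations. The criterion used here (distance to the group mean) is one possible choice of relevance criterion for the partition.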

Dimension reduction

Another important family of unsupervised learning problems is dimension reduction. The aim is to find a representation of the data in a space of lower dimension than the one in which it was originally represented.

This not only reduces the computation time and the memory needed to store the data, but often also improves the performance of a supervised learning algorithm that is subsequently trained on this data.

Dimension reduction is the unsupervised learning problem of finding a space Z of lower dimension than the space X in which n observations are represented. The projections of the data onto Z must satisfy certain properties.
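As a sketch of what projecting onto a lower-dimensional space Z can look like, the code below reduces two-dimensional points to one dimension by projecting them onto their first principal axis. Principal component analysis is an assumption made for the example (the text does not name a method), the data is invented, and the axis is approximated with a simple power iteration.

```python
import math

def first_principal_axis(points, n_iter=100):
    """Approximate the top eigenvector of the 2x2 covariance matrix."""
    n = len(points)
    mean = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    centered = [(p[0] - mean[0], p[1] - mean[1]) for p in points]
    # Entries of the (symmetric) covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(n_iter):
        # Power iteration: multiply by the matrix, then renormalise.
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(*w)
        v = (w[0] / norm, w[1] / norm)
    return mean, v

def project(points, mean, axis):
    """Coordinate of each centered point along the given axis (1-D space Z)."""
    return [(p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1] for p in points]

# Invented observations in X = R^2 that lie close to a straight line.
POINTS = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]

mean, axis = first_principal_axis(POINTS)
z = project(POINTS, mean, axis)
print(axis)
print(z)
```

Here the property that the projections must satisfy is that they retain as much of the variance of the data as possible; other methods impose other properties.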

Note:

Some dimension reduction methods are supervised: their goal is then to find the representation that is most relevant for predicting the label.

Density estimation

Finally, a large family of unsupervised learning problems corresponds to a classic problem in statistics: estimating a probability distribution, under the assumption that the dataset is a random sample drawn from it.
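A minimal sketch of density estimation, assuming the data comes from a Gaussian distribution (an assumption made only for this example): the mean and variance are estimated by maximum likelihood from an invented sample, and the fitted density can then be evaluated at any point.

```python
import math

def fit_gaussian(sample):
    """Maximum likelihood estimates of the mean and variance."""
    n = len(sample)
    mu = sum(sample) / n
    variance = sum((x - mu) ** 2 for x in sample) / n  # divides by n, not n - 1
    return mu, variance

def gaussian_pdf(x, mu, variance):
    """Density of the normal distribution N(mu, variance) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Invented sample, treated as random draws from an unknown distribution.
SAMPLE = [4.8, 5.1, 5.0, 4.9, 5.2]

mu, variance = fit_gaussian(SAMPLE)
print(f"mu={mu:.2f}, variance={variance:.3f}")
print(f"density at 5.0: {gaussian_pdf(5.0, mu, variance):.3f}")
```

When no parametric family is assumed, non-parametric estimators such as histograms or kernel density estimates serve the same purpose.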

Semi-supervised learning

As one might expect, semi-supervised learning consists of learning from a dataset that is only partially labeled. The first benefit of this approach is that it avoids having to label all the training examples, which matters when collecting data is easy but labeling it requires manual work.

Consider image classification: it is easy to obtain a database of hundreds of thousands of images, but assigning the label of interest to each of them can be very time-consuming.

Furthermore, labels provided by people are likely to reflect their own biases, which a fully supervised algorithm will in turn reproduce. Semi-supervised learning can sometimes avoid this pitfall. This is a more advanced topic that we will not cover in this book.

Reinforcement learning

In reinforcement learning, the learning system can interact with its environment and take actions; in return, it receives a reward, which may be positive if the action was a good choice, or negative if it was not.

The reward may sometimes come only after a long sequence of actions, as for a system learning to play chess. In this case, learning consists of defining a policy, i.e. a strategy for systematically obtaining rewards.
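The notions of delayed reward and policy can be sketched with tabular Q-learning on a toy problem. Everything below is an assumption made for the illustration: the environment (a corridor of five cells with a reward only in the last one), the constants, and the choice of Q-learning itself, which the text does not name.

```python
import random

N_STATES = 5            # cells 0..4; cell 4 is the goal and ends the episode
ACTIONS = (-1, +1)      # move one cell to the left or to the right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (arbitrary values)

def step(state, action):
    """Deterministic environment: reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(n_episodes=200, seed=0):
    """Tabular Q-learning while exploring with uniformly random actions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(n_episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy: in each non-terminal cell, pick the action with the highest value.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```

The reward is only received at the end of the corridor, yet the learned values propagate backwards so that the resulting policy chooses to move right from every cell.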

In the end, the choice depends on the data the user wants the system to work with and on the problem to be solved. Supervised learning is appropriate if the data is labeled and the user knows which categories the data should be classified into. Unsupervised learning is the better choice if the data is unlabeled and labeling it would be too expensive. Reinforcement learning makes it possible to build autonomous systems.
