In machine learning, its algorithms are mainly divided into supervised learning and unsupervised learning. So what do we need to distinguish and pay attention to between these two types?
一、Supervised Learning
Basic Definition:(Andrew Ng)
In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.
Supervised learning problems are categorized into “regression” and “classification” problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
In summary, supervised learning is the process of learning training data with labels, identifying the mapping relationship between data and labels (more precisely, a function), and using this mapping relationship to predict unlabeled samples and obtain their labels.
In the field of supervised learning, the two major research branches are:
(1)Regression (2)Classification
key:get input X and also get the result of X that is Y(labeled)
二、Unsupervised Learning
Basic Definition:(wiki)
Unsupervised learning is a method of machine learning that automatically classifies or groups input data without providing pre labeled training examples. The main applications of unsupervised learning include cluster analysis, association rules, and dimensionality reduction. It is an alternative to strategies such as supervised learning and reinforcement learning.
Unsupervised learning is a training method for machine learning, which is essentially a statistical tool that can discover potential structures in unlabeled data.
Unsupervised learning can be divided into two types of problems:
(1)clustering (2)dimensionality reduction.
The clustering problem is to divide data into different groups or clusters, so that the similarity of data within the same group is high, while the similarity between different groups is low. The dimensionality reduction problem is to map high-dimensional data to a low dimensional space to reduce feature dimensions and data complexity.
key:get input X and not get the result of X that is Y(no labels)