Unsupervised learning is a type of machine learning in which the algorithm is not provided with any pre-assigned labels or scores for the training data. As a result, unsupervised learning algorithms must discover for themselves any naturally occurring patterns in the training data set. Common examples include clustering, where the algorithm automatically groups its training examples into categories with similar features, and principal component analysis, where the algorithm compresses the training data set by finding the combinations of features along which the training examples differ most, and discarding the rest. This contrasts with supervised learning, in which the training data include pre-assigned category labels (often assigned by a human, or taken from the output of a non-learning classification algorithm). Intermediate levels in the supervision spectrum include reinforcement learning, where only numerical scores are available for each training example instead of detailed tags, and semi-supervised learning, where only a portion of the training data have been tagged.
Advantages of unsupervised learning include a minimal workload to prepare and audit the training set, in contrast to supervised learning techniques where a considerable amount of expert human labor is required to assign and verify the initial tags, and greater freedom to identify and exploit previously undetected patterns that may not have been noticed by the “experts”. This often comes at the cost of unsupervised techniques requiring a greater amount of training data and converging more slowly to acceptable performance, increased computational and storage requirements during the exploratory process, and potentially greater susceptibility to artifacts or anomalies in the training data that might be obviously irrelevant or recognized as erroneous by a human, but are assigned undue importance by the unsupervised learning algorithm.
Approaches
Common families of algorithms used in unsupervised learning include:
(1) Clustering
(2) Anomaly detection
(3) Neural networks (note that not all neural networks are unsupervised; they can be trained by supervised, unsupervised, semi-supervised, or reinforcement methods)
(4) Latent variable models
- Clustering methods include hierarchical clustering, k-means, mixture models, DBSCAN, and OPTICS
- Anomaly detection methods include Local Outlier Factor and Isolation Forest
- Neural network methods include autoencoders, deep belief networks, Hebbian learning, generative adversarial networks (GANs), and self-organizing maps
- Approaches for learning latent variable models include the expectation–maximization algorithm, the method of moments, and blind signal separation techniques (principal component analysis, independent component analysis, non-negative matrix factorization, singular value decomposition)
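Of the techniques listed above, principal component analysis is one of the easiest to sketch. The example below is only an illustration under stated assumptions: it uses power iteration on the sample covariance matrix to find the first principal component, the function name `first_principal_component` is hypothetical, and the data are synthetic points scattered around the line y = 2x. Library implementations typically use a singular value decomposition instead.

```python
import math
import random


def first_principal_component(rows, iterations=200):
    """Return (unit direction, variance along it) for a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    # Centre each feature on zero mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly multiply by the covariance and renormalise.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance of the data along the direction found (v^T C v).
    variance = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
    return v, variance


# Synthetic, assumed data: points near the line y = 2x with small noise.
rng = random.Random(0)
data = []
for _ in range(500):
    t = rng.gauss(0.0, 1.0)
    data.append([t + rng.gauss(0.0, 0.05), 2.0 * t + rng.gauss(0.0, 0.05)])

direction, variance = first_principal_component(data)
print("first principal direction:", direction)
print("variance along it:", variance)
```

Projecting each centred point onto `direction` reduces the two features to a single coordinate while keeping nearly all of the variance in this data set, which is the compression described in the introduction.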
Method of moments
One statistical approach for unsupervised learning is the method of moments. In the method of moments, the unknown parameters of interest in the model are related to the moments of one or more random variables. These moments are estimated empirically from the available data samples, and the resulting equations are solved for the parameter values. The method of moments has been shown to be effective in learning the parameters of latent variable models, where in addition to the observed variables available in the training and input data sets, a number of unobserved latent variables are also assumed to exist and to determine the categorization of each sample. One practical example of latent variable models in machine learning is topic modeling, which is a statistical model for predicting the words (observed variables) in a document based on the topic (latent variable) of the document. The method of moments (using tensor decomposition techniques) has been shown to consistently recover the parameters of a large class of latent variable models under certain assumptions.
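The basic idea of matching empirical moments to model parameters can be illustrated in a much simpler setting than latent variable models. The sketch below assumes the data follow a gamma distribution with shape k and scale θ, for which the mean is kθ and the variance is kθ²; solving those two equations gives θ = variance / mean and k = mean / θ. The function name and the synthetic data are assumptions made for illustration, and this is not the tensor decomposition approach mentioned above.

```python
import random


def gamma_method_of_moments(samples):
    """Estimate (shape, scale) of a gamma distribution from its first two moments."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    # mean = shape * scale and variance = shape * scale**2, solved for the parameters.
    scale = variance / mean
    shape = mean / scale
    return shape, scale


# Synthetic data with known parameters, so the estimates can be compared.
rng = random.Random(42)
true_shape, true_scale = 2.0, 3.0
samples = [rng.gammavariate(true_shape, true_scale) for _ in range(50_000)]

shape_hat, scale_hat = gamma_method_of_moments(samples)
print("estimated shape:", shape_hat, "estimated scale:", scale_hat)
```

No iterative optimisation is involved: the estimates come directly from solving the moment equations, so there is no starting point to choose.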
The expectation–maximization algorithm is another practical method for learning latent variable models. However, it can get stuck in local optima, and it is not guaranteed to converge to the true unknown parameters of the model. In contrast, using the method of moments, global convergence is guaranteed under some conditions.
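As a rough illustration of expectation–maximization, the sketch below fits a mixture of two one-dimensional Gaussians. Everything specific here is an assumption made for the example: the function names, the synthetic data, the fixed number of iterations, and the starting means passed in by the caller. The latent variable is which of the two components generated each point.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(xs, mean_a, mean_b, iterations=100):
    """Fit a two-component 1-D Gaussian mixture, starting from the given means."""
    weight_a, var_a, var_b = 0.5, 1.0, 1.0
    n = len(xs)
    for _ in range(iterations):
        # E-step: probability that each point came from component A.
        resp = []
        for x in xs:
            pa = weight_a * normal_pdf(x, mean_a, var_a)
            pb = (1.0 - weight_a) * normal_pdf(x, mean_b, var_b)
            resp.append(pa / (pa + pb))
        # M-step: re-estimate the parameters from the weighted points.
        na = sum(resp)
        nb = n - na
        mean_a = sum(r * x for r, x in zip(resp, xs)) / na
        mean_b = sum((1.0 - r) * x for r, x in zip(resp, xs)) / nb
        # A small floor keeps a variance from collapsing to zero.
        var_a = max(sum(r * (x - mean_a) ** 2 for r, x in zip(resp, xs)) / na, 1e-6)
        var_b = max(sum((1.0 - r) * (x - mean_b) ** 2 for r, x in zip(resp, xs)) / nb, 1e-6)
        weight_a = na / n
    return weight_a, mean_a, mean_b, var_a, var_b


# Synthetic data: two well-separated groups centred on -2 and 3.
rng = random.Random(1)
xs = [rng.gauss(-2.0, 1.0) for _ in range(300)] + [rng.gauss(3.0, 1.0) for _ in range(300)]

weight_a, mean_a, mean_b, var_a, var_b = em_two_gaussians(xs, mean_a=-1.0, mean_b=1.0)
print("means:", mean_a, mean_b, "weight of first component:", weight_a)
```

The result depends on the starting means. With well-separated groups and a reasonable start, as here, the estimates land near the generating values, but a poor start on harder data can leave the algorithm at a local optimum.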
Types of unsupervised learning algorithms
Clustering: Clustering is a method of grouping objects into clusters such that objects in the same cluster are as similar as possible to one another and have little or no similarity to objects in other clusters. Cluster analysis finds commonalities between data objects and categorizes them according to the presence or absence of those commonalities.
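A minimal sketch of one clustering method, k-means (Lloyd's algorithm), is given below. It is an illustration only: the function names, the tiny hand-made data set, and the choice of squared Euclidean distance as the similarity measure are all assumptions for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=50, seed=0):
    """Group points into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(coord) / len(members)
                                           for coord in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hand-made data: three points near (1, 1) and three near (8, 8).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = k_means(points, k=2)
print("labels:", labels)
print("centroids:", centroids)
```

Which group receives label 0 and which receives label 1 depends on the random starting centroids; only the grouping itself is meaningful.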
Association: Association rule learning is an unsupervised learning method used for finding relationships between variables in a large database. It determines the sets of items that occur together in the dataset. Association rules can make a marketing strategy more effective: for example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical application of association rules is market basket analysis.
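The bread and butter example can be made concrete with the three measures commonly used to score an association rule: support, confidence, and lift. The sketch below uses a small, made-up list of transactions and hypothetical helper names. It only scores a single rule and does not search for rules the way an algorithm such as Apriori does.

```python
# Made-up shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the whole itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent, transactions) / count(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to how common the consequent is overall."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
rule_lift = lift({"bread"}, {"butter"}, transactions)
print("support:", rule_support)
print("confidence:", rule_confidence)
print("lift:", rule_lift)
```

A lift above 1 suggests that the two items appear together more often than would be expected if they were bought independently.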
Advantages
- Unsupervised learning can be applied to more complex, open-ended tasks than supervised learning, because it does not depend on labelled input data.
- Unsupervised learning is preferable when labelled data are scarce, as unlabelled data are usually easier to obtain than labelled data.
Disadvantages
- Unsupervised learning is intrinsically more difficult than supervised learning, because there is no known output for each input against which results can be checked.
- The result of an unsupervised learning algorithm might be less accurate, because the input data are not labelled and the algorithm does not know the correct output in advance.
Unsupervised learning algorithms
Some popular unsupervised learning algorithms:
- K-means clustering
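- Nearest-neighbor methods (note that the k-nearest neighbors classifier, KNN, is itself a supervised algorithm; nearest-neighbor search is used in unsupervised methods such as Local Outlier Factor)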
- Hierarchical clustering
- Anomaly detection
- Neural Networks
- Principal Component Analysis
- Independent Component Analysis
- Apriori algorithm
- Singular value decomposition