Why Unsupervised Learning Is an Indispensable Part of Artificial Intelligence: Types, Clustering, Applications, and Opportunities

Abstract:
Unsupervised learning is a machine learning problem type in which training data consists of a set of input vectors but no corresponding target values. The idea behind this type of learning is to group information based on similarities, patterns, and differences.

Unlike in supervised learning problems, unsupervised learning algorithms do not require input-to-output mappings to learn a mapping function—this is what is meant when we say, “no teacher is provided to the learning algorithm.” Consequently, an unsupervised learning algorithm cannot perform classification or regression.  

Keywords: Unsupervised Learning, Clustering, Association Rules, Dimensionality Reduction

Contents 
1. Introduction 
2. How Does an Unsupervised Machine Learning Model Work?
3. Why Is Unsupervised Learning So Important?
4. Supervised vs. Unsupervised Learning
5. Types of Unsupervised Learning
6. Clustering
7. Association Rules
8. Dimensionality Reduction
9. Unsupervised Learning Applications
10. Examples of Unsupervised Learning in Python

Let's explore unsupervised learning...

1. Unsupervised Learning: An Introduction
Unsupervised learning is a subfield of machine learning in which a model is trained on unlabeled (or untagged) data. The main idea behind unsupervised learning is for the model to detect hidden insights and patterns in a given data set without having to first identify or classify what to look for.
In unsupervised learning, we feed the model data without specifying what output values we want it to produce. This gives the model the freedom to organize the data set as it sees fit.

2. How Does an Unsupervised Machine Learning Model Work?
An unsupervised machine learning model works in three stages — collecting the data that's needed, training the model to make sense of the unlabeled data, and then evaluating the model to see how it performs for a given set of inputs.
Let's look at each step in isolation:

Step 1: Collection of Necessary Data
In general, data collected for an unsupervised machine learning model is unstructured, as it is in a rawer format. Even though unsupervised data sets are much bigger than labeled ones, they are usually cheaper to collect, since they require no labeling or special processing before use.

Step 2: Training of the Model
Unlike supervised algorithms, unsupervised machine learning algorithms take in unlabeled data and try to make sense of it. This can be done by clustering the data points into groups or by discovering hidden patterns and trends.

Step 3: Model Evaluation
To make sure the model returns accurate results, we test its output on a variety of inputs. We can then tune the model's parameters to improve the final result.
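The three steps above can be sketched as a minimal workflow, assuming scikit-learn is available (the two-blob data set here is synthetic):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Step 1: "collect" data -- here, two synthetic, well-separated blobs
# with no labels attached.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
                  rng.normal(5, 0.5, size=(50, 2))])

# Step 2: train a model on the unlabeled data.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)

# Step 3: evaluate -- the silhouette score measures cluster cohesion and
# separation without needing ground-truth labels (it ranges from -1 to 1).
score = silhouette_score(data, model.labels_)
```

A silhouette score near 1 indicates tight, well-separated clusters, which is one common way to evaluate a model when no labels exist.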

3. Why Is Unsupervised Learning So Important?

With the advent of machine learning and artificial intelligence, machines are getting more and more advanced and their abilities are frequently pushed to the limit.

Nowhere is this more tested than in unsupervised learning, a format of learning in which a machine works without labeled data or explicit guidance.

This form of learning has been most closely associated with true artificial intelligence. 

To better understand unsupervised learning, consider a mother teaching her child to distinguish between two animals: a dog and a cat. The mother shows her child a number of images of both animals without naming them, and the child works out some characteristics of each animal until they are able to classify new images as one species or the other. The child can then categorize dogs as category 1 and cats as category 2. Since the mother did not label the images, the child has no idea which animal is called which. Instead, the child made observations about the features of both animals, such as nose shape, tail length, and size, in order to categorize them.

The number one advantage of unsupervised learning is that it lets a machine tackle problems humans might find insurmountable, whether due to limited capacity or to bias.

Unsupervised learning is ideal for exploring raw and unknown data. It works for a data scientist who does not necessarily know what they are looking for.

When presented with data, say a collection of images, an unsupervised machine will search for similarities between the items, separate them into individual groups, and attach its own labels to each group.

This kind of algorithmic behavior is very useful for segmenting customers: the algorithm can separate the data into groups without the bias that pre-existing knowledge about the customers might introduce in a human analyst.

4. Supervised vs. Unsupervised Learning

Some of the key differences between the two approaches:

Training data: supervised learning requires labeled input-output pairs; unsupervised learning works on unlabeled data.
Goal: supervised models learn a mapping to predict known outputs (classification, regression); unsupervised models discover hidden structure such as clusters, associations, or lower-dimensional representations.
Evaluation: supervised models are measured against ground-truth labels; unsupervised results are harder to evaluate objectively.
Cost: labeled data is expensive to produce; unlabeled data is abundant and cheap.
  
5. Types of Unsupervised Learning
In the introduction, we mentioned that unsupervised learning is a method we use to group data when no labels are present. Since no labels are present, unsupervised learning methods are typically applied to build a concise representation of the data so we can derive meaningful insights from it. 

For example, if we were releasing a new product, we can use unsupervised learning methods to identify who the target market for the new product will be: this is because there is no historical information about who the target customer is and their demographics. 

Unsupervised learning can be broken down into three main tasks: clustering, association rule learning, and dimensionality reduction. 

6. Clustering
From a theoretical standpoint, instances within the same group tend to have similar properties. You can observe this phenomenon in the periodic table: elements in the same group (the columns of the table) have the same number of electrons in the outermost shells of their atoms and form bonds of the same type. 

This is the idea at play in clustering algorithms: clustering methods group untagged data based on their similarities and differences. When two instances appear in different groups, we can infer that they have dissimilar properties. 

6.1. Learning Approach and Types 
Clustering is a popular type of unsupervised learning approach. It can be broken down further into different types of clustering, for example: 

Exclusive clustering: 
Data is grouped such that a single data point belongs to exactly one cluster. 

Overlapping clustering: 
A soft form of clustering in which a single data point may belong to multiple clusters with varying degrees of membership. 

Hierarchical clustering: 
A type of clustering in which clusters are arranged in a tree of nested groups, built by repeatedly merging similar clusters or splitting dissimilar ones. 

Probabilistic clustering: 
Clusters are created by fitting probability distributions, and each point is assigned to the distribution most likely to have generated it.
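To illustrate the contrast between exclusive and overlapping clustering, the following sketch (assuming scikit-learn, on made-up two-blob data) compares k-means hard labels with a Gaussian mixture's soft memberships:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, size=(60, 2)),
               rng.normal(2, 1, size=(60, 2))])

# Exclusive clustering: k-means gives every point exactly one hard label.
hard_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Overlapping / probabilistic clustering: a Gaussian mixture gives every
# point a degree of membership in each cluster (each row sums to 1).
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
memberships = gmm.predict_proba(X)
```

Points near the boundary between the two blobs end up with membership values close to 0.5 in each cluster, which the hard k-means labels cannot express.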

7. Association Rules 
This type of unsupervised machine learning takes a rule-based approach to discovering interesting relationships between features in a given dataset. It works by using a measure of interest to identify strong rules found within a dataset. 

We typically see association rule mining used for market basket analysis: this is a data mining technique retailers use to gain a better understanding of customer purchasing patterns based on the relationships between various products. 

The most widely used algorithm for association rule learning is the Apriori algorithm. However, other algorithms are used for this type of unsupervised learning, such as the Eclat and FP-growth algorithms. 
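A minimal from-scratch sketch of the support-counting idea behind Apriori (the transactions below are made up; a real project would use a full library implementation such as mlxtend's apriori):

```python
from itertools import combinations

# Toy market-basket data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# The "measure of interest" here is support: keep all 2-item sets that
# appear in at least 60% of transactions.
items = sorted(set().union(*transactions))
frequent_pairs = {
    frozenset(pair): support(set(pair), transactions)
    for pair in combinations(items, 2)
    if support(set(pair), transactions) >= 0.6
}
```

Apriori's key optimization, not shown here, is pruning: a larger itemset can only be frequent if all of its subsets are frequent, so most candidate itemsets never need to be counted.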

8. Dimensionality Reduction
Popular algorithms used for dimensionality reduction include principal component analysis (PCA) and Singular Value Decomposition (SVD). These algorithms seek to transform data from high-dimensional spaces to low-dimensional spaces without compromising meaningful properties in the original data. These techniques are typically deployed during exploratory data analysis (EDA) or data processing to prepare the data for modeling.

It’s helpful to reduce the dimensionality of a dataset during EDA to help visualize data: this is because visualizing data in more than three dimensions is difficult. From a data processing perspective, reducing the dimensionality of the data simplifies the modeling problem.

When more input features are fed into a model, the model must learn a more complex approximation function. This phenomenon is known as the "curse of dimensionality." 
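As a quick illustration (assuming scikit-learn; the 4-dimensional data set below is synthetic and intrinsically two-dimensional), PCA can compress the data while tracking how much of the original variance survives:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
# Build 4 features that are really driven by 2 latent directions,
# plus a little noise.
X = np.hstack([base, base @ np.array([[1.0, 0.5], [0.5, 1.0]])])
X += rng.normal(scale=0.05, size=X.shape)

# Project from 4 dimensions down to 2 -- now the data can be plotted.
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)
retained = pca.explained_variance_ratio_.sum()
```

Because the four features are driven by only two latent directions, almost all of the variance is retained after the reduction.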

9. Unsupervised Learning Applications
Most executives would have no problem identifying use cases for supervised machine learning tasks; the same cannot be said for unsupervised learning. 

One reason may come down to risk. Unsupervised learning introduces much more risk than supervised learning, since there's no clear way to measure results against ground truth in an offline manner, and it may be too risky to conduct an online evaluation. 

Nonetheless, there are several valuable unsupervised learning use cases at the enterprise level. Beyond using unsupervised techniques to explore data, some common real-world use cases include: 

Natural language processing (NLP): 
Google News is known to leverage unsupervised learning to group articles covering the same story from various news outlets. For instance, articles about the football transfer window can all be categorized under football.

Image and video analysis: 
Visual perception tasks such as object recognition leverage unsupervised learning.

Anomaly detection: 
Unsupervised learning is used to identify data points, events, and/or observations that deviate from a dataset's normal behavior.

Customer segmentation: 
Interesting buyer persona profiles can be created using unsupervised learning. This helps businesses understand their customers' common traits and purchasing habits, enabling them to align their products accordingly.

Recommendation engines: 
Past purchase behavior coupled with unsupervised learning can help businesses discover data trends they can use to develop effective cross-selling strategies.
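As one concrete illustration of the anomaly-detection use case, here is a sketch assuming scikit-learn, with two deliberately planted outliers in otherwise normal data:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(200, 2))          # typical behaviour
outliers = np.array([[8.0, 8.0], [-9.0, 7.5]])    # obvious anomalies
X = np.vstack([normal, outliers])

# An isolation forest scores points by how easy they are to isolate;
# no labels are needed. predict() returns -1 for anomalies, 1 for normal.
detector = IsolationForest(contamination=0.02, random_state=0).fit(X)
flags = detector.predict(X)
```

The `contamination` parameter is the expected fraction of anomalies, which sets the decision threshold on the anomaly scores.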

10. Examples of Unsupervised Learning in Python
Some examples of unsupervised learning algorithms include: 

K-Means Clustering
K-means clustering is an unsupervised learning algorithm used to solve clustering problems in machine learning and data science. It partitions the data into k clusters by repeatedly assigning each point to its nearest cluster center and then recomputing each center as the mean of its assigned points.
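The assign-and-recompute loop can be sketched from scratch with NumPy (the kmeans function and the two-blob data below are our own illustration, not a library API):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and center updates."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, size=(40, 2)),
               rng.normal(4, 0.3, size=(40, 2))])
centers, labels = kmeans(X, k=2)
```

In practice one would use a library implementation such as scikit-learn's KMeans, which adds smarter initialization (k-means++) and multiple restarts.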

Principal Component Analysis
Principal component analysis (PCA) is the process of computing the principal components then using them to perform a change of basis on the data. 

In other words, PCA is an unsupervised dimensionality reduction technique. It's useful to reduce the dimensionality of a dataset for two main reasons: when there are too many dimensions in the dataset to visualize, and to identify the most predictive n dimensions for feature selection when building a predictive model. 
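The change of basis can be sketched from scratch with NumPy's SVD (the pca helper below is our own illustration, not a library function; the 3-D data is synthetic and mostly varies along one direction):

```python
import numpy as np

def pca(X, n_components):
    """Center the data, find principal directions via SVD, and project."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by variance explained.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    projected = X_centered @ components.T      # the change of basis
    explained = (S ** 2) / (len(X) - 1)        # variance per component
    ratio = explained[:n_components].sum() / explained.sum()
    return projected, ratio

rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t, 2 * t, -t]) + rng.normal(scale=0.1, size=(100, 3))
Z, ratio = pca(X, n_components=1)
```

Because the three features all follow one latent variable, a single principal component captures nearly all of the variance.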

Hierarchical Clustering

Hierarchical clustering is a connectivity-based clustering model that groups the data points together that are close to each other based on the measure of similarity or distance. The assumption is that data points that are close to each other are more similar or related than data points that are farther apart.

A dendrogram, a tree-like figure produced by hierarchical clustering, depicts the hierarchical relationships between groups. Individual data points are located at the bottom of the dendrogram, while the largest clusters, which include all the data points, are located at the top. In order to generate different numbers of clusters, the dendrogram can be sliced at various heights.

The dendrogram is created by iteratively merging or splitting clusters based on a measure of similarity or distance between data points. Clusters are divided or merged repeatedly until all data points are contained within a single cluster, or until the predetermined number of clusters is attained.

We can look at the dendrogram and measure the height at which the branches of the dendrogram form distinct clusters to calculate the ideal number of clusters. The dendrogram can be sliced at this height to determine the number of clusters.
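The merge-and-slice process described above can be sketched with SciPy's hierarchical clustering tools (the two-blob data is synthetic):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.2, size=(20, 2)),
               rng.normal(3, 0.2, size=(20, 2))])

# Ward linkage iteratively merges the pair of clusters whose union
# least increases within-cluster variance, building the full tree.
Z = linkage(X, method="ward")

# "Slice" the tree so that exactly 2 flat clusters remain;
# scipy.cluster.hierarchy.dendrogram(Z) would draw the tree itself.
labels = fcluster(Z, t=2, criterion="maxclust")
```

Cutting the same linkage matrix with a different `t` yields a different number of clusters without re-running the clustering, which is one practical advantage of the hierarchical approach.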




