
Introduction to T-SNE with implementation in Python

INTRODUCTION TO T-SNE:

T-SNE is a non-linear dimensionality reduction technique used to visualize high-dimensional data in two or more dimensions.

Unlike PCA, which preserves only the global structure of the data, T-SNE preserves both the local and the global structure.

It uses the local relationships between points to map the high-dimensional data into two dimensions.

Since T-SNE is an extension of Stochastic Neighbor Embedding (SNE), let's first try to understand what SNE is.

STOCHASTIC NEIGHBOR EMBEDDING:

Stochastic Neighbor Embedding is a probabilistic approach to visualizing high-dimensional data. It converts high-dimensional Euclidean distances between points into conditional probabilities that represent similarities.

SNE assumes that the neighborhoods in both the high and the low dimension are modeled with Gaussian distributions.

The conditional probability p_{j|i} in the high dimension that a data point x_i would pick x_j as its neighbor is given by

$$p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}$$

The conditional probability q_{j|i} in the low dimension is given by

$$q_{j|i} = \frac{\exp\left(-\lVert y_i - y_j \rVert^2\right)}{\sum_{k \neq i} \exp\left(-\lVert y_i - y_k \rVert^2\right)}$$
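To make the definition concrete, here is a minimal NumPy sketch that computes the high-dimensional conditional probabilities from a data matrix. For simplicity it assumes a single bandwidth sigma shared by all points, rather than a separate σ_i per point:

```python
import numpy as np

def conditional_probabilities(X, sigma):
    """Compute p_{j|i} for all pairs, assuming one shared bandwidth sigma."""
    # Pairwise squared Euclidean distances ||x_i - x_j||^2
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Unnormalized Gaussian affinities exp(-||x_i - x_j||^2 / 2 sigma^2)
    affinities = np.exp(-sq_dists / (2 * sigma ** 2))
    np.fill_diagonal(affinities, 0.0)  # a point never picks itself as a neighbor
    # Normalize each row so the probabilities conditioned on point i sum to one
    return affinities / affinities.sum(axis=1, keepdims=True)

# q_{j|i} has the same form in the low dimension, with a fixed variance
# (the numerator is simply exp(-||y_i - y_j||^2)).
```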

The variance σ_i is associated with a parameter called perplexity, which can be interpreted as a measure of the effective number of neighbors. SNE performs a binary search for the value of σ_i that produces a P_i with the fixed perplexity specified by the user.

Perplexity is defined as,

$$\mathrm{Perp}(P_i) = 2^{H(P_i)}$$

where H(P_i) is the Shannon entropy of P_i measured in bits:

$$H(P_i) = -\sum_j p_{j|i} \log_2 p_{j|i}$$
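To illustrate the binary search described above, here is a simplified sketch that tunes σ_i for a single point so that the perplexity of its conditional distribution matches the user-specified value. The search bounds, iteration count, and small epsilon constant are illustrative assumptions, not the values used in any particular library:

```python
import numpy as np

def sigma_for_perplexity(sq_dists_i, target_perplexity, n_steps=50):
    """Binary-search sigma_i so that Perp(P_i) = 2^{H(P_i)} matches target_perplexity.

    sq_dists_i: squared distances from point i to every other point (self excluded).
    """
    lo, hi = 1e-10, 1e4                              # illustrative search bounds
    sigma = 1.0
    for _ in range(n_steps):
        sigma = (lo + hi) / 2.0
        p = np.exp(-sq_dists_i / (2 * sigma ** 2))
        p = np.maximum(p, 1e-12)                     # guard against underflow
        p /= p.sum()
        entropy = -np.sum(p * np.log2(p))            # Shannon entropy H(P_i) in bits
        perplexity = 2 ** entropy
        if perplexity > target_perplexity:
            hi = sigma                               # distribution too flat: shrink sigma
        else:
            lo = sigma                               # distribution too peaked: grow sigma
    return sigma
```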

Since we are calculating the pairwise similarities, the probability that a point will choose itself as a neighbor is zero.

That is, the values of p_{i|i} and q_{i|i} are set to zero.

Once the probabilities are constructed in both the high and low dimensions, the algorithm aims to reduce the difference between p_{j|i} and q_{j|i} so that the low-dimensional mapping resembles the high-dimensional one.

This is achieved by minimizing a cost function which is a sum of Kullback-Leibler divergences. The cost function C is given by,

$$C = \sum_i \mathrm{KL}(P_i \parallel Q_i) = \sum_i \sum_j p_{j|i} \log \frac{p_{j|i}}{q_{j|i}}$$

The minimization of the cost function is done using the gradient descent method.

Since the KL divergence is not symmetric, it emphasizes preserving the local structure.

It preserves the local structure by penalizing heavily when points that are close in the high dimension are mapped far apart in the low dimension, while the penalty for mapping widely separated points close together is small.
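For concreteness, the cost can be written out directly on the matrices of conditional probabilities. This is only a sketch assuming P and Q are full n-by-n arrays with zero diagonals; the term-by-term form makes the asymmetric penalty visible:

```python
import numpy as np

def sne_cost(P, Q, eps=1e-12):
    """Sum of KL divergences: C = sum_i sum_j p_{j|i} * log(p_{j|i} / q_{j|i}).

    A large p_{j|i} paired with a small q_{j|i} (nearby points mapped far apart)
    contributes a large term; the reverse mismatch contributes very little.
    """
    mask = P > 0  # zero entries (including the diagonal) contribute nothing
    return np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps)))
```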

DRAWBACKS OF SNE:

There are two significant drawbacks in Stochastic Neighbor Embedding.

1. The cost function used is difficult to optimize.

2. The crowding problem, where moderately distant data points and nearby points get clumped together in order to fit into the 2-dimensional space.

T-SNE:

As the cost function used in SNE is difficult to optimize, T-SNE uses a symmetrized version of it based on joint probabilities, i.e., p_{ij} = p_{ji} and q_{ij} = q_{ji}.

The cost function is given by

$$C = \mathrm{KL}(P \parallel Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}$$

with p_{ii} = q_{ii} = 0.

Here, the pairwise similarities in the high-dimensional space are defined by

$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}$$

For the low-dimensional mapping, instead of a Gaussian distribution, we use a heavy-tailed Student t-distribution with one degree of freedom.

This heavy-tailed distribution helps to alleviate the crowding problem.

Using this distribution, the low-dimensional similarity is given by

$$q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}$$
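A minimal sketch of these two definitions, assuming `cond_P` is the matrix of conditional probabilities p_{j|i} from SNE and `Y` is the current low-dimensional embedding:

```python
import numpy as np

def joint_p(cond_P):
    """Symmetrized high-dimensional similarities: p_{ij} = (p_{j|i} + p_{i|j}) / 2n."""
    n = cond_P.shape[0]
    return (cond_P + cond_P.T) / (2 * n)

def joint_q(Y):
    """Low-dimensional similarities from a Student t-distribution with one degree of freedom."""
    sq_dists = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    inv = 1.0 / (1.0 + sq_dists)     # heavy-tailed kernel (1 + ||y_i - y_j||^2)^(-1)
    np.fill_diagonal(inv, 0.0)       # q_{ii} = 0
    return inv / inv.sum()           # normalize over all pairs
```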

Let’s implement T-SNE using scikit-learn
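Here is a minimal example using scikit-learn's TSNE. The digits dataset and the hyperparameter values below are illustrative choices, not the only ones that work:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Load a small labeled dataset: 8x8 images of handwritten digits (64 dimensions)
digits = load_digits()
X, y = digits.data, digits.target

# Map the 64-dimensional data down to 2 dimensions with t-SNE.
# perplexity roughly controls the effective number of neighbors per point.
tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_embedded = tsne.fit_transform(X)

# Scatter plot of the 2-D embedding, colored by digit class
plt.figure(figsize=(8, 6))
scatter = plt.scatter(X_embedded[:, 0], X_embedded[:, 1], c=y, cmap="tab10", s=10)
plt.colorbar(scatter, label="digit class")
plt.title("t-SNE embedding of the digits dataset")
plt.show()
```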

[Plot: two-dimensional T-SNE embedding with points colored by class]

As you can see in the above plot, T-SNE constructs an excellent visualization. The clusters of classes are well separated, and there are few overlaps.

SUMMARY:

In this post, I have discussed a powerful dimensionality reduction technique called T-SNE, which has a wide range of applications.

In the beginning, we discussed Stochastic Neighbor Embedding (SNE), the predecessor of T-SNE.

We also discussed the drawbacks of SNE and how T-SNE overcomes them. Finally, we implemented T-SNE using scikit-learn.

I hope this post has given you a better understanding of how T-SNE works. For those who want to delve deeper into this topic, I suggest reading the original paper and this blog post.
