
Performance Measure for Classification (Part-2)


In the last part, we saw what a confusion matrix is, along with other metrics such as TPR, TNR, FPR, and FNR, which are based on the confusion matrix.

In this post, we will look at AUC – ROC curves and log loss. We will also learn how to interpret the results from AUC and ROC curves.

For better understanding, I highly recommend reading my post about the Confusion Matrix before continuing.


The ROC (Receiver Operating Characteristic) curve is a graph used to evaluate the performance of a binary classifier. It plots the False Positive Rate (FPR) on the x-axis against the True Positive Rate (TPR) on the y-axis for different thresholds.

Let’s look at the two parameters which are used to plot the curve.


True Positive Rate (TPR): out of all the actual positive points, how many of them are predicted to be positive?

It is calculated as the number of true positives divided by the total number of actual positives.



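False Positive Rate (FPR): out of all the actual negative points, how many of them are incorrectly predicted to be positive? It is calculated as the number of false positives divided by the total number of actual negatives.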
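To make the two rates concrete, here is a minimal sketch that counts the confusion matrix cells from lists of labels and derives TPR and FPR from them. The function name `tpr_fpr` and the sample labels are made up for illustration; they are not part of the original post.

```python
def tpr_fpr(actual, predicted):
    """Return (TPR, FPR) for binary labels where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    tpr = tp / (tp + fn)  # true positives / all actual positives
    fpr = fp / (fp + tn)  # false positives / all actual negatives
    return tpr, fpr


# Hypothetical labels, made up for illustration only.
example_actual = [1, 1, 1, 0, 0, 0, 0]
example_predicted = [1, 1, 0, 1, 0, 0, 0]
example_tpr, example_fpr = tpr_fpr(example_actual, example_predicted)
print(example_tpr, example_fpr)
```

Here two of the three actual positives are caught, and one of the four actual negatives is wrongly flagged as positive.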



Let’s assume that our classification model will give probability scores instead of actual class labels.

The higher the score, the higher the chance that the point belongs to class 1.

For example, let’s take this data:

Test points   actual_class   probability_score
x1            1              0.97
x2            0              0.24
x3            1              0.62
x4            1              0.82
x5            0              0.17

Now we sort the points by probability_score in descending order:

Test points   actual_class   probability_score
x1            1              0.97
x4            1              0.82
x3            1              0.62
x2            0              0.24
x5            0              0.17
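The sorting step can be sketched in a couple of lines. The tuple layout and the variable names `points` and `sorted_points` are assumptions made for this example.

```python
# Each tuple is (test point, actual_class, probability_score) from the table above.
points = [
    ("x1", 1, 0.97),
    ("x2", 0, 0.24),
    ("x3", 1, 0.62),
    ("x4", 1, 0.82),
    ("x5", 0, 0.17),
]

# Sort by probability_score (index 2), highest first.
sorted_points = sorted(points, key=lambda row: row[2], reverse=True)

for name, actual_class, score in sorted_points:
    print(name, actual_class, score)
```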

Next, a threshold is used to split all the points into two groups: class 0 or class 1.

Let’s first choose the threshold (T1) as 0.97, which is the largest value in probability_score.

Now we say that if probability_score >= T1, then the point belongs to class 1. All the remaining points belong to class 0.

This is what the predicted classes look like for T1 = 0.97:

Test points   actual_class   probability_score   predicted(T1=0.97)
x1            1              0.97                1
x4            1              0.82                0
x3            1              0.62                0
x2            0              0.24                0
x5            0              0.17                0
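The thresholding rule above can be sketched as a small helper. The name `predict_with_threshold` and the dictionary layout are hypothetical, chosen only for this example.

```python
# probability_score for each test point, taken from the table above.
scores = {"x1": 0.97, "x4": 0.82, "x3": 0.62, "x2": 0.24, "x5": 0.17}


def predict_with_threshold(scores, threshold):
    """Assign class 1 when the score is >= threshold, otherwise class 0."""
    return {name: 1 if score >= threshold else 0 for name, score in scores.items()}


predicted_t1 = predict_with_threshold(scores, 0.97)
print(predicted_t1)
```

Only x1 reaches the threshold of 0.97, so it is the only point predicted as class 1.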

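Since we now have both the actual and the predicted classes, we can calculate the TPR and FPR corresponding to T1 = 0.97. One of the three actual positives is predicted as positive, so TPR = 1/3. Neither of the two actual negatives is predicted as positive, so FPR = 0/2 = 0.

If the next threshold (T2) is 0.82, then our predicted classes will look like this: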

Test points   actual_class   probability_score   predicted(T2=0.82)
x1            1              0.97                1
x4            1              0.82                1
x3            1              0.62                0
x2            0              0.24                0
x5            0              0.17                0

Again, for T2 = 0.82, we calculate the TPR and FPR. This time TPR = 2/3 and FPR = 0.

We repeat this for the remaining threshold values: 0.62, 0.24, and so on.

Once we have calculated the TPR and FPR values for all the thresholds, we can plot the graph.
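The whole procedure, using every score as a threshold in turn, can be sketched as follows. This is a minimal illustration on the five example points; the function name `roc_points` is an assumption, not something from the original post.

```python
# Actual classes and scores for x1, x4, x3, x2, x5 (already sorted by score).
actual = [1, 1, 1, 0, 0]
scores = [0.97, 0.82, 0.62, 0.24, 0.17]


def roc_points(actual, scores):
    """Return a list of (threshold, FPR, TPR), one entry per distinct score."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        predicted = [1 if s >= threshold else 0 for s in scores]
        tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
        fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
        points.append((threshold, fp / negatives, tp / positives))
    return points


for threshold, fpr, tpr in roc_points(actual, scores):
    print(f"threshold={threshold:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each (FPR, TPR) pair is one point on the ROC curve. Lowering the threshold can only move the point up or to the right, because more points get predicted as class 1.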

If we plot the TPR values (y-axis) against the FPR values (x-axis), the graph will look like this.



Assume we have a random classification model that predicts the classes at random.

The ROC curve for a random classifier looks like a diagonal straight line.


This line separates the plot area into two parts, and the area under the line is 0.5.
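We can check the 0.5 figure with a small simulation. The sketch below swaps in a different way of computing the area: instead of integrating the curve, it uses the fact that the area under the ROC curve equals the fraction of (positive, negative) pairs in which the positive point gets the higher score, with ties counted as half. The function name `pairwise_auc`, the seed, and the sample size are all arbitrary choices for this example.

```python
import random


def pairwise_auc(actual, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
random_actual = [rng.randint(0, 1) for _ in range(1000)]  # random true labels
random_scores = [rng.random() for _ in range(1000)]       # scores unrelated to the labels
random_auc = pairwise_auc(random_actual, random_scores)
print(round(random_auc, 3))
```

Because the scores carry no information about the labels, the printed value should land close to 0.5, though not exactly on it for a finite sample.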

A model whose curve lies above this line is performing better than the random model.

If our curve is below this line, then it means that our model is performing worse than the random model.

The ROC curve of a good model bends toward the top-left corner, away from this line.


AUC measures the area underneath the ROC curve.

[Figure: ROC curve with the area under it shaded]

The shaded region is called the area under the ROC curve (AUC).
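One common way to compute this area from a handful of ROC points is the trapezoidal rule. The sketch below applies it to the (FPR, TPR) pairs worked out by hand from the five-point example, plus the (0, 0) starting point. The function name `auc_trapezoid` is hypothetical.

```python
def auc_trapezoid(points):
    """Area under a curve given as (FPR, TPR) points, using the trapezoidal rule."""
    points = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # width times average height
    return area


# (FPR, TPR) for thresholds 0.97, 0.82, 0.62, 0.24, 0.17, preceded by (0, 0).
roc = [(0.0, 0.0), (0.0, 1 / 3), (0.0, 2 / 3), (0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]
example_auc = auc_trapezoid(roc)
print(example_auc)
```

For this example the area is 1.0, because every actual positive has a higher score than every actual negative.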

It ranges from 0 to 1. A perfect model will have an AUC of 1, and the lowest possible value is 0.

If we get a value below 0.5, the model is doing worse than a random model. In that case, we can flip the model's predictions to bring the value back above 0.5.

If our model predicts a class to be 0, then change it to 1, and vice versa.

By doing this, the AUC becomes 1 – AUC.

If our model’s AUC is 0.3, then after swapping the class label it will become 1 – 0.3 = 0.7, which is better than a random model.
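Here is a small sketch of that flip on made-up data. Since AUC works on scores rather than hard labels, flipping the prediction is done here by replacing each score p with 1 - p. The data and the helper `pairwise_auc` (which computes AUC as the fraction of correctly ranked positive/negative pairs) are assumptions for illustration.

```python
def pairwise_auc(actual, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model that mostly ranks negatives above positives.
bad_actual = [1, 1, 0, 0, 0, 0, 0]
bad_scores = [0.35, 0.25, 0.9, 0.8, 0.6, 0.3, 0.2]

auc_before = pairwise_auc(bad_actual, bad_scores)
flipped_scores = [1 - s for s in bad_scores]  # flip every prediction
auc_after = pairwise_auc(bad_actual, flipped_scores)
print(round(auc_before, 2), round(auc_after, 2))
```

Only 3 of the 10 positive/negative pairs are ranked correctly before the flip, and the remaining 7 are ranked correctly after it, which matches the 0.3 and 0.7 in the text.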


Another important metric for classification is log-loss. This metric is based on probability scores.

Log loss penalizes any deviation of the predicted probability score from the actual class label.

It ranges from 0 to ∞. Best case is 0. The goal of our model is to minimize this value.

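For binary classification, log loss is calculated as

$$\text{log loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \log(p_i) + (1 - y_i)\log(1 - p_i) \,\right]$$

where N is the number of points, y_i is the actual class label (0 or 1) of point i, and p_i is the predicted probability that point i belongs to class 1.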
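As a rough sketch, binary log loss can be computed directly from the standard definition. The clipping constant `eps` is an assumption added here to avoid taking log(0); the function name `binary_log_loss` is also made up for this example. The data is the five-point example from earlier in the post.

```python
import math


def binary_log_loss(actual, probs, eps=1e-15):
    """Average negative log-likelihood of the actual labels under the predicted probabilities."""
    total = 0.0
    for y, p in zip(actual, probs):
        p = min(max(p, eps), 1 - eps)  # assumption: clip so that log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(actual)


# x1..x5 from the example table: actual_class and probability_score.
labels = [1, 0, 1, 1, 0]
probabilities = [0.97, 0.24, 0.62, 0.82, 0.17]
loss = binary_log_loss(labels, probabilities)
print(round(loss, 4))
```

A confident wrong prediction costs much more than a hesitant one: predicting 0.01 for an actual positive contributes -log(0.01), roughly 4.6, to the sum, while predicting 0.4 contributes only about 0.9.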

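For multiclass classification, log loss is defined as

$$\text{log loss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij} \log(p_{ij})$$

where N is the number of points in the test set, M is the number of classes, y_ij is 1 if point i belongs to class j and 0 otherwise, and p_ij is the predicted probability that point i belongs to class j.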
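A minimal sketch of the multiclass version follows. Because only the true class of each point has a label indicator of 1, the inner sum over classes reduces to the log of the probability assigned to the true class. The function name, the clipping constant `eps`, and the three-class data are all assumptions made for illustration.

```python
import math


def multiclass_log_loss(actual, probs, eps=1e-15):
    """actual: true class index per point; probs: one row of class probabilities per point."""
    total = 0.0
    for true_class, row in zip(actual, probs):
        p = max(row[true_class], eps)  # assumption: clip so that log(0) never occurs
        total += math.log(p)           # only the true-class term is non-zero
    return -total / len(actual)


# Hypothetical three-class predictions for three points.
true_classes = [0, 2, 1]
predicted_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
multi_loss = multiclass_log_loss(true_classes, predicted_probs)
print(round(multi_loss, 4))
```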


In this post, you learned about the AUC – ROC (Area Under the ROC curve) metric and log loss, which are essential classification metrics based on probability scores.

LOG LOSS: This metric uses the probability scores. It penalizes predicted probabilities that deviate from the actual class.

Since it has no upper bound, the results can be difficult to interpret.

AUC – ROC CURVE: The area under the ROC curve is mainly used in binary classification. A sensible model should have an AUC higher than 0.5.

If the model’s AUC is less than 0.5, then we can flip the binary decision of the model to get back a value greater than 0.5.
