
06/08/2021

Evaluation Metrics for Classification

A. Binary Classification


Confusion Matrix

The confusion matrix is a table with the number of correct and incorrect predictions broken down by class.

True Positive (TP): You predicted Positive and it's True (the sample is actually Positive)

True Negative (TN): You predicted Negative and it's True (the sample is actually Negative)

False Positive (FP, Type 1 Error): You predicted Positive and it's False (the sample is actually Negative)

False Negative (FN, Type 2 Error): You predicted Negative and it's False (the sample is actually Positive)

We describe the predicted value as Positive or Negative, and whether that prediction matches the actual value as True or False.
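As a minimal sketch (assuming scikit-learn is available; the labels and predictions are made up for illustration), the four counts can be read off a confusion matrix like this:

```python
from sklearn.metrics import confusion_matrix

# Made-up ground-truth labels and model predictions, for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels {0, 1}, scikit-learn returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1
```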


Accuracy

Accuracy is the ratio of the number of correct predictions to the total number of samples:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Accuracy should be as high as possible.
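A small illustration in plain Python, reusing the made-up counts from the confusion-matrix sketch above:

```python
# Accuracy = (TP + TN) / (TP + TN + FP + FN)
tp, tn, fp, fn = 3, 3, 1, 1  # illustrative counts, not real data
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.75
```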

Precision

Of all the samples we predicted as Positive, how many are actually Positive:

Precision = TP / (TP + FP)

Precision should be as high as possible.
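The same made-up counts give precision directly from the formula:

```python
# Precision = TP / (TP + FP)
tp, fp = 3, 1  # illustrative counts, not real data
precision = tp / (tp + fp)
print(precision)  # 0.75
```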

Recall/Sensitivity (True Positive Rate)

Of all the actual Positive samples, how many did we predict correctly:

Recall = TP / (TP + FN)

Recall should be as high as possible.
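And recall, again with the illustrative counts from above:

```python
# Recall = TP / (TP + FN)
tp, fn = 3, 1  # illustrative counts, not real data
recall = tp / (tp + fn)
print(recall)  # 0.75
```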

F1 Score

It is the harmonic mean of Precision and Recall:

F1 = 2 × (Precision × Recall) / (Precision + Recall)

The best value is 1, reached when Precision = Recall = 100%.

The worst value is 0.

It is difficult to compare two models when one has low precision and high recall, or vice versa. To make them comparable, we use the F-score, which combines both into a single number.
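A quick sketch computing F1 both from the formula and with scikit-learn's f1_score, using the same made-up labels as before:

```python
from sklearn.metrics import f1_score

# F1 = 2 * Precision * Recall / (Precision + Recall)
precision, recall = 0.75, 0.75  # illustrative values from the examples above
f1_manual = 2 * precision * recall / (precision + recall)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(f1_manual, f1_score(y_true, y_pred))  # both 0.75
```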

ROC Curve

A receiver operating characteristic (ROC) curve is a graphical plot that illustrates the diagnostic ability of a binary classifier as its discrimination threshold is varied. It plots the True Positive Rate (Recall) against the False Positive Rate, with each point on the curve corresponding to one threshold.

The area under the ROC curve (AUC) provides an aggregate measure of performance across all possible classification thresholds.
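A minimal sketch with scikit-learn (labels and predicted probabilities are made up): roc_curve traces the curve point by point as the threshold varies, and roc_auc_score summarizes it with a single number.

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Made-up labels and predicted probabilities, for illustration only
y_true   = [0, 0, 1, 1, 0, 1, 1, 0]
y_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.90, 0.30]

# Each (fpr, tpr) pair is one point on the ROC curve, at one threshold
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", roc_auc_score(y_true, y_scores))
```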
