Classification metrics
When performing classification predictions, there are four types of outcomes that can occur (the sketch after this list shows how to tally them).
- True positives are when you predict an observation belongs to a class and it actually does belong to that class.
- True negatives are when you predict an observation does not belong to a class and it actually does not belong to that class.
- False positives occur when you predict an observation belongs to a class when in reality it does not.
- False negatives occur when you predict an observation does not belong to a class when in fact it does.
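To make these concrete, here is a minimal sketch (with made-up labels, assuming 1 marks the positive class) that tallies the four outcome types:

```python
# Tally the four outcome types for a binary classifier.
# The labels and predictions here are made-up illustrative data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual class membership (1 = positive)
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]  # the model's predictions

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

print(tp, tn, fp, fn)  # 2 3 1 2
```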
Accuracy is defined as the percentage of correct predictions for the test data. It can be calculated easily by dividing the number of correct predictions by the total number of predictions.
Accuracy = Correct Predictions / All Predictions
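For example, a small sketch of this calculation on the made-up labels from above:

```python
# Accuracy is the share of predictions that match the true labels;
# the labels here repeat the made-up example above.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)
print(accuracy)  # 5 correct out of 8 -> 0.625
```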
Precision is defined as the fraction of relevant examples (true positives) among all of the examples which were predicted to belong in a certain class.
Precision = True Positives / (True Positives + False Positives)
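Using the counts tallied in the first sketch (these numbers come from the made-up example, not real data):

```python
# Precision from the counts above: of the 3 observations the model
# predicted as positive, 2 actually were positive.
tp, fp = 2, 1
precision = tp / (tp + fp)
print(precision)  # 2 / (2 + 1) = 0.666...
```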
Recall is defined as the fraction of examples that truly belong to a class which the model correctly predicted as belonging to that class.
Recall = True Positives / (True Positives + False Negatives)
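And likewise for recall, again with the counts from the made-up example:

```python
# Recall from the counts above: of the 4 observations that truly are
# positive, the model caught 2.
tp, fn = 2, 2
recall = tp / (tp + fn)
print(recall)  # 2 / (2 + 2) = 0.5
```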
The following graphic does a phenomenal job visualizing the difference between precision and recall.
Regression metrics
Evaluation metrics for regression models are quite different from the classification metrics discussed above, because we are now predicting values in a continuous range rather than one of a discrete set of classes. If your regression model predicts the price of a house to be $400K and it sells for $405K, that’s a pretty good prediction. In the classification examples, however, we were only concerned with whether a prediction was correct or incorrect; there was no way to say a prediction was “pretty good”. Thus, we need a different set of evaluation metrics for regression models.
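To make the contrast concrete, here is a minimal sketch of the house-price example above, where a prediction’s quality is a matter of degree rather than a binary right/wrong:

```python
# The house-price example: a regression prediction is judged by how
# far off it is, not by whether it exactly matches the true value.
predicted_price = 400_000  # the model's prediction
actual_price = 405_000     # the price the house actually sold for

error = abs(predicted_price - actual_price)
print(error)  # 5000 -- small relative to a $405K sale, so "pretty good"
```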