We have already talked about precision and recall in an earlier post. I just want to add a few more things after finishing a course, so this post is an extension of that one with some practical considerations.
We are claiming that accuracy may not always be a good measure. If you are building an automated machine learning system, you must be able to trust it.
Case Study
- You want to show positive reviews on your website.
- Say 90% of the reviews in your dataset are negative.
- A classifier can achieve 90% accuracy by predicting all of them as negative.
- But what you are actually interested in is finding the remaining 10% and displaying them on your website.
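Precision = Of the reviews I showed as positive, how many really are positive? (Did I show something negative?)
Recall = Of all the positive reviews, how many did I find? (How good am I at finding positive reviews?)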
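The case study can be checked with a few lines of Python. This is only a minimal sketch: the 100-review dataset is made up to match the 90/10 split described above, and `evaluate` is a hypothetical helper, not something from the course.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = positive review."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # positive, shown as positive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # negative, shown as positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # positive, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # negative, correctly hidden
    accuracy = (tp + tn) / len(pairs)
    # Precision is undefined when nothing is predicted positive, so return None.
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    recall = tp / (tp + fn) if (tp + fn) > 0 else None
    return accuracy, precision, recall

# Made-up dataset: 10 positive reviews and 90 negative reviews.
y_true = [1] * 10 + [0] * 90
# A classifier that predicts every review as negative.
all_negative = [0] * 100

accuracy, precision, recall = evaluate(y_true, all_negative)
print(accuracy, precision, recall)
```

The accuracy looks fine, but recall is zero: not a single positive review was found, so there is nothing to display on the website.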
Analogy with Optimist and Pessimist
- The optimist labels every (or almost every) review as positive
- Very good recall, but low precision
- The pessimist labels every (or almost every) review as negative
- Bad recall, but good precision on the few reviews it does call positive
Trade-off
- Trade-off comes while scoring, not while training
- We can assign labels based on probabilities
- A decision tree gives a probability from the number of positive and negative samples at the leaf node
- Logistic regression of course gives a probability directly
- We can change the threshold to trade off between precision and recall
- Positive only when prob is above a threshold close to 1 (say prob > 0.99) => Pessimist
- Positive whenever prob is above a threshold close to 0 (say prob > 0.01) => Optimist
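Here is a small sketch of how moving the threshold shifts the balance. The probabilities and labels are invented for illustration, and the helper names are my own; in practice the probabilities would come from your trained classifier.

```python
def label_with_threshold(probs, threshold):
    """Label a review positive (1) when its probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probs]

def precision_recall(y_true, y_pred):
    """Return (precision, recall); precision is None if nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    recall = tp / (tp + fn) if (tp + fn) > 0 else None
    return precision, recall

# Invented predicted probabilities of "positive" and the true labels.
probs = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10, 0.05]
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

# Low threshold = optimist, high threshold = pessimist.
results = {}
for threshold in (0.0, 0.5, 0.85):
    y_pred = label_with_threshold(probs, threshold)
    results[threshold] = precision_recall(y_true, y_pred)
    print(threshold, results[threshold])
```

On this toy data, raising the threshold pushes precision up and recall down. The model itself never changes, only the scoring rule does.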
A single number is not always useful
- I am not a great fan of single numbers like F1 score and AUC
- You cannot always choose a classifier just by AUC, because the ROC curves might intersect
- An intersection means that one classifier is better in one region of the curve and the other is better elsewhere
- But if they don’t intersect, we choose the one with the higher AUC
- From a business perspective we should be clear about whether we want more precision or more recall
- Another practical metric they talked about was precision at k
- Say I want to display 5 reviews on my website
- What is the precision among the 5 reviews I have chosen, i.e. the 5 with the highest predicted probability?
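Precision at k is easy to compute by hand. The sketch below assumes each review comes with a predicted probability of being positive; the scores and labels are invented, and `precision_at_k` is my own helper name.

```python
def precision_at_k(y_true, scores, k):
    """Fraction of truly positive reviews among the k highest-scored ones."""
    # Sort reviews by predicted score, highest first, and keep the top k.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]
    return sum(label for _, label in top_k) / k

# Invented scores and true labels (1 = positive review).
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10, 0.05]
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

# I want to display 5 reviews on my website.
p_at_5 = precision_at_k(y_true, scores, k=5)
print(p_at_5)
```

Here 3 of the 5 displayed reviews are really positive. This metric ignores everything below the top k, which is exactly what matters when only k slots are visible.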
