F1 [Imbalance]
Description
The F1 score is the harmonic mean of precision and recall, taking both metrics into account.
It is an important metric in the context of imbalanced datasets, as it considers both false positives and false negatives. It spans from 0 to 1, with 1 representing the optimal outcome.
- We use the harmonic mean instead of a simple average because it punishes extreme values
- A classifier with a precision of 1.0 and a recall of 0.0 has a simple average of 0.5 but an F1 score of 0, as worked through below
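Plugging the numbers from that example into the F1 formula (given in full under Formula below) shows how the harmonic mean collapses:
$$F_1 = 2 \cdot \frac{1.0 \cdot 0.0}{1.0 + 0.0} = 0$$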
F1 micro and F1 macro are two ways to compute the F1 score for multi-class or multi-label classification problems. They aggregate precision and recall differently, leading to different interpretations of the classifier's performance.
F1 Macro
F1 macro computes the F1 score for each class independently and then takes the average of those values. This approach treats each class as equally important and does not consider the class imbalance.
F1 macro is particularly useful when you want to evaluate the performance of a classifier across all classes without giving more weight to the majority class. However, it may be misleading when the class distribution is highly imbalanced, since a rare class influences the score as much as the majority class does.
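As a minimal sketch (the label arrays are made-up toy data), scikit-learn's `f1_score` with `average="macro"` computes exactly this unweighted per-class average:

```python
from sklearn.metrics import f1_score

# Made-up, imbalanced 3-class example: class 0 dominates.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 2, 0]

print(f1_score(y_true, y_pred, average=None))     # one F1 per class
print(f1_score(y_true, y_pred, average="macro"))  # their unweighted mean
```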
F1 Micro
F1 micro, on the other hand, aggregates the contributions of all classes to compute the F1 score. It does this by calculating the global precision and recall values across all classes and then computing the F1 score based on these global values. F1 micro takes class imbalance into account as it considers the number of instances in each class.
F1 micro is useful when you want to evaluate the overall performance of a classifier considering the class distribution, especially when dealing with imbalanced datasets.
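The same toy labels, micro-averaged. Note that for single-label multi-class problems, micro-F1 reduces to plain accuracy, because every misclassification contributes exactly one false positive and one false negative:

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 2, 0]

# TP/FP/FN are pooled across classes before a single F1 is computed.
print(f1_score(y_true, y_pred, average="micro"))
print(accuracy_score(y_true, y_pred))  # identical in the single-label case
```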
Formula
Normal:
$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
Macro:
$$F_1^{\text{macro}} = \frac{1}{N} \sum_{i=1}^{N} F_{1,i}$$
Micro:
The micro-averaged F1 score is based on the global precision and recall:
$$F_1^{\text{micro}} = 2 \cdot \frac{P_{\text{micro}} \cdot R_{\text{micro}}}{P_{\text{micro}} + R_{\text{micro}}}, \qquad P_{\text{micro}} = \frac{\sum_{i=1}^{N} TP_i}{\sum_{i=1}^{N} (TP_i + FP_i)}, \qquad R_{\text{micro}} = \frac{\sum_{i=1}^{N} TP_i}{\sum_{i=1}^{N} (TP_i + FN_i)}$$
Where:
- $F_{1,i}$ is the F1 score of class $i$ and $N$ is the number of classes
- $TP_i$, $FP_i$, $FN_i$ are the true positives, false positives, and false negatives of class $i$
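A from-scratch sketch of these formulas (the function name and toy labels are mine, assuming hashable class labels), which makes the two averaging schemes concrete:

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Compute macro- and micro-averaged F1 directly from the formulas above."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction: true positive for class t
        else:
            fp[p] += 1   # wrongly predicted class p: false positive for p
            fn[t] += 1   # missed class t: false negative for t

    def f1(tp_, fp_, fn_):
        precision = tp_ / (tp_ + fp_) if tp_ + fp_ else 0.0
        recall = tp_ / (tp_ + fn_) if tp_ + fn_ else 0.0
        return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # Macro: one F1 per class, then an unweighted average.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    # Micro: pool TP/FP/FN over all classes, then a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro

print(f1_scores([0, 0, 0, 1, 2], [0, 0, 1, 1, 0]))  # (0.444..., 0.6)
```

Pooling the counts before dividing is what makes the micro score sensitive to class frequencies, while the macro average weights every class equally no matter how rare it is.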