Gradient Boosting
Description
Gradient boosting is an ensemble method for classification and regression. It starts with a weak learner (e.g., a shallow decision tree) and improves the model iteratively, with each new learner trained to correct the errors of the previous ones.
Advantages:
- High accuracy
- Supports regression & classification
- Robust to outliers when paired with a suitable loss (e.g., Huber); many implementations also handle missing data
- Works with various loss functions
- Effective for high-dimensional data
Disadvantages:
- Prone to overfitting with too many trees
- Computationally expensive for large datasets
- Requires careful hyperparameter tuning (number of trees, learning rate, tree depth), as in the example below
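A minimal sketch of tuning those three hyperparameters with scikit-learn's GradientBoostingClassifier. The dataset and grid values are illustrative placeholders, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only to make the example self-contained.
X, y = make_classification(n_samples=500, random_state=0)

param_grid = {
    "n_estimators": [100, 300],   # number of trees
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],          # depth of each tree
}
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

Note the trade-off the grid explores: a smaller learning rate usually needs more trees to reach the same accuracy, but tends to generalize better.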
Workflow
At each iteration, the algorithm:
- Computes the negative gradient of the loss function with respect to the current predictions (the pseudo-residuals).
- Fits a regression tree to these pseudo-residuals.
- Adds the new tree's predictions to the model, scaled by a learning rate that controls its influence.
The final prediction is the initial estimate plus the learning-rate-scaled sum of all trees' outputs, as sketched below.
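A from-scratch sketch of this loop for regression with squared-error loss, where the negative gradient is simply the residual (target minus current prediction). The class and parameter names are illustrative, not from any particular library.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class SimpleGradientBoosting:
    def __init__(self, n_trees=100, learning_rate=0.1, max_depth=3):
        self.n_trees = n_trees
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.trees = []

    def fit(self, X, y):
        # Start from a constant prediction (the mean minimizes squared error).
        self.init_ = y.mean()
        pred = np.full(len(y), self.init_)
        for _ in range(self.n_trees):
            # Step 1: negative gradient of squared-error loss = residuals.
            residuals = y - pred
            # Step 2: fit a shallow regression tree to the pseudo-residuals.
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, residuals)
            # Step 3: nudge predictions toward the targets, scaled by the
            # learning rate so no single tree dominates.
            pred += self.learning_rate * tree.predict(X)
            self.trees.append(tree)
        return self

    def predict(self, X):
        # Final prediction: initial constant plus the scaled sum of all trees.
        pred = np.full(X.shape[0], self.init_)
        for tree in self.trees:
            pred += self.learning_rate * tree.predict(X)
        return pred
```

For other losses (e.g., log loss for classification), only step 1 changes: the residuals are replaced by the negative gradient of that loss.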