Hyperparameter Tuning
Description
Hyperparameter tuning is the process of finding the best settings for a model to achieve optimal performance.
Varieties
Hyperparameter tuning typically combines a search strategy with cross-validation (CV): each candidate combination of hyperparameter values is trained and evaluated, and the best one is selected. The process consists of the following steps:
- Define the hyperparameters and their search space: Identify the hyperparameters to optimize and specify their possible value ranges.
- Choose a search strategy: Select a method to explore the hyperparameter search space, such as:
  - Grid search: Systematically evaluates all possible hyperparameter combinations.
  - Random search: Samples random combinations within the search space.
  - Bayesian optimization: Uses a probabilistic model to guide the search, balancing exploration and exploitation (see the sketch after this list).
- Perform the search: Train a model using each candidate combination of hyperparameter values and evaluate its performance.
- Select the best hyperparameters: Choose the combination that achieves the best performance based on the evaluation metric.
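Grid search and random search are shown in the Example section below. For Bayesian optimization, one option is the third-party scikit-optimize package; the following is only a minimal sketch, assuming the same full_pipeline, housing, and housing_labels objects as in the Example (the package and its exact API are an assumption, not part of scikit-learn):

from skopt import BayesSearchCV        # third-party: scikit-optimize
from skopt.space import Integer

bayes_search = BayesSearchCV(
    full_pipeline,
    {
        "preprocessing__geo__n_clusters": Integer(3, 50),
        "random_forest__max_features": Integer(2, 20),
    },
    n_iter=10,                          # number of candidates proposed by the surrogate model
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=42,
)
bayes_search.fit(housing, housing_labels)
print(bayes_search.best_params_)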
Example
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# `preprocessing`, `housing`, and `housing_labels` are assumed to be defined earlier.
full_pipeline = Pipeline([
    ("preprocessing", preprocessing),
    ("random_forest", RandomForestRegressor(random_state=42)),
])

# Two parameter grids: GridSearchCV evaluates every combination in each one.
param_grid = [
    {"preprocessing__geo__n_clusters": [5, 8, 10], "random_forest__max_features": [4, 6, 8]},
    {"preprocessing__geo__n_clusters": [10, 15], "random_forest__max_features": [6, 8, 10]},
]

grid_search = GridSearchCV(full_pipeline, param_grid, cv=3, scoring="neg_root_mean_squared_error")
grid_search.fit(housing, housing_labels)
print(grid_search.best_params_)
print(grid_search.cv_results_)
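Beyond best_params_, a couple of other attributes of the fitted search are commonly useful; for example (these are standard GridSearchCV attributes):

# Best cross-validated score; negate to recover the RMSE because of the scoring convention.
print(-grid_search.best_score_)

# The best model, refitted on the whole training set (refit=True is the default).
final_model = grid_search.best_estimator_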
Info
The GridSearchCV process in this example:
- First dictionary: \(3 \times 3 = 9\) combinations (3 values each for n_clusters and max_features)
- Second dictionary: \(2 \times 3 = 6\) combinations (2 values for n_clusters, 3 for max_features)
- Total combinations: \(9 + 6 = 15\) parameter sets (see the check after this list)
- With 3-fold cross-validation: \(15 \times 3 = 45\) total training rounds
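A quick way to confirm the combination count is to check the length of cv_results_ (assuming grid_search has been fitted as in the example above):

# One entry per evaluated parameter combination; should print 15.
print(len(grid_search.cv_results_["params"]))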
from scipy.stats import randint
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

# `preprocessing`, `housing`, and `housing_labels` are assumed to be defined earlier.
full_pipeline = Pipeline([
    ("preprocessing", preprocessing),
    ("random_forest", RandomForestRegressor(random_state=42)),
])

# Distributions to sample from, instead of fixed value lists.
param_distribs = {
    "preprocessing__geo__n_clusters": randint(low=3, high=50),
    "random_forest__max_features": randint(low=2, high=20),
}

rnd_search = RandomizedSearchCV(
    full_pipeline,
    param_distributions=param_distribs,
    n_iter=10,                 # number of random combinations to try
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=42,
)
rnd_search.fit(housing, housing_labels)
print(rnd_search.best_params_)
print(rnd_search.cv_results_)
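The raw cv_results_ dictionary is hard to read; one common way to inspect it (assuming pandas is available) is to load it into a DataFrame and sort by mean test score:

import pandas as pd

# Summarize the search results, best (least negative) mean score first.
cv_res = pd.DataFrame(rnd_search.cv_results_)
cv_res = cv_res.sort_values("mean_test_score", ascending=False)
print(cv_res[["params", "mean_test_score"]].head())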
Info
RandomizedSearchCV is typically preferred over GridSearchCV for large hyperparameter spaces. Instead of exhaustively trying all combinations, it evaluates a fixed number of random combinations. This approach offers several advantages:
- Better exploration of continuous parameters: Can test many more distinct values per parameter (see the sketch after this list)
- Efficiency with less important parameters: Adding an irrelevant parameter doesn't multiply training time
- Flexible iteration control: Can specify any number of iterations, avoiding combinatorial explosion
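To illustrate the first point, scipy.stats distributions can describe continuous parameters directly. A minimal sketch, assuming the same full_pipeline, housing, and housing_labels as above; the uniform(0.2, 0.8) range for max_features (here a fraction of the features) is an illustrative choice:

from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV

param_distribs = {
    "preprocessing__geo__n_clusters": randint(low=3, high=50),
    # max_features as a continuous fraction of the features, sampled in [0.2, 1.0)
    "random_forest__max_features": uniform(loc=0.2, scale=0.8),
}
rnd_search_cont = RandomizedSearchCV(
    full_pipeline,
    param_distributions=param_distribs,
    n_iter=10,
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=42,
)
rnd_search_cont.fit(housing, housing_labels)
print(rnd_search_cont.best_params_)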