Hyperparameter Tuning In Machine Learning

Introduction

Hyperparameters are the settings that control the learning process of a machine learning algorithm. Unlike the model's own parameters, which are learned from the data, they are set before training starts. Hyperparameter tuning is the process of selecting values for these settings so that the algorithm can learn the patterns in the data efficiently and produce a well-performing model.
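
To make the difference between hyperparameters and learned parameters concrete, here is a minimal sketch. It assumes scikit-learn is installed and uses a synthetic dataset from make_classification, which is a stand-in and not the data used later in this article. The variable names are only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption, not the article's dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# Hyperparameters: values we choose before training
clf_demo = LogisticRegression(C=0.5, max_iter=1000)
hyperparams = clf_demo.get_params()
print(hyperparams["C"], hyperparams["max_iter"])

# Learned parameters: values estimated from the data during fit()
clf_demo.fit(X_demo, y_demo)
print(clf_demo.coef_.shape, clf_demo.intercept_.shape)
```

Here C and max_iter are hyperparameters, while coef_ and intercept_ only exist after the model has been fitted.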

Why do we need to perform hyperparameter tuning?

A machine learning algorithm may need different constraints or weights to identify the patterns present in different datasets. A model trained with the default hyperparameters may not suit every kind of data. Selecting good hyperparameters is essential because they determine how the algorithm learns and how well it performs. With hyperparameter tuning, we can choose values that let the model make good predictions and perform well enough to solve the problem.

Things on hyperparameter tuning one should know

  • Hyperparameter tuning is computationally expensive
  • It often gives only a small improvement in model performance
  • It is time-consuming

Hyperparameter tuning using GridSearchCV

GridSearchCV tries every combination of the parameter values it is given. For each combination it trains the model and computes a cross-validated score, which is accuracy by default for a classifier. Because every combination is evaluated, this method is time-consuming and computationally very expensive.
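
To get a feel for how quickly the cost grows, the sketch below counts the combinations in a small grid with scikit-learn's ParameterGrid. The grid values and the fold count are assumptions chosen for illustration only.

```python
from sklearn.model_selection import ParameterGrid

# A small illustrative grid (hypothetical values)
grid = {
    "penalty": ["l1", "l2"],
    "C": [0.1, 1.0, 10.0],
    "solver": ["liblinear", "saga"],
}

# Every combination of the listed values
combos = list(ParameterGrid(grid))
n_folds = 5  # assumed number of cross-validation folds

print(len(combos))            # 2 * 3 * 2 = 12 combinations
print(len(combos) * n_folds)  # 60 model fits during the search
```

Adding one more value to any list multiplies the number of fits, which is why large grids become expensive.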

For demonstration, I’ll use Jupyter Notebook and the heart disease prediction dataset from Kaggle.

import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the data; the last column is the target
df = pd.read_csv('heart.csv')
X = df.iloc[:, :-1]
y = df.iloc[:, -1]

# Hold out 20% of the rows for testing (no random_state, so the split varies between runs)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)

# Baseline: logistic regression with default hyperparameters
LR = LogisticRegression()
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)

Output

0.819672131147541

Logistic Regression gives an accuracy of about 82% without hyperparameter tuning. Let’s take a look at how the accuracy of the model changes after hyperparameter tuning.

Implementation of GridSearchCV

from sklearn.model_selection import GridSearchCV
parameters = {
    'penalty' : ['l1', 'l2', 'elasticnet', 'none'],
    'C' : [0.8, 0.9, 1.0, 1.2, 1.4],
    'solver': ['newton-cg','lbfgs', 'liblinear','sag', 'saga']
}
LR = LogisticRegression()
clf = GridSearchCV(LR, parameters)
clf.fit(X, y)
clf.best_params_

Output

{'C': 0.9, 'penalty': 'l2', 'solver': 'newton-cg'}

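These are the parameters GridSearchCV selected for this model. Note that the search above was fit on the full X and y, which include the test rows, so the comparison below is somewhat optimistic; fitting the search on X_train and y_train avoids this. Now let’s train the model with these parameters and see by how much the accuracy improves.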

LR = LogisticRegression(C = 0.9, penalty =  'l2', solver = 'newton-cg')
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)

Output

0.8360655737704918

After hyperparameter tuning, the accuracy of the model has increased from 82.0% to 83.6%.
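
One thing to be aware of: the grid above pairs every penalty with every solver, but LogisticRegression does not support all of those pairs (for example, 'l1' with 'newton-cg'). In recent scikit-learn versions such fits are reported as failed and scored as NaN by default instead of stopping the search. A possible alternative, sketched below under assumptions, is to pass a list of dictionaries so that only compatible pairs are tried, and to fit the search on the training split only. The data is synthetic, the names are illustrative, and the parameter names follow the scikit-learn version used in this article.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (an assumption, not heart.csv)
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Each dictionary is its own sub-grid, so only compatible pairs are combined
param_grid = [
    {"solver": ["liblinear"], "penalty": ["l1", "l2"], "C": [0.8, 1.0, 1.2]},
    {"solver": ["lbfgs", "newton-cg"], "penalty": ["l2"], "C": [0.8, 1.0, 1.2]},
]

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_tr, y_tr)  # the test rows are never seen by the search

print(search.best_params_)
print(search.best_score_)        # mean cross-validated accuracy on the training split
print(search.score(X_te, y_te))  # accuracy of the refitted best model on held-out rows
```

After fitting, search is itself a usable model: it has been refitted on the whole training split with the best parameters, so there is no need to build a new LogisticRegression by hand.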

Hyperparameter tuning using RandomizedSearchCV

RandomizedSearchCV trains and scores the model on only a randomly selected subset of the parameter combinations. This makes it less time-consuming than GridSearchCV. Its main disadvantage is that the parameters it returns may not be the best ones available, because only some of the combinations are ever evaluated.

Note: We will use the same model and the same data for hyperparameter tuning with RandomizedSearchCV.

Implementation of RandomizedSearchCV

from sklearn.model_selection import RandomizedSearchCV
clf = RandomizedSearchCV(LR, parameters, n_iter= 6)
clf.fit(X, y)
clf.best_params_

Output

{'solver': 'newton-cg', 'penalty': 'l2', 'C': 1.2}

n_iter sets the number of parameter combinations that are sampled and evaluated. Let’s use these parameters and check the accuracy of the model.

LR = LogisticRegression(C = 1.2, penalty =  'l2', solver = 'newton-cg')
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)

Output

0.8360655737704918

So, the accuracy of the model has improved from 82.0% to 83.6%, the same result we got with GridSearchCV. Now let’s repeat the search with n_iter reduced to 2.

from sklearn.model_selection import RandomizedSearchCV
clf = RandomizedSearchCV(LR, parameters, n_iter= 2)
clf.fit(X, y)
clf.best_params_

Output

{'solver': 'saga', 'penalty': 'l1', 'C': 1.4}

Using only 2 iterations, we got these as the best parameters. Let’s use them to evaluate the model.

LR = LogisticRegression(C = 1.4, penalty =  'l1', solver = 'saga')
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)

Output

0.6721311475409836

Using the parameters obtained above has degraded the accuracy of the model from 82.0% to 67.2%. So RandomizedSearchCV does not always end up with the best parameters for the model, especially when only a few combinations are sampled. n_iter must be chosen wisely to get better parameters.
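
The effect of n_iter can be explored with a small sketch like the one below. It is written under assumptions: the data is synthetic, run_search is a hypothetical helper, C is drawn from a continuous loguniform distribution from SciPy instead of a fixed list, and random_state is set so the sampled combinations are repeatable.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (an assumption, not heart.csv)
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

# Lists are sampled uniformly; distributions are sampled through their rvs() method
param_distributions = {
    "C": loguniform(1e-2, 1e2),
    "solver": ["lbfgs", "newton-cg", "liblinear"],
}

def run_search(n_iter):
    """Hypothetical helper: run a randomized search that samples n_iter combinations."""
    search = RandomizedSearchCV(
        LogisticRegression(max_iter=1000),
        param_distributions,
        n_iter=n_iter,
        cv=5,
        random_state=0,  # makes the sampled combinations repeatable
    )
    search.fit(X_demo, y_demo)
    return search

small = run_search(2)
large = run_search(20)

print(small.best_params_, small.best_score_)
print(large.best_params_, large.best_score_)
```

A larger n_iter gives the search more chances to land on a good combination, at the cost of more model fits. Comparing best_score_ across a few values of n_iter is one way to judge when extra iterations stop paying off.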

Conclusion

Hyperparameter tuning is an important step in machine learning. It is used to select hyperparameters that let the algorithm learn the patterns in the data and perform well on the problem. GridSearchCV evaluates every combination of the supplied parameter values and selects the best one, while RandomizedSearchCV evaluates a random sample of the combinations and selects the best among them. GridSearchCV is more computationally expensive than RandomizedSearchCV, but it always finds the best-scoring combination within the grid it was given.

Happy Learning 🙂
