Keras comes with a long list of predefined callbacks that are ready to use. Keras callbacks are functions that are executed at specific stages of the training process.
According to the Keras documentation: "A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training. You can pass a list of callbacks (as the keyword argument callbacks) to the .fit() method of the Sequential or Model classes. The appropriate methods of the callbacks will then be called at each stage of the training."
In this article, we’ll discuss some of the commonly used callbacks in Keras.
MODEL CHECKPOINT:
The first callback we are going to discuss is the model checkpoint.
Sometimes, training a deep neural network might take days. We don’t want to lose all our progress if there’s a power outage.
By using the model checkpoint callback, we can save our model at regular intervals.
The model checkpoint callback saves the weights of the model, along with its structure, whenever we make progress, such as reaching a new best validation accuracy or loss. We can also save the model after every epoch.
The signature of the callback is as follows:
ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False,
                save_weights_only=False, mode='auto', period=1)
- monitor: quantity to be monitored
- save_best_only: if True, only the best model according to the monitored quantity is saved (the previous checkpoint is overwritten when the metric improves)
- save_weights_only: if True, only the weights of the model are saved; otherwise the full model (structure plus weights) is saved
- mode: one of {auto, min, max}. With save_best_only=True, the current saved model is overwritten based on minimizing (min) or maximizing (max) the monitored quantity; in auto mode, the direction is inferred automatically from the name of the monitored quantity
- period: number of epochs between checkpoints
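For instance, a minimal sketch that saves only the best model by validation loss; the filename pattern here is just illustrative, and the {epoch:02d} and {val_loss:.2f} placeholders are filled in by Keras at save time:

from keras.callbacks import ModelCheckpoint

# illustrative path; the placeholders are expanded from the current epoch and logs
ckpt = ModelCheckpoint('weights-{epoch:02d}-{val_loss:.2f}.hdf5',
                       monitor='val_loss', save_best_only=True, verbose=1)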
EARLY STOPPING:
By stopping training early, we can prevent our model from overfitting.
For instance, our model might keep reducing its loss on the training data while its loss on the validation data keeps increasing. This is a classic sign of overfitting.
By using the early stopping callback, we can monitor a specific metric like validation loss or accuracy. As soon as the chosen metric stops improving for a fixed number of epochs, training is stopped.
EarlyStopping(monitor='val_loss', min_delta=0, patience=0, mode='auto')
- min_delta: minimum change in the monitored quantity to qualify as an improvement
- patience: number of epochs with no improvement in the monitored quantity after which training will be stopped
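For example, a minimal sketch (the values are illustrative) that stops training once the validation loss has failed to improve by at least 0.001 for five consecutive epochs:

from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', min_delta=0.001,
                           patience=5, verbose=1)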
REDUCE LR ON PLATEAU:
This callback can be used to reduce the learning rate of our model when the monitored metric has plateaued, i.e., when the loss has stopped improving.
ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5,
                  mode='auto', min_delta=0.0001)
- factor: the factor by which the learning rate is multiplied once a plateau is detected (new_lr = lr * factor); for example, a factor of 0.1 divides the learning rate by 10
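As a sketch (values are illustrative), the following halves the learning rate whenever the validation loss stagnates for three epochs, with min_lr acting as a floor below which the rate is never reduced:

from keras.callbacks import ReduceLROnPlateau

# factor=0.5 halves the learning rate; min_lr is a lower bound
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                              patience=3, min_lr=1e-6, verbose=1)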
CSV LOGGER:
This callback streams the per-epoch results of the model (loss and metrics) to a CSV file.
CSVLogger(filename, separator=',', append=False)
- filename: name of the CSV file
- append: if True, append to the file if it already exists (useful for continuing training); if False, overwrite it
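Because the log is a plain CSV file, it is easy to inspect after training. A quick sketch using pandas (assuming it is installed; the exact column names depend on the metrics you compiled the model with):

import pandas as pd

log = pd.read_csv('training.log')
print(log[['epoch', 'loss', 'val_loss']].tail())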
TENSORBOARD:
TensorBoard is a visualization toolkit that can be used to troubleshoot or monitor our machine learning algorithms.
It offers a suite of tools for visualizing the network's computational graph as well as the model's training progress.
Sometimes it might be helpful to keep track of network parameters while training a neural network. With the help of TensorBoard, we can visualize metrics like loss, accuracy, and batch training time over iterations.
In Keras, we can use the TensorBoard callback to save these logs and analyze them.
TensorBoard(log_dir='./logs', write_graph=True, update_freq='epoch')
- log_dir: path to save the log files
- write_graph: if True, visualize the computational graph in TensorBoard
- update_freq: ‘batch’ or ‘epoch’ or integer. When using ‘batch’, writes the losses and metrics to TensorBoard after each batch. The same applies to ‘epoch’. If using an integer, let’s say 10000, the callback will write the metrics and losses to TensorBoard every 10000 samples.
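A common pattern (just a convention, not required by Keras) is to give each run its own timestamped log directory, so that different runs can be compared side by side in TensorBoard:

import time
from keras.callbacks import TensorBoard

log_dir = './logs/run-{}'.format(int(time.time()))  # one subdirectory per run
tb = TensorBoard(log_dir=log_dir, write_graph=True, update_freq='epoch')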
TensorBoard can be launched from the command line as follows:
tensorboard --logdir=/full_path_to_your_logs
CUSTOM CALLBACK:
Along with the list of predefined callbacks that are ready to use, we can also create custom callbacks in Keras.
We can create a simple custom callback using the LambdaCallback.
The signature of the lambda callback is as follows.
LambdaCallback(on_epoch_begin=None, on_epoch_end=None, on_batch_begin=None,
               on_batch_end=None, on_train_begin=None, on_train_end=None)
- on_epoch_begin: called at the beginning of every epoch (expects two positional arguments: epoch, logs).
- on_epoch_end: called at the end of every epoch (expects epoch, logs).
- on_batch_begin: called at the beginning of every batch (expects batch, logs).
- on_batch_end: called at the end of every batch (expects batch, logs).
- on_train_begin: called at the beginning of model training (expects logs).
- on_train_end: called at the end of model training (expects logs).
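For instance, a throwaway sketch that prints the loss at the end of every epoch:

from keras.callbacks import LambdaCallback

print_loss = LambdaCallback(
    on_epoch_end=lambda epoch, logs: print('epoch', epoch + 1,
                                           'loss:', logs['loss']))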
Now, let's do some coding.
First, we'll import all the necessary packages and the dataset. We'll use the Fashion-MNIST dataset.
import keras
import matplotlib.pyplot as plt
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.callbacks import CSVLogger
from keras.callbacks import ReduceLROnPlateau
from keras.callbacks import TensorBoard
from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
from keras.utils import np_utils
from keras.datasets import fashion_mnist
Next, let’s build our CNN model.
# https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py
batch_size = 128
num_classes = 10
epochs = 15

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)

# Building our CNN
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

# compile the model
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
Once we've defined our model, we can create the callbacks:
filepath = 'C:/Users/Niranjan/Desktop/keras/weights-{epoch:02d}-{val_acc:.2f}.hdf5'
early_stop = EarlyStopping(monitor='val_loss', patience=7, verbose=1, mode='auto')
model_ckpt = ModelCheckpoint(monitor='val_loss', save_best_only=True, verbose=1,
                             mode='auto', filepath=filepath)
csv_log = CSVLogger('training.log', append=False)
rlrp = ReduceLROnPlateau(monitor='val_loss', factor=0.1, mode='min',
                         patience=5, verbose=1)
tb = TensorBoard(log_dir='./logs', histogram_freq=0, write_graph=True,
                 write_images=False)
We'll create a custom callback that stops the training of the model once the validation accuracy reaches 90%.
def on_epoch_end(_, logs):
    THRESHOLD = 0.90
    # note: newer versions of Keras/TensorFlow log this metric as 'val_accuracy'
    if logs['val_acc'] > THRESHOLD:
        model.stop_training = True

stop_train = LambdaCallback(on_epoch_end=on_epoch_end)
Finally, we pass the callbacks to the model's fit() method as a list.
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.2,
                    callbacks=[early_stop, model_ckpt, csv_log,
                               rlrp, tb, stop_train])
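Since fit() returns a History object, we can also plot the training curves afterwards with matplotlib (already imported above); a quick sketch:

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.legend()
plt.show()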
Below is the complete code:
import keras
import matplotlib.pyplot as plt
from keras.callbacks import EarlyStopping, ModelCheckpoint, CSVLogger, ReduceLROnPlateau, TensorBoard, LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
from keras.utils import np_utils
from keras.datasets import fashion_mnist

batch_size = 128
num_classes = 10
epochs = 50

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)

# Building our CNN
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

# compile the model
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

# defining our callbacks
filepath = 'C:/Users/Niranjan/Desktop/python/BLOG/KERAS/weights-{epoch:02d}-{val_acc:.2f}.hdf5'
early_stop = EarlyStopping(monitor='val_loss', patience=7, verbose=1, mode='auto')
model_ckpt = ModelCheckpoint(monitor='val_loss', save_best_only=True, verbose=1,
                             mode='auto', filepath=filepath)
csv_log = CSVLogger('training.log', append=False)
rlrp = ReduceLROnPlateau(monitor='val_loss', factor=0.1, mode='min',
                         patience=5, verbose=1)
tb = TensorBoard(log_dir='./logs', histogram_freq=0, write_graph=True,
                 write_images=False)

# custom callback: stop training once validation accuracy crosses the threshold
def on_epoch_end(_, logs):
    THRESHOLD = 0.90
    # note: newer versions of Keras/TensorFlow log this metric as 'val_accuracy'
    if logs['val_acc'] > THRESHOLD:
        model.stop_training = True

stop_train = LambdaCallback(on_epoch_end=on_epoch_end)

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.2,
                    callbacks=[early_stop, model_ckpt, csv_log,
                               rlrp, tb, stop_train])

score = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
CONCLUSION:
In this article, we discussed some of the most commonly used callbacks in Keras.
This is not the complete list of callbacks that Keras offers; I'd encourage you to visit the Keras documentation to take a look at all the available callbacks.