How do we know when convergence has been achieved? Presumably it’s the difference between successive iterations, but at what difference do we decide the process has converged? Is it a difference of 0.01, 0.0001, etc.? Is there a rule of thumb for determining what this threshold should be?
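In practice, a common criterion is to stop when the change between successive iterates falls below a small tolerance; the exact value is problem-dependent and is usually chosen relative to the scale of the quantity being optimized. Here is a minimal sketch of such a check using gradient descent on a simple function (the function, learning rate, and tolerance are illustrative assumptions, not values from the course):

```python
# Illustrative sketch: stop iterating once the change between successive
# values falls below a chosen tolerance (the "threshold" in question).

def gradient_descent(grad, x0, lr=0.1, tol=1e-6, max_iters=10_000):
    """Iterate x <- x - lr * grad(x) until |x_new - x| < tol."""
    x = x0
    for _ in range(max_iters):
        x_new = x - lr * grad(x)
        if abs(x_new - x) < tol:  # convergence criterion
            return x_new
        x = x_new
    return x  # tolerance never reached within max_iters

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3); the minimum is at x = 3
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

A tighter tolerance gives a more precise answer at the cost of more iterations; a looser one stops sooner but less accurately.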
When training a neural network, we usually compute a training loss and a validation loss. The training loss might continue to decrease, but at some point the validation loss starts to increase. This is where we should stop the training process to avoid overfitting the model.
One way to avoid model overfitting is to limit the number of training iterations (epochs). Another way is to use early stopping, which is an automatic mechanism for halting the training process as soon as the validation loss begins to increase. We will discuss these concepts in greater detail later in the course.
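The early-stopping idea above can be sketched as follows. This is a minimal, self-contained illustration with a mocked list of validation losses; the `min_delta` and `patience` parameters are common conventions (e.g., in deep-learning libraries), not values prescribed by the course:

```python
# Illustrative sketch of patience-based early stopping on validation loss.

def early_stop_epoch(val_losses, min_delta=1e-4, patience=3):
    """Return the epoch at which training would stop.

    min_delta: smallest decrease in validation loss that counts as an
               improvement (this plays the role of the stopping threshold).
    patience:  number of epochs without improvement tolerated before stopping.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if best_loss - loss > min_delta:      # meaningful improvement
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch                  # validation loss has stopped improving
    return len(val_losses) - 1                # never triggered; ran all epochs

# Validation loss decreases, then starts to increase -> training stops early
losses = [0.90, 0.60, 0.45, 0.40, 0.41, 0.43, 0.47, 0.52]
stop_epoch = early_stop_epoch(losses)  # stops at epoch 6
```

In practice, frameworks such as Keras provide this as a built-in callback (`EarlyStopping`), so you rarely need to write the loop yourself.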
The 365 Team