Question 27/30 fast.ai v3 lecture 10
What is a major limitation of BatchNorm?
It cannot be used for online training (batch size of 1): the variance of a single example is zero, so the normalization collapses and the activation loses all information about the input. More generally, whenever the batch size is small, the per-batch statistics are too noisy, so training is either impossible or unstable. BatchNorm is also problematic for an RNN - how do you normalize a batch where each sequence can contain a variable number of words and where the weights are tied across timesteps?
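A minimal sketch of why batch size 1 breaks BatchNorm, using a hand-rolled forward pass (not the fast.ai or PyTorch implementation): with one example, the batch mean equals the example itself, so the centered input is zero everywhere.

```python
import numpy as np

def batchnorm_forward(x, eps=1e-5):
    # Normalize each feature using statistics computed over the batch axis.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

# With a batch of one example, the batch mean equals the example itself,
# so x - mean is zero and the variance is zero: the output is all zeros
# no matter what the input was -- the layer destroys the signal.
single = np.array([[2.0, -3.0, 7.0]])
print(batchnorm_forward(single))  # -> [[0. 0. 0.]]
```

With a larger batch the same function behaves sensibly, which is why the failure only appears in the online / small-batch regime.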
Relevant part of lecture