Calling `backward()` on the result of a calculation backpropagates the gradients. What happens if we call
`backward()` multiple times in succession?
Relevant part of lecture
Each call to `backward()` accumulates (sums) the new gradients into the `.grad` attribute of every leaf tensor rather than overwriting them. The flip side is that if you forget to zero the gradients between steps, they will keep growing indefinitely (initially your model might train, but at some point the training will diverge).
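A minimal sketch of this accumulation behavior (the tensor `x` and the computation `y = x ** 2` are illustrative choices, not from the lecture):

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# Run the same computation twice, calling backward() each time.
# dy/dx = 2x = 4, but the gradients are summed into x.grad.
for _ in range(2):
    y = x ** 2
    y.backward()

print(x.grad)  # tensor(8.) — 4 + 4, accumulated across the two calls

# Zeroing the gradient before the next backward pass restores
# the single-step value, which is what optimizer.zero_grad() does
# for all parameters during training.
x.grad.zero_()
y = x ** 2
y.backward()
print(x.grad)  # tensor(4.)
```

Note that the graph is rebuilt on each iteration; calling `backward()` twice on the *same* result without `retain_graph=True` would instead raise a runtime error.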