What is the difference between Gradient Descent and Stochastic Gradient Descent?

Answer

Gradient Descent computes the gradient over the entire dataset and then performs a single parameter update. Stochastic Gradient Descent (SGD) instead computes gradients on small batches of the data (strictly, on a single example at a time; the batched variant is often called mini-batch SGD) and updates the model's parameters after each batch before moving on to the next. So for each pass through the dataset (an epoch), Gradient Descent updates the parameters once, while SGD performs as many updates as there are batches.
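
The contrast can be sketched on a toy linear-regression problem (all names here are illustrative, not from any particular library): GD takes one step per epoch on the full dataset, while mini-batch SGD takes one step per batch.

```python
import numpy as np

# Toy problem: learn w so that X @ w ≈ y (no noise, for simplicity).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

def gradient(Xb, yb, w):
    # Gradient of the mean squared error (1/n) * ||Xb @ w - yb||^2 w.r.t. w.
    return 2 / len(Xb) * Xb.T @ (Xb @ w - yb)

# Gradient Descent: one parameter update per epoch, using the full dataset.
w_gd = np.zeros(3)
for epoch in range(500):
    w_gd -= 0.1 * gradient(X, y, w_gd)

# Mini-batch SGD: one update per batch, so many updates per epoch.
w_sgd = np.zeros(3)
batch_size = 10
for epoch in range(50):
    idx = rng.permutation(len(X))  # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        w_sgd -= 0.1 * gradient(X[batch], y[batch], w_sgd)
```

Both runs above perform 500 updates in total, but GD needs 500 epochs for that while SGD needs only 50, since each epoch yields 10 batch updates.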