**residuals**).

We then train another simple model, but this time we use the residuals as targets. We repeat this over and over: train a small model, update the residuals with its predictions, train another small model on what is left, and so on until we reach a stopping criterion (for tree learners, that could be a maximum number of trees). This leaves us with a bunch of models whose predictions we do not average, but sum.
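To make the loop concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not from the text above: the "simple model" is a one-split regression stump on a single feature, the residual is the plain difference between target and current prediction, the toy data is a noiseless sine curve, and no learning rate is applied (real libraries usually shrink each model's contribution). The names `fit_stump`, `boost`, and `predict` are made up for illustration.

```python
import math

def fit_stump(xs, targets):
    """Fit a one-split regression stump on 1-D inputs by minimising squared error."""
    best = None
    # Skipping the smallest value guarantees both sides of every split are non-empty.
    for threshold in sorted(set(xs))[1:]:
        left = [t for x, t in zip(xs, targets) if x < threshold]
        right = [t for x, t in zip(xs, targets) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean

def boost(xs, ys, n_models=20):
    """Train small models in sequence, each one on the residuals left so far."""
    models = []
    predictions = [0.0] * len(xs)  # before any model is trained, the residuals equal the targets
    for _ in range(n_models):      # stopping criterion: a fixed number of models
        residuals = [y - p for y, p in zip(ys, predictions)]
        model = fit_stump(xs, residuals)
        models.append(model)
        # Add (not average) the new model's output to the running prediction.
        predictions = [p + model(x) for x, p in zip(xs, predictions)]
    return models

def predict(models, x):
    """The ensemble prediction is the sum of all the small models' predictions."""
    return sum(model(x) for model in models)

def mse(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical toy data: a sine curve sampled on a grid.
xs = [i / 10 for i in range(40)]
ys = [math.sin(x) for x in xs]

models = boost(xs, ys)
mse_first = mse([models[0](x) for x in xs], ys)
mse_all = mse([predict(models, x) for x in xs], ys)
print(f"training MSE, first model only: {mse_first:.4f}")
print(f"training MSE, sum of all models: {mse_all:.4f}")
```

Each stump on its own is a poor fit, but because every new stump targets what the previous ones got wrong, the training error of the summed prediction can only stay the same or shrink as models are added.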