A language model is a model where we try to predict the next word of a sentence. A pretrained model can be useful in a similar fashion to how we would use a pretrained Imagenet model - a pretrained model knows a lot about what sentences look like, it might know some things about the world as well. We can leverage all this in the context of our task, irregardless of how much data we have in our training set.