Question 5/13 v3 lecture 12

XResNet is based on the Bag of Tricks paper. What are the main improvements that it incorporates?


The improvements that XResNet adds over ResNet are:

Relevant part of lecture

Supplementary material

Later in the lecture: fp16 sometimes trains a little better than fp32. Maybe it has some regularizing effect? Generally, the results are very close to each other.