v4 lecture 1

  1. What do you NOT need to do deep learning? (list 3 things)
  2. Are neural networks a recent invention?
  3. What are some of the components of a system that can learn?
  4. What can a sufficiently large network with one hidden layer learn?
  5. What is a good way to learn to play baseball?
  6. What is machine learning? How does it compare to regular programming?
  7. ML Jargon - list some common ML terms
  8. What is a positive feedback loop?
  9. What is the most important thing to do in the top-down approach to learning?

v4 lecture 2
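Question 5 below asks how to delete a file through pathlib; a minimal sketch (the temp directory and file name are illustrative):

```python
import tempfile
from pathlib import Path

# create a throwaway file inside a temp directory, then delete it
tmpdir = tempfile.mkdtemp()
p = Path(tmpdir) / "scratch.txt"   # illustrative file name
p.write_text("temporary contents")
assert p.exists()

p.unlink()                         # deletes the file; unlink(missing_ok=True) skips the error if absent
deleted = not p.exists()
```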

  1. What is the difference between a loss and a metric?
  2. Does using a validation set guarantee we will not overfit?
  3. Why transfer learning?
  4. What is p-value?
  5. What do we need to call on a pathlib.Path to delete the file?

v4 lecture 3

  1. When is it easier to clean your data - before or after training? Why?
  2. What are two common problems regarding data your model can encounter in prod?
  3. Can you collect unbiased data?
  4. What are the three steps of a careful deployment strategy?
  5. How to improve your model based on new data?
  6. What is one of the most challenging things (and why) when deploying ML models?
  7. When are you at risk of introducing a feedback loop?
  8. What is a very good way to prevent the introduction of feedback loops?
  9. Why should you start blogging? (list 3 reasons)
  10. What to blog about?
  11. Why should you start each project by creating a baseline?

v4 lecture 4
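Questions 1 and 3 below are about reshaping tensors and updating them without autograd tracking; a sketch of both (the learning rate 0.1 is an arbitrary example value):

```python
import torch

# Reshaping: view() returns a new shape over the same underlying data;
# a -1 dimension is inferred from the others.
t = torch.arange(12)
m = t.view(3, 4)                    # 3x4 view of the same 12 elements
m2 = t.view(2, -1)                  # -1 inferred as 6
same_data = m.reshape(12).tolist() == t.tolist()

# Updating without triggering gradient calculation: wrap the in-place
# change in torch.no_grad() so autograd does not record it.
w = torch.ones(3, requires_grad=True)
loss = (w * 2).sum()
loss.backward()                     # w.grad is now tensor([2., 2., 2.])
with torch.no_grad():
    w -= 0.1 * w.grad               # plain SGD-style step, untracked
updated = w.detach().tolist()       # each entry is 1 - 0.1*2 = 0.8
```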

  1. How do you reshape tensors in PyTorch?
  2. What is the difference between a metric and a loss?
  3. How do you change the value of a tensor in PyTorch without triggering gradient calculation?
  4. What is the difference between Gradient Descent and Stochastic Gradient Descent?
  5. What are the two advantages of presizing, an augmentation method for images?
  6. Using fastai, what is a good way of checking that you set up image augmentations correctly?
  7. How to list the steps (and their configuration) that a DataBlock applies to data?

v4 lecture 5

  1. When can feedback loops occur and what are their consequences?
  2. Are AI algorithms so resilient to errors that they do not need human supervision?
  3. What are 3 examples of unintended consequences of the tech that you build?
  4. In the Volkswagen diesel cheating case, who was the first person to end up in prison?
  5. How does bureaucracy assign responsibility and why is it relevant to AI?
  6. Is ethics culturally dependent?
  7. What is algorithmic colonialism?
  8. Can you absolutely trust the data you are gathering? Why?
  9. Why is it important to understand the issues around the use of metrics to track performance?
  10. What are our online environments designed to be like?
  11. What are the potential sources of bias in your data / modelling pipeline?
  12. What is representation bias?
  13. Often unethical behavior is driven by management - what can an individual engineer do?
  14. What is evaluation bias?
  15. What is historical bias?
  16. What is measurement bias?
  17. Does even a bit of diversity help?
  18. Why does algorithmic bias matter?
  19. How do we debias our data or ensure it is bias free?
  20. What is disinformation? (list the three things it involves)
  21. What is the new form of censorship?
  22. What is Ethical Risk Sweeping?
  23. What is Expanding the Ethical Circle?
  24. What is the Think About the Terrible People exercise about?
  25. What is the objective of the Closing the Loop: Ethical Feedback & Iteration activity?
  26. Why does having a diverse team make you much more effective?
  27. Why do we need policy (list 4 reasons) AND ethical industry behavior (list 2 reasons)?

v4 lecture 6
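Questions 4 and 5 below concern the fact that an embedding is a computational shortcut for a one-hot matrix multiply; a sketch showing the two give the same result (sizes are arbitrary):

```python
import torch

# Multiplying a one-hot row vector by a weight matrix selects exactly
# one row of that matrix -- which is what an embedding lookup returns
# directly, without the wasteful multiply.
vocab, dim = 5, 3
weights = torch.randn(vocab, dim)

idx = 2
one_hot = torch.zeros(vocab)
one_hot[idx] = 1.0

via_matmul = one_hot @ weights      # the slow, explicit version
via_lookup = weights[idx]           # the shortcut (what nn.Embedding does)

match = torch.allclose(via_matmul, via_lookup)
```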

  1. What is a Dataset in fastai?
  2. What is Datasets in fastai?
  3. In Computer Vision, how are the dimensions of an image given?
  4. In Deep Learning, what can a matrix lookup be replaced with?
  5. What is the name of the computational shortcut to a matrix lookup done through multiplication?

v4 lecture 7
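Question 3 below asks how to convey an ordinal variable to pandas; a minimal sketch using an ordered `CategoricalDtype` (the size labels are the question's own example):

```python
import pandas as pd

# An ordinal variable has a meaningful order (small < medium < big);
# an ordered CategoricalDtype tells pandas about that order.
sizes = pd.Series(["small", "big", "medium", "small"])
dtype = pd.CategoricalDtype(categories=["small", "medium", "big"], ordered=True)
sizes = sizes.astype(dtype)

# The order is now usable in comparisons and sorting.
is_ordered = sizes.dtype.ordered
bigger_than_small = (sizes > "small").tolist()
```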

  1. How does L2 regularization (weight decay) work?
  2. Using the sigmoid trick with a model that should be able to predict values up to 5 - what max value do you need to set?
  3. What do we call a variable that can be ordered? (e.g. small, medium, big). How do you convey this to pandas?
  4. A key technique to modern machine learning goes by the name of bagging. Who invented it? How does it work?
  5. Does adding more predictors (trees) to Random Forest increase the risk of overfitting?
  6. What is a partial dependence plot?
  7. What is the extrapolation problem in the context of Random Forests?
  8. In the context of a Random Forest, how do you find out-of-domain data?
  9. What is boosting?
  10. Can embeddings learned by neural networks be useful to other types of models (i.e. Random Forests or KNN classifiers)?

v4 lecture 8
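Question 7 below is about weight tying; a sketch of the usual PyTorch idiom, where the decoder reuses the input embedding's weight matrix (layer sizes are arbitrary):

```python
import torch.nn as nn

# Weight tying: the output (decoder) layer and the input embedding
# share one parameter tensor, halving those parameters and often
# improving language-model quality.
vocab, dim = 10, 4
embed = nn.Embedding(vocab, dim)            # weight shape: (vocab, dim)
decoder = nn.Linear(dim, vocab, bias=False) # weight shape: (vocab, dim)

decoder.weight = embed.weight               # tie: same Parameter object

tied = decoder.weight is embed.weight
```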

  1. What is a language model?
  2. What are the three approaches to tokenization?
  3. For NLP, is it better to unfreeze many layers at once or gradually unfreeze layers, one by one?
  4. What is a stateful RNN?
  5. What is activation regularization (AR)?
  6. What is temporal activation regularization (TAR)?
  7. What is weight tying?

v3 lecture 8
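Questions 4, 5 and 10 below touch on broadcasting and `squeeze()`; a numpy sketch of both behaviors (shapes are illustrative):

```python
import numpy as np

# Matrix-scalar comparison: the scalar is broadcast to the matrix's
# shape, producing an elementwise boolean array.
m = np.array([[1, 2], [3, 4]])
mask = m > 2
count_above = int(mask.sum())        # how many elements exceed 2

# squeeze() with no argument drops ALL size-1 dimensions, which can
# silently remove an axis you meant to keep; naming the axis is safer.
t = np.ones((1, 4, 1))
all_squeezed = t.squeeze().shape     # both unit dims dropped
one_squeezed = t.squeeze(axis=0).shape  # only the named dim dropped
```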

  1. What should you focus on when starting to construct a machine learning model?
  2. What does it mean for a model to overfit?
  3. What actions should you take when a model overfits?
  4. How does numpy perform matrix-scalar comparisons?
  5. What is the name of the procedure that matches the shape of one matrix to another?
  6. How do you change the dimensionality of a matrix?
  7. Which set (train, validation or test) should you use to calculate statistics for normalization?
  8. As demonstrated in research, what is the trick that enables training a 10_000 layer deep neural network?
  9. When initializing weights for multiplying relu activations, do you need to take any additional precautions?
  10. What should you be aware of when calling tensor.squeeze()?

v3 lecture 9
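Questions 4 and 5 below concern floating-point overflow and the LogSumExp trick; a pure-Python sketch (the input values are chosen to overflow a naive implementation):

```python
import math

# log(sum(exp(x_i))) overflows when any x_i is large, but for any a,
#   log(sum(exp(x_i))) == a + log(sum(exp(x_i - a)))
# Choosing a = max(x) keeps every exponent <= 0, so nothing overflows.
def logsumexp(xs):
    a = max(xs)
    return a + math.log(sum(math.exp(x - a) for x in xs))

xs = [1000.0, 1000.0]     # math.exp(1000.0) raises OverflowError on its own
val = logsumexp(xs)       # exactly 1000 + log(2)
```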

  1. What will be the effect of backpropagating the loss multiple times in PyTorch?
  2. How to take an arbitrary action when setting the value of an attribute on a Python object?
  3. Which way of calculating the cross entropy loss is faster?
  4. What issues can arise when running calculations with large floating point numbers?
  5. What is the benefit of the LogSumExp trick?
  6. How do iterators behave in Python?
  7. What does torch.randperm do?
  8. What do model.train() and model.eval() in PyTorch do?
  9. What is "proclaim_victory" in the code above?
  10. How to safely access an attribute that might not exist on a given instance in Python?
  11. How does __getattr__ work in Python?
  12. In Python, what is a decorator?
  13. How does monkey patching work in Python?

v3 lecture 10
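Questions 5 through 12 below ask about Python's "dunder" methods; a toy class (the `Bag` name and its behavior are invented for illustration) exercising several of them:

```python
class Bag:
    def __init__(self, items):
        self.items = list(items)
    def __len__(self):            # enables len(bag)
        return len(self.items)
    def __getitem__(self, i):     # makes the object subscriptable: bag[i]
        return self.items[i]
    def __call__(self, x):        # makes the object callable: bag(x)
        return x in self.items
    def __add__(self, other):     # implements bag + bag
        return Bag(self.items + other.items)
    def __repr__(self):           # unambiguous form that could recreate the object
        return f"Bag({self.items!r})"

b = Bag([1, 2]) + Bag([3])
length = len(b)
third = b[2]
has_two = b(2)
text = repr(b)
```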

  1. Assume you are not keeping up with everything that is covered in the course. Should you worry?
  2. Why are callbacks great for researchers?
  3. What is a closure?
  4. How can you set the value of some arguments to a function in Python without calling it?
  5. What method do you need to define to make a Python object callable?
  6. What method do you need to define to make a Python object subscriptable?
  7. How to specify a finalizer (a method that will be called when an instance is being destroyed)?
  8. What method do you need to define to specify entering the runtime context of an object?
  9. What is the name of the dunder method returning the length of an object?
  10. What method should ideally return a string representation of an object that could be used to recreate it?
  11. What is the method that should compute an "informal" or nicely printable string representation of an object?
  12. What do you need to define to implement additions between instances of a class?
  13. What does variance (informally) measure?
  14. What would taking the mean of the absolute values of differences between values in a tensor and their mean calculate?
  15. How is standard deviation calculated?
  16. Which is more sensitive to outliers, mean absolute deviation or standard deviation?
  17. Why do mathematicians and statisticians use standard deviation more often than mean absolute deviation?
  18. What is covariance?
  19. What does the product of differences between values in two tensors and their respective means calculate?
  20. What is Pearson correlation coefficient?
  21. When should you and shouldn't you use softmax?
  22. For classification, what is a good way of handling items in your dataset that don't belong to any of the classes?
  23. For BatchNorm, what are the normalization statistics calculated on during training?
  24. If you have a single channel image and are running a 3x3 conv on it, why is opting for 8 channels problematic?
  25. In a class inheriting from `nn.Module`, what does `self.register_buffer` do?
  26. How to calculate moving average with an exponential decay in PyTorch?
  27. What is a major limitation of BatchNorm?
  28. How does LayerNorm differ from BatchNorm?
  29. What are two ways of fixing the small batch problem that affects BatchNorm?
  30. In the context of an exponential moving average, what do debiasing and bias correction mean?

v3 lecture 11
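Question 7 below asks about PyTorch hooks; a sketch of a forward hook that records activation statistics (the layer sizes and the recorded statistic are illustrative):

```python
import torch
import torch.nn as nn

# A forward hook runs after a module's forward pass; here it records
# the mean of the layer's output, a common way to inspect activations.
stats = []
def hook(module, inputs, output):
    stats.append(output.mean().item())

layer = nn.Linear(4, 4)
handle = layer.register_forward_hook(hook)

layer(torch.ones(2, 4))
n_records = len(stats)      # the hook fired once

handle.remove()             # always remove hooks when you are done
layer(torch.ones(2, 4))
still_one = len(stats) == 1 # no longer firing after removal
```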

  1. What is LSUV (layerwise sequential unit variance)?
  2. What is a big part of getting good at using deep learning in any given domain?
  3. What Python tools allow you to work with files / directories in a way that is fast?
  4. What can you sort using `sorted`?
  5. How can you handle your model being run at inference on classes it has not seen during training?
  6. How do you create a class method in Python?
  7. What are PyTorch hooks?
  8. Does L2 regularization have a regularizing effect when used with BatchNorm?
  9. What is LAMB?
  10. When resizing an image for training, which downsampling algorithm should you use? Nearest or bilinear?
  11. Are operations faster on Byte or Float tensors?
  12. What is a fantastically useful data augmentation technique that can be applied across multiple domains?
  13. In general, does reflection padding help models?

v3 lecture 12
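Question 12 below asks how to split a tensor into a number of equal pieces; a sketch using `torch.chunk` (the tensor contents are arbitrary):

```python
import torch

# torch.chunk splits a tensor into a given number of pieces along a
# dimension; if the size doesn't divide evenly, the last piece is smaller.
t = torch.arange(12)
pieces = torch.chunk(t, 3)           # three tensors of 4 elements each
sizes = [p.numel() for p in pieces]

first = pieces[0].tolist()
```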

  1. Debugging machine learning code is hard. Does it make sense to follow strict formatting guidelines?
  2. What is mixup and what are its uses?
  3. What is label smoothing?
  4. When training with fp16, are all calculations done using half-precision floats?
  5. What are the main improvements (over ResNet) implemented in XResNet?
  6. What is the most important technique to apply to your code, regardless if you are doing research or moving a model to...
  7. What do you need to call on an `nn.Module` to export its parameters, registered buffers, etc to a dictionary of values?
  8. What is a very important aspect that is often overlooked when doing transfer learning?
  9. What are Jeremy Howard's tips for debugging deep learning?
  10. What is a scientific journal?
  11. What are the two activation functions used inside the LSTM cell?
  12. How to split a tensor into some number of pieces of equal size?
  13. Which state or states does the AWD-LSTM grab to perform classification?
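Question 7 above asks how to export an `nn.Module`'s parameters and registered buffers; a sketch using `state_dict()` (the model architecture is an arbitrary example):

```python
import torch.nn as nn

# state_dict() returns an ordered dict mapping names to tensors; it
# contains both learnable parameters and registered buffers (BatchNorm's
# running statistics are buffers, not parameters).
model = nn.Sequential(nn.Linear(3, 2), nn.BatchNorm1d(2))
sd = model.state_dict()

has_param = "0.weight" in sd          # Linear weight: a parameter
has_buffer = "1.running_mean" in sd   # BatchNorm running mean: a buffer
```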