Deadline: January 15th, 9am
Following this link, you will find a selection of CIFAR10 datasets. The archive contains triplets of training, validation and test sets. For each triplet, train a model on the training set, making sure it “works” by checking its performance on the validation set. When you’re satisfied (feel free to fine-tune your models to get the best validation performance you can), check performance on the test set. Does it still work? That is, is the performance on the test set close to that on the validation set? You should find that it is usually either significantly worse (which makes us sad) or too good (which is suspicious). For each triplet, find out (e.g. through inspection or computing statistics of the dataset) what’s going wrong. Typical things to watch out for include:
If you’re unfamiliar with .npz files, see here for reference.
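As a starting point for the inspection step, here is a minimal sketch of loading a pair of .npz files with NumPy and comparing a few simple statistics between splits. The file names (train.npz, test.npz) and the array keys ("x", "y") are assumptions made for illustration; the real archives may use different names, so list the keys first. The data below is synthetic and deliberately broken (leaked images, skewed test labels) so that the checks have something to find.

```python
import os
import tempfile

import numpy as np


def load_split(path):
    """Load one .npz archive into a plain dict of arrays."""
    with np.load(path) as archive:
        # archive.files lists the array names stored in the file.
        return {key: archive[key] for key in archive.files}


def label_distribution(labels, num_classes=10):
    """Fraction of examples belonging to each class."""
    counts = np.bincount(labels.astype(np.int64).ravel(), minlength=num_classes)
    return counts / counts.sum()


def count_shared_images(images_a, images_b):
    """Number of images in images_b that appear byte-for-byte in images_a."""
    seen = {img.tobytes() for img in images_a}
    return sum(img.tobytes() in seen for img in images_b)


# Synthetic stand-in data; the keys "x" and "y" are hypothetical.
rng = np.random.default_rng(0)
train_x = rng.integers(0, 256, size=(20, 32, 32, 3), dtype=np.uint8)
train_y = np.arange(20) % 10              # balanced labels
test_x = train_x[:5].copy()               # deliberately leaked from training
test_y = np.zeros(5, dtype=np.int64)      # deliberately skewed labels

with tempfile.TemporaryDirectory() as tmp:
    train_path = os.path.join(tmp, "train.npz")
    test_path = os.path.join(tmp, "test.npz")
    np.savez(train_path, x=train_x, y=train_y)
    np.savez(test_path, x=test_x, y=test_y)
    train = load_split(train_path)
    test = load_split(test_path)

print("keys:", sorted(train))
print("train images:", train["x"].shape, train["x"].dtype)
print("train label distribution:", label_distribution(train["y"]))
print("test label distribution: ", label_distribution(test["y"]))
print("per-channel mean (train):", train["x"].mean(axis=(0, 1, 2)))
print("per-channel mean (test): ", test["x"].mean(axis=(0, 1, 2)))
print("test images also in train:", count_shared_images(train["x"], test["x"]))
```

The same three checks (label distribution, pixel statistics, exact duplicates across splits) can be run on each pair of splits in a triplet. They will not catch every problem, but a large mismatch in any of them is a good place to start looking.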
If you have spare time, you can try to put some of the insights you’ve gained from the lecture on “practical methodology” into practice. You can use a dataset such as CIFAR as a testing ground for your models. Some possibilities:
See whether, by following somewhat principled procedures, you have an easier time building high-performing models compared to “just trying stuff”.