Reading Assignment: Advanced CNN Architectures

Main Reading

Start with the Bishop book, Sections 10.4 and 10.5. (The content of Section 10.3 will be covered later in the course. Section 10.6 is optional: it describes an early approach to style transfer that required solving an optimization problem for each input image, making it quite expensive compared to today's style transfer techniques.)

After these two sections on specific applications of CNNs, continue with the high-level overview provided in this blog post by Adit Deshpande. Anything beyond the section on region-based CNNs is optional. (GANs are covered in the "Learning Generative Models" course, and we will address Transformers later in this course.)

Next, read Densely Connected Convolutional Networks (2016) by Huang et al. There is a lot to learn from this paper, as the authors do a very good job of pointing out similarities and differences with many related approaches. You can skip the experiments in Section 4.
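The paper's central idea, dense connectivity, is easy to sketch: each layer receives the concatenation of all preceding feature maps and contributes a fixed number of new channels (the growth rate k). The following minimal NumPy sketch illustrates only this channel bookkeeping; the toy channel-mixing step stands in for the BN-ReLU-Conv composite function used in the paper, and all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_layer(features, growth_rate):
    """One 'layer' of a dense block: consume the concatenation of
    ALL previous feature maps, produce `growth_rate` new channels.
    The 1x1-style channel mixing below is a stand-in for the paper's
    BN-ReLU-Conv composite."""
    x = np.concatenate(features, axis=0)                # (C_in, H, W)
    w = rng.standard_normal((growth_rate, x.shape[0]))  # toy weights
    out = np.einsum('oc,chw->ohw', w, x)                # mix channels
    return np.maximum(out, 0)                           # ReLU

# Dense block: every new layer sees the outputs of every earlier layer.
k0, k, H, W = 8, 4, 16, 16        # initial channels, growth rate, spatial size
features = [rng.standard_normal((k0, H, W))]
for _ in range(3):
    features.append(dense_layer(features, k))

channels = [f.shape[0] for f in features]
print(channels)        # [8, 4, 4, 4] -- each layer adds k channels
print(sum(channels))   # 20: the input width a 4th layer would see (k0 + 3*k)
```

Note how the input width grows linearly with depth (k0 + l*k after l layers), which is why the paper can use very narrow layers (small k) yet still give later layers access to a rich, wide feature set.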

Optional Reading

Here are some papers that use (and extend) CNNs in various settings. They are all worth reading if you want to dive deeper into the topic.

  1. End-to-end Learning for Music Audio Tagging at Scale (2018), J. Pons et al.

  2. Xception: Deep Learning with Depthwise Separable Convolutions (2016), F. Chollet

  3. Language Modeling with Gated Convolutional Networks (2016), Y. N. Dauphin et al.

  4. You Only Look Once: Unified, Real-Time Object Detection (2016), J. Redmon et al.

  5. Harmonic Convolutional Networks based on Discrete Cosine Transform (2021), M. Ulicny et al.


Further Optional Reading

If you would like to dig deeper, here are some more resources: