Reading Assignment 7: Autoregressive Models & LLMs

Autoregressive Models

Chapter 22 of Probabilistic Machine Learning: Advanced Topics gives a good overview of autoregressive models. Since these models are conceptually straightforward, there is not much deep theory to cover here.

(The Bishop Book covers the topic only briefly, in Section 12.2.4.)
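The core idea behind all of these models is the chain-rule factorization p(x_1, ..., x_T) = ∏_t p(x_t | x_1, ..., x_{t-1}): generate one token at a time, each conditioned on the prefix so far. The sketch below illustrates this with a toy bigram model (conditioning on only the single previous token); the tiny corpus and function names are illustrative, not taken from the readings.

```python
import random

def fit_bigram(corpus):
    """Count next-token frequencies conditioned on the previous token.

    This is the simplest possible autoregressive model: the conditional
    p(x_t | x_{t-1}) is estimated from raw bigram counts.
    """
    counts = {}
    for seq in corpus:
        for prev, nxt in zip(seq, seq[1:]):
            counts.setdefault(prev, {}).setdefault(nxt, 0)
            counts[prev][nxt] += 1
    return counts

def sample(counts, start, length, rng):
    """Generate tokens one at a time, each conditioned on the last token."""
    seq = [start]
    for _ in range(length - 1):
        dist = counts.get(seq[-1])
        if not dist:  # no continuation observed for this token
            break
        tokens, freqs = zip(*dist.items())
        seq.append(rng.choices(tokens, weights=freqs)[0])
    return seq

# Toy corpus of character sequences; real LLMs do exactly this loop,
# but with a neural network in place of the count table.
model = fit_bigram([list("abab"), list("abba")])
print(sample(model, "a", 5, random.Random(0)))
```

Swapping the count table for a transformer that maps the whole prefix to a distribution over the next token gives, in essence, the models discussed below.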

Optional Reading: Classic Success Stories

Consider these optional readings if you want more concrete examples. Some of them may be treated in more detail later in the course.

Large Language Models

Many language models are being developed by large companies and research groups, and most of them work similarly. We will look at only a select few. As before, this is a lot to read, so the optional papers below are included mainly as a reference.

Section 12.3.5 of the Bishop Book provides a high-level overview.

For more details, refer to the (very long) overview paper A Survey of Large Language Models.

Finally, Scaling Laws for Neural Language Models is a very important piece of research that provides the justification for this research agenda.
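The key empirical finding of that paper is that language-model loss follows a power law in model size, roughly L(N) = (N_c / N)^α, which becomes a straight line in log-log space and can therefore be fit with ordinary linear regression. The sketch below generates data from such a power law and recovers its parameters; the constants are approximately the values reported in the paper for the model-size law, but treat the whole snippet as illustrative.

```python
import math

def power_law_loss(n, n_c=8.8e13, alpha=0.076):
    """Loss as a power law in parameter count N (constants approximate
    those reported in the scaling-laws paper; illustrative only)."""
    return (n_c / n) ** alpha

def fit_power_law(sizes, losses):
    """Recover alpha and N_c by linear regression in log-log space:
    log L = alpha * (log N_c - log N), a line in log N with slope -alpha."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    alpha = -slope
    log_nc = my / alpha + mx  # from my = alpha * (log_nc - mx)
    return alpha, math.exp(log_nc)

sizes = [1e6, 1e7, 1e8, 1e9]
losses = [power_law_loss(n) for n in sizes]
alpha, n_c = fit_power_law(sizes, losses)
print(alpha, n_c)
```

Because the synthetic data lies exactly on the power law, the fit recovers the parameters essentially exactly; on real training runs there is noise, but the same log-log regression is how such laws are estimated.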

Optional: Model Examples

GPT series of papers:

As these are all developed by OpenAI, you could also check out PaLM by Google.

Reinforcement Learning from Human Feedback

An important technique for fine-tuning LLMs, especially for aligning them with human preferences in interactive use.
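One core piece of the RLHF pipeline is training a reward model from human preference pairs (a "chosen" and a "rejected" response to the same prompt) with a Bradley-Terry style loss, -log σ(r(chosen) - r(rejected)). The sketch below computes this loss on scalar stand-ins for a real reward model's outputs; it is a minimal illustration of the objective, not a full RLHF implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry preference loss: small when the reward model
    scores the human-preferred response higher than the rejected one."""
    return -math.log(sigmoid(reward_chosen - reward_rejected))

# Reward model agrees with the human preference: low loss.
print(preference_loss(2.0, 0.5))
# Reward model violates the preference: higher loss.
print(preference_loss(0.5, 2.0))
```

The trained reward model is then used as the optimization target for the policy (the LLM itself), typically with a reinforcement-learning algorithm such as PPO plus a penalty that keeps the policy close to the pretrained model.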