Assignment 8: Word2Vec

Deadline: December 13th, 9am

This week, we will look at “the” classic model for learning word embeddings. This will be another tutorial-based assignment; find the link here.

The key points are:

Questions for Understanding

As in the last assignment, answer these questions in your submission to make sure you understand what is happening in the tutorial code!

Possible Improvements & Extensions

Optional: CBOW Model

The tutorial only covers the Skip-gram model; however, the same paper also proposed the (perhaps more intuitive) Continuous Bag-of-Words (CBOW) model. Here, instead of predicting the context from the center word, it is the other way around: the center word is predicted from its context. If you are looking for more of a challenge and want to implement a model by yourself, the changes should roughly be as follows (see the sketch after this list):

- The input is the whole context window rather than a single center word.
- The context word embeddings are combined into a single vector, e.g. by averaging them.
- The model then predicts the center word from this combined context vector.
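
To make the swap concrete, here is a minimal PyTorch-style sketch of such a CBOW model. It assumes you follow the tutorial in PyTorch; the names (`CBOW`, `vocab_size`, `embed_dim`) are illustrative, not taken from the tutorial code.

```python
import torch
import torch.nn as nn

class CBOW(nn.Module):
    """Minimal CBOW sketch: average the context embeddings,
    then score a candidate center word (illustrative names throughout)."""

    def __init__(self, vocab_size: int, embed_dim: int):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, embed_dim)   # context ("input") vectors
        self.out_embed = nn.Embedding(vocab_size, embed_dim)  # center ("output") vectors

    def forward(self, context: torch.Tensor, center: torch.Tensor) -> torch.Tensor:
        # context: (batch, window_size) indices of the surrounding words
        # center:  (batch,) index of the word to predict
        ctx = self.in_embed(context).mean(dim=1)  # (batch, embed_dim), averaged context
        ctr = self.out_embed(center)              # (batch, embed_dim)
        # Score = dot product between averaged context and candidate center word
        return (ctx * ctr).sum(dim=1)             # (batch,) logits
```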

The rest stays pretty much the same. You will still need to generate negative examples through sampling, since computing the full softmax over the vocabulary is just as inefficient as for the Skip-gram model.
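
For reference, here is one way the negative-sampling objective could look for the CBOW sketch above. The original word2vec paper draws negatives from the unigram distribution raised to the power 3/4; the function and argument names (`negative_sampling_loss`, `word_freqs`, `k`) are assumptions for this sketch, not the tutorial's exact code.

```python
import torch
import torch.nn.functional as F

def negative_sampling_loss(model, context, center, word_freqs, k=5):
    """Sketch of negative sampling for the CBOW model above.
    `word_freqs` is a (vocab_size,) tensor of raw word counts (assumed)."""
    # Sample k negatives per example from the unigram distribution ^ 0.75,
    # as suggested in the original word2vec paper.
    sampling_dist = word_freqs.float() ** 0.75
    sampling_dist /= sampling_dist.sum()
    batch_size = center.size(0)
    negatives = torch.multinomial(sampling_dist, batch_size * k, replacement=True)
    negatives = negatives.view(batch_size, k)

    # Positive pairs should score high (label 1) ...
    pos_logits = model(context, center)  # (batch,)
    pos_loss = F.binary_cross_entropy_with_logits(
        pos_logits, torch.ones_like(pos_logits))

    # ... and sampled negatives should score low (label 0).
    neg_logits = torch.stack(
        [model(context, negatives[:, i]) for i in range(k)], dim=1)  # (batch, k)
    neg_loss = F.binary_cross_entropy_with_logits(
        neg_logits, torch.zeros_like(neg_logits))
    return pos_loss + neg_loss
```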

Compare the results of the CBOW model with those of the Skip-gram one!
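
One simple qualitative comparison is to look at the nearest neighbors of the same query words in both embedding spaces. A minimal sketch, assuming you keep the learned embedding matrix and vocabulary mappings around (all names here are illustrative):

```python
import torch
import torch.nn.functional as F

def nearest_neighbors(embed_matrix, word_to_idx, idx_to_word, query, k=5):
    """Top-k cosine-similarity neighbors of `query` in one embedding space.
    `word_to_idx` / `idx_to_word` are assumed vocabulary mappings."""
    vectors = F.normalize(embed_matrix, dim=1)      # unit-length rows
    query_vec = vectors[word_to_idx[query]]
    sims = vectors @ query_vec                      # cosine similarity to all words
    top = sims.topk(k + 1).indices.tolist()[1:]     # skip the query word itself
    return [idx_to_word[i] for i in top]

# e.g. call this on cbow_model.in_embed.weight.detach() and on the Skip-gram
# model's embedding matrix with the same query words, and compare the lists.
```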