Deadline: November 28th, 9am
In this task, you will implement a simple NMT model with attention for a language
pair of your choice.
We will follow the corresponding
TF Tutorial on NMT.
Please do not just reuse the exemplary English-Spanish data; picking a different
pair reduces the temptation of simply copying the tutorial.
You can find data sets here. We recommend
picking a language pair where you understand both languages (so if you do speak
Spanish… feel free to use it ;)).
This makes it easier (and more fun) for you to evaluate the results.
However, keep in mind that some language pairs come with a very large number of
examples, whereas others have only very few; this will affect the training
process and the quality of the trained models.
You may run into issues with the code in two places: the load_data function
might crash. It expects each line to split into a pair of sentences, but there
may be a third element containing attribution information for the example (at
least if you download a different dataset from the link above). If this happens,
you can use line.split('\t')[:-1] in the function to exclude the extra element.
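As a rough sketch, the patched function could look like this (the exact shape of load_data depends on the tutorial version you follow, and the column order below assumes Anki-style files with the English sentence first):

    import pathlib

    def load_data(path):
        text = pathlib.Path(path).read_text(encoding='utf-8')
        lines = text.splitlines()
        # Each line: target sentence \t source sentence \t attribution.
        # Dropping the last element keeps only the sentence pair; the [:-1]
        # slice assumes the attribution column is always present.
        pairs = [line.split('\t')[:-1] for line in lines]
        target = [t for t, s in pairs]
        source = [s for t, s in pairs]
        return target, source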
Tasks:
1) Train the model using AdditiveAttention.
2) Train the model using Attention (see the sketch below).
3) Compare the attention weight plots for some examples between the two
attention mechanisms.
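For tasks 1) and 2): assuming AdditiveAttention and Attention refer to the Keras layers tf.keras.layers.AdditiveAttention (Bahdanau-style) and tf.keras.layers.Attention (Luong-style dot-product), a minimal sketch of a decoder-side wrapper that lets you swap the two mechanisms might look as follows (the class and argument names here are illustrative, not from the tutorial):

    import tensorflow as tf

    class SwappableAttention(tf.keras.layers.Layer):
        """Illustrative wrapper making the two mechanisms interchangeable."""

        def __init__(self, use_additive=True):
            super().__init__()
            # Both layers share the same call signature, so swapping them is
            # a one-line change. Note that both layers assume query and value
            # have the same last dimension.
            if use_additive:
                self.attention = tf.keras.layers.AdditiveAttention()  # Bahdanau-style
            else:
                self.attention = tf.keras.layers.Attention()          # Luong-style

        def call(self, query, value):
            # query: decoder states (batch, t_dec, units)
            # value: encoder outputs (batch, t_enc, units)
            context, scores = self.attention(
                [query, value], return_attention_scores=True)
            return context, scores

Keeping the attention scores around (return_attention_scores=True) is what gives you the weight matrices to visualize in task 3).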
We recommend adding vmax=1.0 when creating the plot in the plot_attention
function, i.e. ax.matshow(attention, cmap='viridis', vmax=1.0), so that the
same colors correspond to the same attention values across different plots.
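For reference, a sketch of plot_attention with that change applied (this follows the tutorial's function closely, but double-check it against the version in your notebook):

    import matplotlib.pyplot as plt
    from matplotlib import ticker

    def plot_attention(attention, sentence, predicted_sentence):
        fig = plt.figure(figsize=(10, 10))
        ax = fig.add_subplot(1, 1, 1)
        # vmax pins the top of the color scale, so e.g. a weight of 0.5
        # is drawn in the same color in every plot you generate.
        ax.matshow(attention, cmap='viridis', vmax=1.0)
        fontdict = {'fontsize': 14}
        ax.set_xticklabels([''] + sentence, fontdict=fontdict, rotation=90)
        ax.set_yticklabels([''] + predicted_sentence, fontdict=fontdict)
        ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
        ax.yaxis.set_major_locator(ticker.MultipleLocator(1))
        plt.show()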
Here are a few questions for you to check how well you understood the tutorial.
Please answer them (briefly) in your solution!
… die and to the English word die?
Hand in all of your code, i.e. the working tutorial code along with all changes/additions you made. Include outputs that document some of your experiments. Also remember to answer the questions above! Of course, you can also write about other observations you made.