Lecture 1
Introduction

Lecture 2
Machine Translation

Lecture 3
Conditional Language Models

Lecture 4
Feedforward Language Models
• Section 5 of Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, Neubig.

Lecture 5
Recurrent Neural Networks (RNN)
• Section 6 of Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, Neubig.
• Backpropagation through Time, Jiang Guo. (See the forward-pass sketch below.)
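
To make the reading concrete, here is a minimal vanilla-RNN forward pass in numpy (the function name, toy sizes, and random weights are illustrative assumptions, not from the readings); backpropagation through time is ordinary backpropagation applied through this unrolled loop:

```python
import numpy as np

def rnn_forward(x_seq, h0, Wxh, Whh, bh):
    """Vanilla RNN: h_t = tanh(Wxh @ x_t + Whh @ h_{t-1} + bh).
    BPTT differentiates a loss through this entire unrolled chain of states."""
    h, states = h0, []
    for x in x_seq:                      # one step per input token
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        states.append(h)
    return states

rng = np.random.default_rng(0)
d_in, d_h, T = 3, 5, 4                   # toy sizes: input dim, hidden dim, steps
states = rnn_forward(rng.normal(size=(T, d_in)), np.zeros(d_h),
                     rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h)),
                     np.zeros(d_h))
print(len(states), states[-1].shape)     # 4 hidden states, each of shape (5,)
```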
|
|
Lecture 6
Modelling Data and Words
• Neural Machine Translation of Rare Words with Subword Units (BPE), Sennrich et al. (2016). (See the merge-loop sketch below.)
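
A minimal sketch of the BPE merge-learning loop from the Sennrich et al. paper, using the paper's running-example words (the frequencies and merge count are illustrative, and the paper's end-of-word marker is omitted for brevity):

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn BPE merges from a {word: frequency} dictionary."""
    vocab = {tuple(word): freq for word, freq in corpus.items()}  # start from chars
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():      # count adjacent symbol pairs
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)         # most frequent pair wins
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():      # apply the merge everywhere
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges

print(learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 5))
```
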
Lecture 7
Sequence-to-sequence Models with Attention
• Sections 7 through 9 of Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, Neubig.

Lecture 8
Transformers
• Attention Is All You Need, Vaswani et al. (2017)
• Transformers from Scratch (see the attention sketch below)
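
The paper's core operation, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, in a few lines of numpy. This is a single-head sketch with toy shapes; real Transformers add masking, multi-head splitting, and learned projections:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n_q, n_k) similarity logits
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # row-wise softmax
    return w @ V                                     # weighted sum of value vectors

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=s) for s in [(2, 4), (3, 4), (3, 4)])
print(attention(Q, K, V).shape)                      # -> (2, 4)
```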
|
|
Lecture 9
Word Embeddings
• Efficient Estimation of Word Representations in Vector Space, Mikolov et al. (2013)
• Contextual Word Representations: A Contextual Introduction, Smith (2019)
• Chapter 6 of Speech and Language Processing.

Lecture 10
Pretrained Language Models
• BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., NAACL 2019.

Lecture 11
Prompting with LLMs
• Sections 1-4 of Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing, Liu et al. (2021)

Lecture 12
Decoding with LLMs
• The Curious Case of Neural Text Degeneration, Holtzman et al., 2020 (see the nucleus-sampling sketch below)
• Locally Typical Sampling, Meister et al., 2022
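
A minimal sketch of top-p (nucleus) sampling as described by Holtzman et al.; the toy distribution below stands in for a model's next-token softmax output:

```python
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    """Top-p sampling: draw only from the smallest set of tokens whose
    cumulative probability reaches p, with probabilities renormalized."""
    if rng is None:
        rng = np.random.default_rng()
    order = np.argsort(probs)[::-1]                           # most probable first
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1  # nucleus size
    nucleus = order[:cutoff]
    return rng.choice(nucleus, p=probs[nucleus] / probs[nucleus].sum())

toy_probs = np.array([0.45, 0.30, 0.15, 0.06, 0.04])  # stand-in softmax output
print(nucleus_sample(toy_probs, p=0.9))               # only tokens 0-2 can be drawn
```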
|
|
Lecture 13
Neural Parsing
• Grammar as a Foreign Language, Vinyals et al., NeurIPS 2015. This is the encoder-decoder parsing model introduced in the lecture.

Lecture 14
Scaling laws for LLMs
• Scaling Laws for Neural Language Models, Kaplan et al. 2020
• Training Compute-Optimal Large Language Models, Hoffmann et al. 2022 (Chinchilla scaling laws; see the sketch below)
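
A back-of-the-envelope sketch of the Chinchilla recipe, assuming the standard cost model C ≈ 6ND training FLOPs and the paper's rough rule of about 20 training tokens per parameter (these constants are approximations for illustration, not the paper's fitted loss surface):

```python
def chinchilla_optimal(C, tokens_per_param=20):
    """Compute-optimal parameter count N and token count D under
    C = 6 * N * D and D = tokens_per_param * N (solve for N)."""
    N = (C / (6 * tokens_per_param)) ** 0.5
    D = tokens_per_param * N
    return N, D

N, D = chinchilla_optimal(C=5.76e23)   # roughly Chinchilla's training budget
print(f"N ~ {N:.1e} params, D ~ {D:.1e} tokens")
# -> about 70B parameters and 1.4T tokens, in line with the Chinchilla model
```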
|
|
Lecture 15
Safety and security with LLMs
• A Watermark for Large Language Models, Kirchenbauer et al., 2023 (sections 1-3; see the detection sketch below)
• Universal and Transferable Adversarial Attacks on Aligned Language Models, Zou et al., 2023 (sections 1-2)
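
A simplified sketch of the watermark detector from sections 1-3 of Kirchenbauer et al.: the previous token seeds a pseudorandom "green list" partition of the vocabulary, and a one-proportion z-test asks whether green tokens are over-represented. The hash function, gamma, and vocabulary size here are illustrative assumptions:

```python
import hashlib
import numpy as np

def green_list(prev_token, vocab_size, gamma=0.5):
    """Pseudorandomly select a gamma-fraction of the vocabulary,
    seeded deterministically by the previous token."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % 2**32
    rng = np.random.default_rng(seed)
    return set(rng.permutation(vocab_size)[: int(gamma * vocab_size)])

def detect(tokens, vocab_size, gamma=0.5):
    """z-score of green-list hits vs. the chance rate gamma."""
    hits = sum(t in green_list(prev, vocab_size, gamma)
               for prev, t in zip(tokens, tokens[1:]))
    T = len(tokens) - 1
    return (hits - gamma * T) / (gamma * (1 - gamma) * T) ** 0.5

# During generation the scheme adds a bias delta to green-list logits, so
# watermarked text scores a large z; unwatermarked text stays near z = 0.
rng = np.random.default_rng(1)
print(detect(list(rng.integers(0, 1000, size=200)), vocab_size=1000))
```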
|
|
Lecture 16
Evaluating Translation and Generation
• BLEU: a Method for Automatic Evaluation of Machine Translation, Papineni et al. (2002) (see the toy implementation below)
• COMET: A Neural Framework for MT Evaluation, Rei et al. (2020)
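
A toy sentence-level BLEU to accompany the Papineni et al. paper. Real BLEU is corpus-level, supports multiple references, and is usually smoothed; this unsmoothed sketch returns 0 whenever any n-gram precision is 0:

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Geometric mean of clipped n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(zip(*[candidate[i:] for i in range(n)]))
        ref = Counter(zip(*[reference[i:] for i in range(n)]))
        clipped = sum(min(c, ref[g]) for g, c in cand.items())  # clip by ref counts
        precisions.append(clipped / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))  # brevity penalty
    return bp * math.exp(log_mean)

print(bleu("the cat sat on the mat".split(),
           "the cat sat on a mat".split()))  # -> ~0.54
```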
|
|
Lecture 17
Machine Translation and Multilingual Data
• Multilingual Denoising Pre-training for Neural Machine Translation, Liu et al. (2020)

Lecture 18
Question Answering
• Speech and Language Processing Ed. 3, Ch. 14 on QA 🙂
• SQuAD: 100,000+ Questions for Machine Comprehension of Text, https://arxiv.org/abs/1606.05250

Lecture 19
Ethics in NLP
• The Social Impact of Natural Language Processing, Hovy and Spruit (2016)

Lecture 20
Bias in Embeddings and Language Models
• Semantics derived automatically from language corpora contain human-like biases, Caliskan et al. 2017

Lecture 21
LLM Alignment and Evaluation
• Language Models are Few-Shot Learners, https://arxiv.org/abs/2005.14165