Lecture 1
Introduction

Lecture 2
Machine Translation

Lecture 3
Conditional Language Models

Lecture 4
Feedforward Language Models
• Section 5 of Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, Neubig.

Lecture 5
Recurrent Neural Networks (RNN) |
• Section 6 of Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, Neubig.
• Backpropagation through Time, Jiang Guo. |
|
Lecture 6 |
Modelling Data and Words |
• Neural Machine Translation of Rare Words with Subword Units (BPE), Sennrich et al. (2016)

Lecture 7
Sequence-to-sequence Models with Attention |
• Sections 7 through 9 of Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, Neubig.

Lecture 8
Transformers |
• Attention Is All You Need, Vaswani et al. (2017)
• Transformers from Scratch |
|
Lecture 9 |
Word Embeddings |
• Efficient Estimation of Word Representations in Vector Space, Mikolov et al. (2013)
• Contextual Word Representations: A Contextual Introduction, Noah A. Smith (2019)
• Chapter 6 of Speech and Language Processing.
|
Lecture 10 |
Pretrained Language Models |
• BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., NAACL 2019.

Lecture 11
Prompting with LLMs |
• Sections 1-4 of Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing, Liu et al. (2021)

Lecture 12
Decoding with LLMs |
• The Curious Case of Neural Text Degeneration, Holtzman et al. (2020)
• Locally Typical Sampling, Meister et al. (2022)
|
Lecture 13 |
Neural Parsing |
• Grammar as a Foreign Language, Vinyals et al., NeurIPS 2015. This is the encoder-decoder parsing model introduced in the lecture.

Lecture 14
Scaling Laws for LLMs
• Scaling Laws for Neural Language Models, Kaplan et al. (2020)
• Training Compute-Optimal Large Language Models, Hoffmann et al. (2022) (Chinchilla scaling laws)
|
Lecture 15 |
Safety and Security with LLMs
• A Watermark for Large Language Models, Kirchenbauer et al. (2023), sections 1-3
• Universal and Transferable Adversarial Attacks on Aligned Language Models, Zou et al. (2023), sections 1-2
|
Lecture 16 |
Evaluating Translation and Generation |
• Bleu: a method for automatic evaluation of machine translation, Papineni et al. (2002)
• COMET: A neural framework for MT evaluation, Rei et al. (2020) |
|
Lecture 17 |
Machine Translation and Multilingual Data
• Multilingual Denoising Pre-training for Neural Machine Translation, Liu et al. (2020)

Lecture 18
Question Answering |
• Chapter 14 of Speech and Language Processing, 3rd ed. (question answering)
• SQuAD: 100,000+ Questions for Machine Comprehension of Text, Rajpurkar et al. (2016), https://arxiv.org/abs/1606.05250
|
Lecture 19 |
Ethics in NLP |
• The Social Impact of Natural Language Processing, Hovy and Spruit (2016)

Lecture 20
Bias in Embeddings and Language Models |
• Semantics derived automatically from language corpora contain human-like biases, Caliskan et al. (2017)

Lecture 21
LLM Alignment and Evaluation |
• Language Models are Few-Shot Learners, Brown et al. (2020), https://arxiv.org/abs/2005.14165