NLP @ NUP (Spring 2024)

This intensive course introduces the foundational methods, tools, and building blocks used in modern natural language processing (NLP) applications.

  • Instructor: Dr. Dmitry Ustalov
  • Time: Thursdays at 6:30pm EET/EEST (aka 5:30pm CET/CEST)
  • Location: Online (Google Meet)

This course is organized in partnership between Neapolis University Pafos and JetBrains.

Neapolis University Pafos
JetBrains Academy

Topics

N-Grams
History of Field. Text Processing. Language Models and Resources. N-Grams and Smoothing. Perplexity.
Information Retrieval
Search Problem. Inverted Index. Vector Space Model. Boolean Retrieval. Ranked Retrieval. Learning-to-Rank. TREC.
Evaluation
Problem of Benchmarking. Human and Model-Based Evaluation. Statistical Analysis and Testing. Label Reliability. Ablations. Red Teaming.
Latent Representations
Distributional Semantics. Pointwise Mutual Information. Latent Semantic Analysis. Word Embeddings. Similarity, Analogies, and Lexical Semantics. Vector Search.
Transformer
Attention. Transformer. BERT and RoBERTa. GPT-1 and GPT-2. Not Transformer.
Large Language Models (LLMs)
Pre-Training, Fine-Tuning, Alignment. Low-Rank Adaptation and Quantization. Prompting. Retrieval Augmented Generation (RAG). Leaderboards.

Classes

#  Topic                    Lecture Date
1  N-Grams                  2024-03-21
2  Information Retrieval    2024-03-28
3  Evaluation               2024-04-04
4  Latent Representations   2024-04-11
5  Transformer              2024-04-18
6  Large Language Models    2024-04-25

No commercial use allowed. Please acknowledge this page for all other uses.

Assignments

#  Topic               Seminar Date  Deadline
1  Search Engine       2024-04-04    2024-04-25
2  Question Answering  2024-04-25    2024-05-16

Assignments are available only to enrolled students. Solutions must be submitted to Kaggle by the end of the deadline day in the Anywhere on Earth (AoE) time zone. Please grant the course staff read access to the notebooks containing your solutions.

Grading

  • The course contains two assignments, and you must complete both to pass
  • Assignments are graded automatically using the Kaggle leaderboard
  • Your solutions must score higher than the baseline scores set by course staff
  • You may use large language models (LLMs) to complete the assignments, but you are expected to be able to explain every single line of your code

Resources