CS 5134/6034: Natural Language Processing
University of Cincinnati
Fall 2024
TA: Saptarshi Ghosh (ghosh2si at mail.uc.edu)
Time: MWF 9:00 - 9:55 am
Location: RECCENTR 3230
Office Hours:
Tianyu Jiang, Mon 10-11 am, Rhodes Hall 889
Saptarshi Ghosh, Tue 11 am-12 pm, Rhodes Hall 850E (within CEAS library)
Course Description
This course provides a basic introduction to natural language processing (NLP). We will learn the fundamentals of different subfields within NLP and study theoretical concepts and algorithms for various NLP problems. Topics covered include text classification, language modeling, word embeddings, sequence tagging, syntactic parsing, semantic parsing, question answering, and others. By the end of this course, you will have a good understanding of the research questions and methods in different areas of NLP, and the skills to build NLP tools for new problems.
Grading
Assignments (3): 30%
Midterm Exam (in-class): 30%
Project: 40%
Bonus: 5% (class attendance)
Late Policy: 24-hour grace period with a 10% penalty. No credit for submissions more than 24 hours late.
Regrading Policy: Regrade requests must be made within two weeks of the score being posted on Canvas.
Electronic Submission: All assignments and project reports need to be submitted electronically via Canvas.
Prerequisites
This course assumes a solid background in basic probability, statistics, and linear algebra, as well as good programming skills in Python 3. Prior knowledge of machine learning is helpful but not required. The class is mainly for advanced undergraduates and graduate students in computer science, but we welcome other interested students with the necessary background and programming skills.
Textbook
Dan Jurafsky and James Martin. Speech and Language Processing, 3rd Edition (Aug 20, 2024 draft).
Schedule
| Week | Date | Topic | Reading | Assignment |
| 1 | 08/26 | Welcome | | |
| | 08/28 | Introduction to NLP | Ch. 1 & 2 | |
| | 08/30 | Morphology | Ch. 2 | |
| 2 | 09/02 | NO CLASS (Labor Day) | | a1 out |
| | 09/04 | N-gram Language Models | Ch. 3 | |
| | 09/06 | N-gram Language Models contd. | | |
| 3 | 09/09 | Naive Bayes | Ch. 4 | |
| | 09/11 | Logistic Regression | Ch. 5 | |
| | 09/13 | Logistic Regression contd. | | |
| 4 | 09/16 | Part-of-Speech Tagging | Ch. 17 | a1 due, a2 out |
| | 09/18 | HMM | | |
| | 09/20 | Viterbi | | |
| 5 | 09/23 | Sequence Labeling | Ch. 17 | |
| | 09/25 | Sequence Labeling contd. | | |
| | 09/27 | Lexical Semantics | Ch. 6 & Appendix G | |
| 6 | 09/30 | Distributional Representations | Ch. 6 | a2 due |
| | 10/02 | Word Embeddings | Ch. 6 | |
| | 10/04 | Neural Networks for NLP | Ch. 7 | |
| 7 | 10/07 | Recurrent Neural Network | Ch. 8 | proposal due, a3 out |
| | 10/09 | Recurrent Neural Network contd. | Ch. 8 | |
| | 10/11 | Reading Day | | |
| 8 | 10/14 | Machine Translation | Ch. 13 | |
| | 10/16 | Seq2Seq | Ch. 13 | |
| | 10/18 | NO CLASS | | |
| 9 | 10/21 | Midterm Exam | | a3 due |
| | 10/23 | Attention | Ch. 9 | |
| | 10/25 | Attention contd. | | |
| 10 | 10/28 | Transformers | Ch. 9 | |
| | 10/30 | Transformers contd. | | |
| | 11/01 | Pre-train and Fine-tune | Ch. 10 & 11 | |
| 11 | 11/04 | Pre-train and Fine-tune contd. | | intermediate report due |
| | 11/06 | Large Language Models | Ch. 10 & 11 | |
| | 11/08 | Prompting and In-Context Learning | Ch. 12 | |
| 12 | 11/11 | Veterans Day | | |
| | 11/13 | NO CLASS | | |
| | 11/15 | NO CLASS | | |
| 13 | 11/18 | Project Presentations | | |
| | 11/20 | Project Presentations | | |
| | 11/22 | Project Presentations | | |
| 14 | 11/25 | Project Presentations | | |
| | 11/27 | Project Presentations | | |
| | 11/29 | Thanksgiving | | |
| 15 | 12/02 | Project Presentations | | |
| | 12/04 | Project Presentations | | |
| | 12/06 | Project Presentations | | slides & final report due |
Project Resources