CS 5134/6034: Natural Language Processing
University of Cincinnati
Spring 2024
TA: Dylan Hutson (hutsondm at mail.uc.edu)
Time: Tues & Thur, 9:30-10:50 am
Location: RECCENTR 3250
Office Hours:
Tianyu Jiang, Tues 10:55-11:55 am, Rhodes Hall 889
Dylan Hutson, Wed 2:00-3:00 pm, Rhodes Hall 850E (inside the CEAS library)
Course Description
This course provides a basic introduction to natural language processing (NLP). We will learn the fundamentals of different subfields within NLP and study theoretical concepts and algorithms for various NLP problems. Topics include text classification, language modeling, word embeddings, sequence tagging, syntactic parsing, semantic parsing, question answering, and others. By the end of the course, you will have a good understanding of the research questions and methods in different areas of NLP and the skills to build NLP tools for new problems.
Grading
Assignments (4): 40%
Midterm Exam (in-class): 25%
Project: 35%
Bonus: 5% (class attendance)
Late Policy: 24-hour grace period with a 10% penalty. No points are awarded after 24 hours.
Regrading Policy: Regrade requests must be made within two weeks of the score being posted on Canvas.
Electronic Submission: All assignments and project reports need to be submitted electronically via Canvas.
Prerequisites
This course assumes a good background in basic probability, statistics, and linear algebra, as well as good programming skills in Python 3. Prior knowledge of machine learning is helpful but not required. The class is mainly for advanced undergraduates and graduate students in computer science, but we welcome other interested students with the necessary background and programming skills.
Textbook
Dan Jurafsky and James Martin. Speech and Language Processing, 3rd Edition (Feb 3, 2024 draft).
Schedule (Tentative)
| Week | Date | Topic | Reading | Assignment |
|------|-------|-------|---------|------------|
| 1 | 01/09 | Introduction to NLP | Ch. 1 & 2 | |
| | 01/11 | Morphology | Ch. 2 | |
| 2 | 01/16 | N-gram Language Models | Ch. 3 | a1 out |
| | 01/18 | N-gram contd. | Ch. 3 | |
| 3 | 01/23 | Naive Bayes | Ch. 4 | |
| | 01/25 | Logistic Regression | Ch. 5 | |
| 4 | 01/30 | Part-of-Speech Tagging | Ch. 8 | a1 due, a2 out |
| | 02/01 | HMM and Viterbi | Ch. 8 | |
| 5 | 02/06 | Sequence Labeling | Ch. 8 | project instructions out |
| | 02/08 | Lexical Semantics | Ch. 6 | |
| 6 | 02/13 | Distributional Representations | Ch. 6 | a2 due |
| | 02/15 | Word Embeddings | Ch. 6 | |
| 7 | 02/20 | Neural Networks for NLP | Ch. 7 | proposal due, a3 out |
| | 02/22 | Recurrent Neural Network | Ch. 9 | |
| 8 | 02/27 | Machine Translation & Seq2Seq | Ch. 13 | |
| | 02/29 | Attention | Ch. 9 & 10 | |
| 9 | 03/05 | NO CLASS | | a3 due |
| | 03/07 | Midterm Exam | | |
| 10 | 03/12 | Spring Break | | |
| | 03/14 | Spring Break | | |
| 11 | 03/19 | Post-exam Review | | intermediate report due, a4 out |
| | 03/21 | Transformers | Ch. 10 | |
| 12 | 03/26 | Pre-train and Fine-tune | Ch. 11 | |
| | 03/28 | Pre-train and Fine-tune contd. | Ch. 11 | |
| 13 | 04/02 | Large Language Models | | a4 due |
| | 04/04 | Project Presentations | | |
| 14 | 04/09 | Senior Design Day - NO CLASS | | |
| | 04/11 | Project Presentations | | |
| 15 | 04/16 | Project Presentations | | |
| | 04/18 | Project Presentations | | slides due |
| | 04/19 | Classes End Day | | final report due |
Project Resources