Natural Language Processing

Title Natural Language Processing (53113)
Quarter Summer 2020
Instructor Amitabh Chaudhary (amitabh@cs.uchicago.edu)

Syllabus
 
Natural language processing (NLP) is the application of computational techniques, particularly from machine learning, to analyze and synthesize human language. The recent explosion in the amount of available text data has made natural language processing invaluable to businesses, the social sciences, and even the natural sciences.
 
In this course we study the fundamentals of modern natural language processing, emphasizing models based on deep learning. Topics include language models, word embeddings, recurrent neural networks (simple RNNs, LSTMs), hidden Markov models, context-free grammars and syntactic parsing, dependency parsing, and attention-based models such as the transformer and BERT.
 
We use Python and Python-based libraries such as PyTorch, NLTK, and spaCy to implement algorithms and process text.
 
A significant component of the course is the project, in which students apply NLP techniques to solve a real-world problem.
 

Topics
A tentative list of topics follows.
  • Language Models
  • Embeddings
  • Recurrent Neural Networks (RNNs), LSTMs for NLP
  • Conditioned Generation, Sequence to Sequence Models
  • Convolutional Neural Networks for NLP
  • Attention-Based Models, The Transformer
  • Hidden Markov Models
  • Syntactic Parsing, Context-Free Grammars
  • Semantic Parsing
  • BERT 
Coursework and Evaluation
  • Assignments: There are three assignments, one approximately every two weeks. They are designed to reinforce the material and test a deeper understanding of the concepts and algorithms through theoretical questions, program implementation, and analysis of empirical results. Worth 40% of the grade.
  • Midterm Examination: Worth 30% of the grade.
  • Course Project: Worth 30% of the grade.
Textbook
  • Neural Network Methods for Natural Language Processing by Yoav Goldberg (https://doi.org/10.2200/S00762ED1V01Y201703HLT037) 
Prerequisites (Courses)

A grade of B+ or better in the following courses:
• MPCS 50103 Math for Computer Science (or placement exam waiver)
• MPCS 51042 Python Programming

A grade of B or better in MPCS 55001 Algorithms

A grade of B or better in one of the following courses:
• MPCS 53110 Foundations of Computational Data Analysis
• MPCS 53120 Applied Data Analysis

MPCS 53111 Machine Learning (recommended; see below)
Equivalent courses or experience will be accepted with instructor permission. A prior course in machine learning is useful but not required; if you have not taken one, please contact the instructor with your prior courses and experience.

Prerequisites (Other)

Programming experience in Python.

Satisfies

Elective
DA-2 Specialization Requirement (https://masters.cs.uchicago.edu/page/data-analytics)

Time

Tuesday 5:30-8:30 PM

Location

TBD