MPCS 53120 Applied Data Analysis (Winter 2023)

Section 1
Instructor(s) Singh Sahota, Davender (dsahota)
Location RY 277
Meeting Times Wednesday 5:30pm - 8:30pm
Fulfills Elective Specialization - High Performance Computing (HPC-2)

Syllabus

This course provides a self-contained introduction to computational data analysis from an applied perspective. It is intended as a standalone course for students who are not pursuing the full data analysis sequence in the MPCS. Accordingly, students who have taken MPCS 53110 Foundations of Computational Data Analysis and received a grade of B or higher should instead take MPCS 53111 Machine Learning. Students who have taken MPCS 53111 Machine Learning cannot register for this class.


The course will cover topics in basic probability theory, statistical inference, and the basic machine learning models typically used in data analysis. Each topic will be accompanied by example illustrations using computational packages and software. Many of the topics covered form the basis of almost all algorithms and machine learning methods used in big data analysis. Emphasis will be placed on using these techniques for problem solving.

Textbook: An Introduction to Statistical Learning

Tentative List of Topics:

Elementary Probability and Statistics

  • Probability theory
  • Random variables
  • Distributions and densities

Software Platforms

  • Variables, objects, and functions in Python
  • Working with data frames
  • Data pre-processing and visualization

Statistical Inference and Learning

  • Bias-variance tradeoff
  • k-Nearest neighbors
  • Hypothesis testing

Linear Models

  • Least-squares regression
  • Logistic regression

Model Assessment and Selection

  • Feature Selection
  • Regularization

Machine Learning Models

  • Perceptron classifier
  • Neural networks
  • Decision trees/Random forests
  • Support vector machines

Clustering 

  • Unsupervised clustering
  • Supervised clustering

Recommender Systems

Introduction to Deep Learning

  • Computer Vision
  • Natural Language Processing

Course Prerequisites

MPCS 50103 Discrete Mathematics and Core Programming

Other Prerequisites

Knowledge of Python is required for this class. This course requires competency in Unix and Linux. If you attended the MPCS Unix Bootcamp you covered the required material. If you did not, please review the UChicago CS Student Resource Guide here: https://uchicago-cs.github.io/student-resource-guide/.

Overlapping Classes

This class is scheduled at a time that conflicts with these other classes:

  • MPCS 55001-2 -- Algorithms
  • MPCS 52011-1 -- Introduction to Computer Systems
  • MPCS 53110-1 -- Foundations of Computational Data Analysis
  • MPCS 51250-1 -- Entrepreneurship in Technology
  • MPCS 51230-2 -- User Interface and User Experience Design

Eligible Programs

  • Masters Program in Computer Science
  • Bx/MS in Computer Science (Option 2: Professionally-oriented - CS Majors)
  • Bx/MS in Computer Science (Option 3: Professionally-oriented - Non-CS Majors)