MPCS 53120 Applied Data Analysis (Spring 2017)

Section 1
Instructor(s) Mayampurath, Anoop (anmayampurath)
Location Ryerson 276
Meeting Times Tuesday 5:30pm - 8:30pm
Fulfills Elective Specialization - High Performance Computing (HPC-2)

Syllabus

This course provides a self-contained introduction to computational data analysis from an applied perspective. It is intended as a standalone course for students who do not want to pursue the full data analysis sequence in the MPCS. As such, students who have taken or are taking MPCS 53111 Machine Learning cannot register for this class. Students who have taken MPCS 53110 Foundations of Computational Data Analysis must obtain MPCS administration approval before registering for this class.  

The course will cover topics in basic probability theory, statistical inference, and basic machine learning models typically used in data analysis. Each topic will be accompanied by example illustrations using computational packages and software. Many of the topics covered form the basis of almost all algorithms and machine learning methods used in big data analysis. Emphasis will be placed on using these techniques for problem solving. All work will be done in R (https://www.r-project.org/about.html).


*Please note: This is an initial draft of this course. Information is subject to change.*

Week 1: Elementary Probability and Statistics

  • Course overview
  • Probability theory
  • Random variables
  • Distributions and densities
Week 2: Software Platforms
  • Variables, objects, and functions in R
  • Working with data frames
  • Data pre-processing and visualization
Week 3: Linear Models/Statistical Inference
  • Least-squares regression
  • Logistic regression
  • Hypothesis testing
Week 4: Model Assessment and Selection

Week 5: Machine Learning Models I
  • Perceptron classifier
  • Neural networks
  • Decision trees/Random forests
Week 6: Midterm

Week 7: Clustering 
  • Unsupervised clustering
  • Supervised clustering
Week 8: Machine Learning Models II
  • Support vector machines
Week 9: Computational Frameworks
  • Common machine learning frameworks
  • Big data analytics
Week 10: Project presentations

 

Course Prerequisites

MPCS 50103 (Discrete Mathematics) and Core Programming

Other Prerequisites

Overlapping Classes

This class is scheduled at a time that conflicts with these other classes:

  • MPCS 55001-1 -- Algorithms