Section | 1 |
---|---|
Instructor(s) | Singh Sahota, Davender (dsahota) |
Location | Online Only |
Meeting Times | Thursday 5:30pm - 7:30pm |
Fulfills | Elective Specialization - High Performance Computing (HPC-2) |

***This course will be conducted remotely and will be online only for Spring 2021***

This course provides a self-contained introduction to computational data analysis from an applied perspective. It is intended as a standalone course for students who are not pursuing the full data analysis sequence in the MPCS. *As such, students who have taken MPCS 53110 Foundations of Computational Data Analysis and received a grade of B or higher should take MPCS 53111 Machine Learning. Students who have taken or are currently enrolled in MPCS 53111 Machine Learning cannot register for this class.*

The course will cover topics in basic probability theory, statistical inference, and basic machine learning models typically used in data analysis. Each topic will be accompanied by example illustrations using computational packages and software. Many of the topics covered form the basis of almost all algorithms and machine learning methods used in big data analysis. Emphasis will be placed on using these techniques for problem solving.

Textbook: An Introduction to Statistical Learning, 1st Edition - https://www.statlearning.com/

Week 1: Elementary Probability Statistics

- Course overview
- Probability theory
- Random variables
- Distributions and densities

- Variables, objects, and functions in Python
- Working with data frames
- Data pre-processing and visualization

- Least-squares regression
- Logistic regression
- Hypothesis testing

Week 5: Machine Learning Models I

- Perceptron classifier
- Neural networks
- Decision trees/Random forests

Week 7: Clustering

- Unsupervised clustering
- Supervised clustering

- Support vector machines

- Common machine learning frameworks
- Big data analytics

MPCS 50103 Discrete Mathematics and Core Programming

Knowledge of Python is required for this class.

This class is scheduled at a time that conflicts with these other classes:

- MPCS 53001-1 -- Databases
- MPCS 52060-2 -- Parallel Programming
- MPCS 52030-1 -- Operating Systems
- MPCS 51083-1 -- Cloud Computing
- MPCS 51220-1 -- Applied Software Engineering