MPCS 51087 High Performance Computing (Winter 2023)

Section 1
Instructor(s) Siegel, Andrew (siegela)
Location None
Meeting Times
Fulfills Elective Specialization - High Performance Computing (HPC-1)

Syllabus

*Please note: This is the syllabus from the 2021/22 academic year and subject to change.*

Course Description
Parallel programming is ubiquitous in both the largest compute clusters and the smallest, low-power embedded devices.  Though this has been the status quo for many years, achieving optimal parallel performance can still be a challenging, multi-disciplinary effort.  

In this course, we will focus on compute-intensive (rather than data-intensive) parallel programming, representative of numerical applications.  Computer architecture and systems will be a pervasive theme, and we will discuss how parallel APIs map to the underlying hardware.

We will implement and optimize C/C++ applications on large-scale, multicore CPU and GPU compute clusters.  We will learn widely used parallel programming APIs (OpenMP, CUDA, and MPI) and use them to solve problems in linear algebra, Monte Carlo simulations, discretized partial differential equations, and machine learning.

The majority of coding assignments can be completed in either C or C++.  Certain applications will require coding portions in pure C; in these cases, we will cover the requisite information for those with previous exposure only to C++.  Previous or concurrent courses in systems and architecture can be helpful, but no prerequisite knowledge of systems or architecture is assumed.

Topics:

  • Overview of CPU and GPU Architectures
    • Instruction sets
    • Functional units
    • Memory hierarchies
  • Performance Metrics
    • Latency and bandwidth
    • Roofline modeling
  • Single-core optimization
    • Compiler-assisted vectorization (data-level parallelism)
    • Design patterns for cache-based optimization
  • Multi-threaded CPU programming
    • Worksharing, synchronization, and atomic operations
    • Memory access patterns, including non-uniform memory access
    • The OpenMP API
  • GPU programming
    • Thread-mapping for optimal vectorization and memory access
    • Task-scheduling for latency reduction
    • The CUDA and OpenMP offload APIs
  • Distributed parallelism
    • Synchronous and asynchronous communication patterns
    • Data decomposition
    • Hybrid models for distributed multi-threaded and GPU programming
    • The MPI API

Throughout the course, we will draw on examples from linear algebra, Monte Carlo simulations, discretized partial differential equations, and machine learning.

Coursework

The graded coursework will consist of 6 out-of-class, individually completed coding projects.  Most will span one week, but the final assignments will be larger, two-week projects.

There will also be brief conceptual quizzes, which will be discussed in class and counted as a completion grade.

Textbooks

We will draw on material from the following texts.  None are required, but they can be helpful resources throughout your career.

Course Prerequisites

MPCS 51040 - C Programming or MPCS 51100 Advanced Programming or instructor consent.

Other Prerequisites

Familiarity with C or C++.

This course requires competency in Unix and Linux. Please plan to attend the MPCS Unix Bootcamp (https://masters.cs.uchicago.edu/page/mpcs-unix-bootcamp) or take the online MPCS Unix Bootcamp Course on Canvas.

Overlapping Classes

This class is scheduled at a time that does not conflict with any other classes this quarter.

Eligible Programs

Masters Program in Computer Science
MS in Computational Analysis in Public Policy (Year 2)
MA in Computational Social Science (Year 2)
Bx/MS in Computer Science (Option 1: Research-Oriented)
Bx/MS in Computer Science (Option 2: Professionally-oriented - CS Majors)
Bx/MS in Computer Science (Option 3: Professionally-oriented - Non-CS Majors)
MS in Molecular Engineering