MPCS 51087 High Performance Computing (Winter 2023)

Section 1
Instructor(s) Siegel, Andrew (siegela)
Location JCL 011
Meeting Times Monday 5:30pm - 7:30pm
Fulfills Elective Specialization - High Performance Computing (HPC-1)


Course Description
Parallel programming is ubiquitous in both the largest compute clusters and the smallest low-power embedded devices.  Though this has been the status quo for many years, achieving optimal parallel performance can still be a challenging, multi-disciplinary effort.  

In this course, we will focus on compute-intensive (rather than data-intensive) parallel programming, representative of numerical applications.  Computer architecture and systems will be a pervasive theme, and we will discuss how parallel APIs map to the underlying hardware.

We will implement and optimize C/C++ applications on large-scale, multicore CPU and GPU compute clusters.  We will learn widely-used parallel programming APIs (OpenMP, CUDA, and MPI) and use them to solve problems in linear algebra, Monte Carlo simulations, discretized partial differential equations, and machine learning.

The majority of coding assignments can be completed in either C or C++.  Certain applications will require portions to be coded in pure C; in these cases, we will cover the requisite material for those whose previous exposure is only to C++.  Previous or concurrent courses in systems and architecture can be helpful, but no prerequisite knowledge of systems or architecture is assumed.  



Course Topics

  • Overview of CPU and GPU architectures
    • Instruction sets
    • Functional units
    • Memory hierarchies
  • Performance metrics
    • Latency and bandwidth
    • Roofline modeling
  • Single-core optimization
    • Compiler-assisted vectorization (data-level parallelism)
    • Design patterns for cache-based optimization
  • Multi-threaded CPU programming
    • Worksharing, synchronization, and atomic operations
    • Memory access patterns, including non-uniform memory access
    • The OpenMP API
  • GPU programming
    • Thread-mapping for optimal vectorization and memory access
    • Task-scheduling for latency reduction
    • The CUDA and OpenMP offload APIs
  • Distributed parallelism
    • Synchronous and asynchronous communication patterns
    • Data decomposition
    • Hybrid models for distributed multi-threaded and GPU programming
    • The MPI API


Throughout the course, we will draw on examples from linear algebra, Monte Carlo simulations, discretized partial differential equations, and machine learning. 




Coursework

The graded coursework will consist of six out-of-class, individually completed coding projects.  Most will take one week, but the final assignments will be larger, two-week projects.  


There will also be brief conceptual quizzes, which will be discussed in class and graded for completion.  





Course Texts

We will draw on material from the following texts.  None are required, but they can be helpful resources throughout your career. 

Course Prerequisites

MPCS 51040 - C Programming, MPCS 51100 - Advanced Programming, or instructor consent.

Other Prerequisites

Familiarity with C or C++.

This course requires competency in Unix and Linux. If you attended the MPCS Unix Bootcamp, you covered the required material. If you did not, please review the UChicago CS Student Resource Guide.

Overlapping Classes

This class is scheduled at a time that conflicts with these other classes:

  • MPCS 51410-1 -- Object Oriented Programming
  • MPCS 56511-1 -- Introduction to Computer Security
  • MPCS 51030-1 -- iOS Application Development
  • MPCS 51200-2 -- Introduction to Software Engineering

Eligible Programs

  • MS in Computational Analysis in Public Policy (Year 2)
  • MS in Molecular Engineering
  • MA in Computational Social Science (Year 2)
  • Bx/MS in Computer Science (Option 1: Research-Oriented)
  • Bx/MS in Computer Science (Option 2: Professionally-oriented - CS Majors)
  • Bx/MS in Computer Science (Option 3: Professionally-oriented - Non-CS Majors)
  • Masters Program in Computer Science