Where and When


  • Meeting ID: 214-104-523
  • Join via web browser: https://cuboulder.zoom.us/j/214104523
  • Join via Zoom app (using meeting ID)
  • Join via One tap mobile: +16699006833,,214104523# or +16465588656,,214104523#
  • Join via telephone: 1-669-900-6833 or 1-646-558-8656


Instructor: Jed Brown

  • GitHub: @jedbrown
  • Office hours: See calendar in ECOT 824 (usually Tue 14:30-15:30 and Thu 9:00-10:00)

Teaching Assistant: Camden Elliott-Williams

  • GitHub: @CamdenCU
  • Office hours: Wed 9:30-10:30 and 13:30-14:30 or by appointment in ECCR 1B “Systems Lab” (see map)


For each assignment, click the link below to accept it via GitHub Classroom. This creates a private repository for you to work in. Then git clone the repository to whatever machine you’ll work on and follow the instructions in the README. Usually you will be asked to read and edit code, run a range of experiments, and interpret and plot data in a Report.ipynb notebook.

Assigned    Due                               Description
2019-09-06  2019-09-16 (part by 2019-09-13)   Experiments in vectorization
2019-09-20  2019-09-30                        Sorting


Videos appear automatically on Canvas and are linked below.

Date Topic
Aug 26 Course introduction and preview of architectural trends
Aug 28 Intro to architecture
Aug 30 Intro to vectorization and ILP
Sep 4 Intro to performance modeling (roofline)
Sep 6 Intro to parallel scaling
Sep 9 Joel Frahm on CU Research Computing (slides)
Sep 11 OpenMP Basics
Sep 13 OpenMP memory semantics, synchronization, and perf demo
Sep 16 OpenMP tasking and computational depth/critical path
Sep 18 Low-level optimization, parallel reductions and scans
Sep 20 Searching and sorting methods (based on parts of two sets of slides)
Sep 23 Bitonic sort recap/demo; intro to graph independence
Sep 25 Recorded lecture: Introduction to MPI
Sep 27 Library interfaces with MPI: Conway’s Game of Life
Sep 30 Dense linear algebra and networks
Oct 2 Dense linear algebra and orthogonality
Oct 4 Orthogonality and conditioning
Oct 7 Parallel QR and Elemental for distributed memory
Oct 9 Sparse and iterative linear algebra
Oct 11 Intro to preconditioning
Oct 14 Intro to domain decomposition preconditioning
Oct 16 Domain decomposition preconditioning and scaling
Oct 18 Multilevel preconditioning and predictive modeling
Oct 21 Nonlinear solvers
Oct 23 Transient problems
Oct 25 libCEED: spectral elements and matrix-free methods
Oct 28 Coprocessor architectures
Oct 30 CUDA (by Camden Elliott-Williams)
Nov 1 Practical CUDA
Nov 4 ISPC, OpenMP target, OpenACC
Nov 6 HPC I/O
Nov 8 MPI-IO
Nov 11 Data-intensive workflows
Nov 13 Data and probability
Nov 15 Dynamic/interactive parallel runtimes
Nov 18 Intro to N-body simulation
Nov 20 Long-range evaluation in N-body simulation
Nov 22 Molecular dynamics
Dec 2 Git workflows
Dec 4 Fourier methods
Dec 6 Multigrid Intro
Dec 9 Algebraic Multigrid