Node-Level Performance Optimization

This course covers advanced topics in code optimization for x86 platforms (Intel and AMD CPUs). We discuss techniques for analyzing and maximizing both single-core and multicore performance within a single node. The topics include instruction-level parallelism, vectorization, and efficient use of cache and memory. The course consists of lectures and hands-on exercises.

Learning outcomes

- Awareness of features and internal workings of x86 CPUs
- Ability to analyze and assess single-node performance
- Ability to vectorize computations
- Ability to optimize cache and memory access

Prerequisites

- Good knowledge of C/C++ or Fortran
- Good knowledge of threading using OpenMP
- Basic knowledge of modern CPU architectures

Agenda

Day 1
- Overview of performance engineering
- General overview of modern multicore CPUs
- Main memory performance
- Performance analysis tools

Day 2
- Deeper dive into caches
- Detailed look at Intel and AMD CPUs
- Advanced vectorization
- Additional optimization topics

Registration deadline: 3.5.2024

Event location

CSC Training Facilities

Keilaranta 14
02100 Espoo
