NUMA

**NUMA**, (Non-uniform memory access)
a [|multiprocessing] memory design where the main memory is partitioned between processors. In contrast to SMP, where all processors compete for access to a centralized shared [|memory bus], which makes it difficult to scale well beyond 8 to 12 CPUs, NUMA splits the main memory into so-called nodes with separate memory buses for subsets of processors, and a high-speed interconnect between nodes, either directly at so-called 1-hop distance, or indirectly at 2-hop distance. Despite the high-speed interconnect, NUMA memory access time varies considerably between faster local memory and the remote memory of other nodes. Maintaining [|cache coherence] of processor caches adds significant overhead to NUMA systems, which is addressed by [|ccNUMA], a term mostly used synonymously with current NUMA implementations.

[[image:NUMA.svg.png link="https://commons.wikimedia.org/wiki/File:NUMA.svg"]]
Possible NUMA system

=x86=
AMD implemented NUMA with its Opteron processor in 2003, using [|HyperTransport]. Intel announced NUMA compatibility for their x86 servers in late 2007 with [|Nehalem CPUs] using [|QuickPath Interconnect].

=Considerations=
Scheduling threads across the nodes and cores of a system is complicated, because threads may work on independent data or on data shared with threads on other nodes. ccNUMA-aware [|operating systems] and software have several things to consider, such as keeping data local by virtue of first touch, where a memory page is typically placed on the node of the thread that first writes to it. NUMA and [|processor affinity] [|APIs] help application programmers to bind threads or processes to NUMA nodes, or to allocate memory from a certain node.
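As a rough illustration of binding a process to a node before it first touches its data, the following Python sketch reads the node topology from Linux sysfs and restricts the calling process to one node's CPUs. It assumes a Linux system that exposes `/sys/devices/system/node/nodeN/cpulist`; the helper names `parse_cpulist`, `numa_nodes` and `bind_to_node` are made up for this example, and `os.sched_setaffinity` is only available on Linux. It is a sketch under those assumptions, not how any particular engine does it.

```python
import os
from pathlib import Path

# Assumed Linux sysfs location of the NUMA node directories (node0, node1, ...).
NODE_ROOT = Path("/sys/devices/system/node")


def parse_cpulist(text):
    """Parse a kernel cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # a memory-only node has an empty cpulist
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def numa_nodes(root=NODE_ROOT):
    """Return {node id: set of CPU ids}; empty if no node directories exist."""
    nodes = {}
    if root.is_dir():
        for entry in root.glob("node[0-9]*"):
            cpulist = (entry / "cpulist").read_text()
            nodes[int(entry.name[4:])] = parse_cpulist(cpulist)
    return nodes


def bind_to_node(node, nodes):
    """Restrict the calling process to the CPUs of one node (Linux only)."""
    # Keep only CPUs this process is currently allowed to run on.
    cpus = nodes[node] & os.sched_getaffinity(0)
    if not cpus:
        raise ValueError(f"node {node} has no usable CPUs")
    os.sched_setaffinity(0, cpus)
    return cpus


if __name__ == "__main__":
    nodes = numa_nodes()
    print("NUMA nodes:", nodes)
    if nodes and hasattr(os, "sched_setaffinity"):
        allowed = os.sched_getaffinity(0)
        usable = [n for n in sorted(nodes) if nodes[n] & allowed]
        if usable:
            try:
                print("bound to CPUs", sorted(bind_to_node(usable[0], nodes)))
            except OSError as err:
                print("could not set affinity:", err)
            # First touch: write one byte per 4 KiB step so the pages are
            # faulted in by this process, after the affinity was set.
            table = bytearray(16 * 1024 * 1024)
            for i in range(0, len(table), 4096):
                table[i] = 1
```

Binding before initialization matters here: under the usual local allocation policy, the pages of `table` end up on the node whose CPUs performed the first writes, so a search thread pinned to that node later reads them as local memory.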

=See also=
 * Optimization
 * Parallel Search
 * SMP
 * Thread

=Selected Publications=
 * [|Andi Kleen] (**2004**). //An NUMA API for Linux//. SUSE Labs, [|pdf]
 * [|Ulrich Drepper] (**2007**). //What Every Programmer Should Know About Memory//. [|pdf], also hosted by [|LWN.net]
> [|Memory part 1]
> [|Memory part 2: CPU caches]
> [|Memory part 3: Virtual Memory]
> [|Memory part 4: NUMA support]
> [|Memory part 5: What programmers can do]
 * [|Nakul Manchanda], [|Karan Anand] (**2010**). //Non-Uniform Memory Access (NUMA)//. [|New York University], [|pdf]
 * [|Stefan Lankes], [|Thomas Roehl], [|Christian Terboven], [|Thomas Bemmerl] (**2012**). //[|Node-Based Memory Management for Scalable NUMA Architectures]//. [|RWTH Aachen], [|ROSS 2012], [|slides as pdf]
 * [|Georg Hager], [|Jan Treibig], [|Gerhard Wellein] (**2013**). //The Practitioner's Cookbook for Good Parallel Performance on Multi- and Many-Core Systems//. [|RRZE], [|SC13], [|slides as pdf]
 * [|Rik van Riel], [|Vinod Chegu] (**2014**). //Automatic NUMA Balancing//. [|Red Hat Summit 2014], [|slides as pdf], [|video lecture] by Rik van Riel

=Forum Posts=

2000 ...

 * [|DTS NUMA] by Vincent Diepeveen, CCC, September 03, 2002 » Dynamic Tree Splitting
 * [|What's the difference between NUMA, SMP and MPI for chess?] by Joachim Rang, CCC, April 15, 2004 » SMP
 * [|Opteron NUMA/SMP question] by Matthew Hull, CCC, February 09, 2005

2010 ...

 * [|optimizing performance on dual Xeon systems (NUMA)] by Jon Dart, CCC, February 28, 2013
 * [|Smp concepts] by Michael Hoffmann, CCC, June 01, 2014 » SMP

2015 ...
 * [|NUMA-awareness] by Louis Zulli, CCC, February 25, 2015
 * [|thread affinity] by Martin Sedlak, CCC, July 03, 2015 » Thread
> [|Re: thread affinity] by Robert Hyatt, CCC, July 03, 2015
 * [|Actual speedups from YBWC and ABDADA on 8+ core machines?] by Tom Kerrigan, CCC, July 10, 2015 » Young Brothers Wait Concept, ABDADA
 * [|NUMA 101] by Robert Hyatt, CCC, January 07, 2016 » Crafty
 * [|NUMA in a YBWC implementation] by Edsel Apostol, CCC, July 20, 2016 » Young Brothers Wait Concept
 * [|lets get the ball moving down the field on numa awareness] by Mohammed Li, FishCooking, August 30, 2016
 * [|search thread memory allocation (NUMA)] by Ronald de Man, FishCooking, September 06, 2016
 * [|What do you do with NUMA?] by Matthew Lai, CCC, September 19, 2016
 * [|NUMA test compilation] by Joachim Müller, FishCooking, November 05, 2016 » Stockfish
 * [|What Linux compatible Numa aware engines are available?] by Dann Corbit, CCC, March 29, 2017 » Linux

=External Links=
 * [|Non-Uniform Memory Access (NUMA) from Wikipedia]
 * [|NUMA Frequently Asked Questions]
 * [|Multiprocessing - OSDev Wiki]
 * [|ccNUMA machines] in Aad J. van der Steen, [|Jack J. Dongarra] (**2004**). //[|Overview of Recent Supercomputers]//.

Linux

 * [|numa(7) - Linux manual page]
 * [|A NUMA API for Linux] (pdf, April 2005)

Windows

 * [|Allocating Memory from a NUMA Node, MSDN]
 * [|NUMA Support (Windows), MSDN]
 * [|Processor Groups (Windows), MSDN]

x86

 * [|Optimizing Applications for NUMA | Intel® Developer Zone]
 * [|Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems] (pdf)
