Magic+Bitboards

a multiply-right-shift perfect hashing algorithm to index an attack bitboard database - which leaves both line-attacks of bishop or rook in one run. Thanks to the fast 64-bit multiplication and fast and huge [|caches] of recent processors, Magic Bitboards has become a [|de facto standard] of modern bitboard engines, as used for instance in Crafty, Arasan, Stockfish and Houdini. While Robert Hyatt reported no immediate speed gain over Rotated Bitboards in Crafty, Jon Dart mentioned a 20-25% speedup in Arasan over rotated. || toc =History= The magic bitboard approach was motivated by Gerd Isenberg's multi-direction hashing technique kindergarten bitboards and probably by Gerd's and Tony van Roon-Werten's early trials to map line-wise occupancies by De Bruijn- or random number multiplication. Lasse Hansen had the idea to hash the up to twelve relevant occupied bits of **both directions** of a rook- or bishop movement simultaneously.
 * Home * Board Representation * Bitboards * Sliding Piece Attacks * Magic Bitboards**
 * [[image:AuspiciousMoon.jpg width="305" height="245" link="http://www.puckergallery.com/artists/bak_major/bak_major_all.html"]] ||~  || **Magic bitboards**,
 * Samuel Bak - Auspicious Moon, 2001 ||~  ||^   ||

Pradu Kannan's improvements to Lasse Hansen's initial approach was to introduce a Java-like, two-dimensional array with individual size for each square and all it's relevant occupancies. Big savings in table-size - since many squares on either orthogonal or diagonal lines require less bits than others, especially considering the inner six bits. While center squares are more dense for rooks, it is the opposite for bishops.

Recently, Robert Purves coined the names Plain and Fancy Magics, and found Hansen's initial Plain Magics with 2 MiB table for rooks and 256 KiB for bishops nearly indistinguishable from Fancy (about 800 KiB and 38 KiB) on his [|Intel i5] with huge L3 smart cache, see plain versus fancy source code. In the same CCC forum thread, Robert Houdart proposed a byte lookup per square for further table reductions.

=How it works= A magic move-bitboard generation technique consists of four key steps:
 * 1) Mask the relevant occupancy bits to form a key. For example if you had a rook on a1, the relevant occupancy bits will be from a2-a7 and b1-g1.
 * 2) Multiply the key by a "magic number" to obtain an index mapping. This magic number can be generated by brute-force trial and error quite easily although it isn't 100% certain that the magic number is the best possible (see step 3).
 * 3) Right shift the index mapping by 64-n bits to create an index, where n is the number of bits in the index. A better magic number will have less bits required in the index.
 * 4) Use the index to reference a preinitialized move database.

The following illustration should give an impression, how magic bitboards work. All masked relevant occupied bits are perfectly hashed to the consecutive occupied state to index the pre-calculated attack-sets. Constructive collisions, where different occupancies map same attack-sets - since different bits are outer redundant bits "behind" the first blocker, are desired and even necessary to apply a perfect hashing with N bits. code any consecutive relevant occupancy                     combination of bishop b1, 5 bits                       the masked bits . . . . . . . .    . . . . . . . .     . . .[C D E F G] .. . . . . . .    . 1 . . . . . .     . . . . . . . . . . . . . . G. . 1 . . . . . .    . . . . . . . . . . . . . F. .    . 1 . . . . . .     . . . . . . . . . . . . E. . . *  . 1 . . . . . .  =  . . garbage. .   >> (64- 5) . . . D. . . .    . 1 . . . . . .     . . . . . . . . . . C. . . . .    . . . . . . . .     . . . . . . . . . . . . . . . .     . . . . . . . .     . . . . . . ..

any consecutive relevant occupancy                     combination of bishop d4, 9 bits                       the masked bits . . . . . . . .    . . . . . . . .     2 4 5 B C E F G] .. . . . . G. . . .some. . .    . . . . . . .[1 . 5 . . . F. .    . . . . . . . .     . . . . . . . . . . 4 . E. . .    . . .magic. . .    . . . . . . . . . . . . . . . .  *  . . . . . . . .  =  . . garbage. .   >> (64- 9) . . C. 2 . . .    . . .bits. . .    . . . . . . . . . B. . . 1 . .    . . . . . . . .     . . . . . . . . . . . . . . . .     . . . . . . . .     . . . . . . ..

any consecutive relevant occupancy                     combination of rook d4, 10 bits                        the masked bits . . . . . . . .    . . . . . . . .     4 5 6 B C E F G] .. . 6 . . . .    . . .some. . .    . . . . . .[1 2 . . . 5 . . . .     . . . . . . . .     . . . . . . . . . . . 4 . . . .     . . .magic. . .    . . . . . . . . . B C. E F G. * . . . . . . . .  =  . . garbage. .   >> (64-10) . . . 2 . . . .     . . .bits. . .    . . . . . . . . . . . 1 . . . .     . . . . . . . .     . . . . . . . . . . . . . . . .     . . . . . . . .     . . . . . . ..

any consecutive relevant occupancy                     combination of rook a1, 12 bits                        the masked bits . . . . . . . .    . . . . . . . .     5 6 B C D E F G] 6. . . . . . .    . . .some. . .    . . . .[1 2 3 4 5 . . . . . . .     . . . . . . . .     . . . . . . . . 4 . . . . . . .     . . .magic. . .    . . . . . . . . 3 . . . . . . .  *  . . . . . . . .  =  . . garbage. .   >> (64-12) 2 . . . . . . .     . . .bits. . .    . . . . . . . . 1 . . . . . . .     . . . . . . . .     . . . . . . . . . B C D E F G. . . . . . . . .    . . . . . . . . code The above illustration is correct for the b1 bishop, since it has only one ray and one bit per file and works kindergarten like. In general a one to one mapping of N scattered occupied bits to N consecutive bits is not always possible due to overflows. A perfect mapping of N scattered bits to N consecutive bits is likely not minimal for most squares. It requires one or two gaps inside the consecutive N bits, to avoid collisions, blowing up the table size.

But the purpose is to perfectly hash attack-sets rather than consecutive occupied bits. The number of distinct attack-sets is much smaller than the relevant occupancies. Thus, with the help of constructive collisions, some initial guess how to map the bits, and/or trial and error, using exactly N bits is always possible. If one tries hard enough to maximize constructive collisions - even less. There are some N-1 ones reported at Best Magics so far, which will half the individual table size for some squares in the widespread Fancy Magic Bitboards approach.

=Perfect Hashing= Magic bitboards applies perfect hashing. A [|surjective function], to map the vector of all relevant occupancies to a range of attack-sets per square. The less bits the attack-set - the closer the blockers, the more those attack-sets are shared by occupancies with different, but redundant outer squares.
 * The **cardinality** of all **relevant occupancies** is determined by the number of bits to map, varying from five to twelve - thus, the cardinality is the power of two the number of bits, varying from 32 to 4096.
 * The **cardinality** of **distinct attack-sets** is determined by the product of the length of each of the max four direction rays greater than zero (or one). The rook on d4 has 3*4*3*4 = 144 distinct attack-sets, a bishop on a8 has only 7.

The **ratio** of both cardinalities, that is all **relevant occupancies** versus the all **distinct attack-sets** is illustrated below: As a quarter of a board - due to the symmetry, the other squares may deduced by flipping and mirroring. Noticeable is the huge 4096/49 ratio of 2^12 occupied states versus 7 times 7 attack-sets of the edge rooks - 12 bits instead of 6. Those "expensive" squares make constructive collisions very desirable. To become more "minimal" by saving an index bit - to halve down the table for one square or the other.
 * ~  ||||||||~ bishop on square ||~ #occs/
 * 1) attset ||||||||~ rook on square ||
 * ~  ||~ A ||~ B ||~ C ||~ D ||~   ||~   ||~ A ||~ B ||~ C ||~ D ||
 * ~ 8 ||> 64/7 ||> 32/6 ||> 32/10 ||> 32/ 12 ||~  ||~ 8 ||> 4096/49 ||> 2048/42 ||> 2048/ 70 ||> 2048/ 84 ||
 * ~ 7 ||> 32/6 ||> 32/6 ||> 32/10 ||> 32/ 12 ||~  ||~ 7 ||> 2048/42 ||> 1024/36 ||> 1024/ 60 ||> 1024/ 72 ||
 * ~ 6 ||> 32/10 ||> 32/10 ||> 128/40 ||> 128/ 48 ||~  ||~ 6 ||> 2048/70 ||> 1024/60 ||> 1024/100 ||> 1024/120 ||
 * ~ 5 ||> 32/12 ||> 32/12 ||> 128/48 ||> 512/108 ||~  ||~ 5 ||> 2048/84 ||> 1024/72 ||> 1024/120 ||> 1024/144 ||
 * ~  ||||||||~ bishop on square ||~   ||||||||~ rook on square ||
 * ~  ||~ A ||~ B ||~ C ||~ D ||~   ||~   ||~ A ||~ B ||~ C ||~ D ||
 * ~ 8 ||> 9.14 ||> 5.33 ||> 3.20 ||> 2.67 ||~  ||~ 8 ||> 83.59 ||> 48.76 ||> 29.26 ||> 24.38 ||
 * ~ 7 ||> 5.33 ||> 5.33 ||> 3.20 ||> 2.67 ||~  ||~ 7 ||> 48.76 ||> 28.44 ||> 17.07 ||> 14.22 ||
 * ~ 6 ||> 3.20 ||> 3.20 ||> 3.20 ||> 2.67 ||~  ||~ 6 ||> 29.26 ||> 17.07 ||> 10.24 ||> 8.53 ||
 * ~ 5 ||> 2.67 ||> 2.67 ||> 2.67 ||> 4.74 ||~  ||~ 5 ||> 24.38 ||> 14.22 ||> 8.53 ||> 7.11 ||
 * ~ 5 ||> 2.67 ||> 2.67 ||> 2.67 ||> 4.74 ||~  ||~ 5 ||> 24.38 ||> 14.22 ||> 8.53 ||> 7.11 ||

The idea to implement minimal perfect hashing by an additional 16-bit indirection turned out to be slower (see conditional compiles in Pradu Kannan's source ).

Recent table sizes were about 38 KiB for the bishop attacks, but still about 800 KiB for rook attacks (Fancy). That sounds huge, considering L1 and L2 (L3) cache-sizes and number of cachelines and pages needed - we likely fetch distinct cachelines for each different square or occupancy. On the other hand caches and pages become larger in future processors. And occupancy and squares of the lookups don't change that randomly inside a search that we can still expect a lot of L1-hits. Unfortunately changes in occupancy outside the blockers and therefor not affecting the attack-set will introduce some more cache misses.

Wishing Dreams
Since there are 1428/4900 **distinct** attack sets for the bishop/rook attacks, we can use 11(2048 > 1428)/13(8196 > 4900) index bits for all bishop/rook attack bitboards. The table size will be 16KB for the bishop attacks and 64KB for rook attacks. To make it possible, we must find a group of magics (one per square and dependent each other). All the magics in the group must hash its all **relevant occupancies** into the same 11/13 bit index without any collision with others (anyhow for each square, different occupancies can hash to the same attack sets). code format="cpp" U64 bishopAttacks[2048]; // 16KB U64 rookAttacks[8196]; // 64KB

struct GMagic { U64 mask; // to mask relevant squares of both lines (no outer squares) and include sliding piece itself U64 magic; // magic 64-bit factor };

GMagic mBishopTbl[64]; //1KB GMagic mRookTbl[64]; //1KB

U64 bishopAttacks(U64 occ, enumSquare sq) { occ &= mBishopTbl[sq].mask; occ *= mBishopTbl[sq].magic; occ >>= 64-11; //fixed shift return bishopAttacks[occ]; //no offset }

U64 rookAttacks(U64 occ, enumSquare sq) { occ &= mRookTbl[sq].mask; occ *= mRookTbl[sq].magic; occ >>= 64-13; return rookAttacks[occ]; }

code But the idea seems like a wishing dream. Can we find ONE of the **magic group** of magics? Does it exist in theory? If not, can we use more bits to reach the condition? If yes, can we find a good magic group? A good magic group means that all the hash value <=n, so we can reduce the table size to n, until it reach to 1428/4900. Enough? If all hash values are grouped by square, can we think about using the Sharing Attacks method?

=Implementations= Despite its huge table size, register usage and code size are important issues as well - and here Magic Bitboards are unbeatable. There are enough variations of space-time tradeoff and implementation details of that theme for all who like to play the optimization game. C-source code with various precompiler options is available from Pradu Kannan's site. MINIMIZE_MAGIC is about Plain versus Fancy, while PERFECT_MAGIC_HASH enables an additional indirection via 16-bit indices. As always, with space-time tradeoffs - it depends on the individual cache- and [|memory footprint] inside a particular chess program and the hardware architecture, which solution is preferable. 

Fancy
Fancy Magic Bitboards is the main stream implementation as proposed by Pradu Kannan with individual table sizes for each square. It requires an individual shift, to leave that many index bits remaining as determined by the population of relevant occupancies. As mentioned, if one tries hard enough to maximize constructive collisions while looking for magics, one may half the size for several individual squares, see Best Magics so far. code format="cpp" U64 attack_table[...]; // ~840 KiB all rook and bishop attacks, less with constructive collisions optimization

struct SMagic { U64* ptr; // pointer to attack_table for each particular square U64 mask; // to mask relevant squares of both lines (no outer squares) U64 magic; // magic 64-bit factor int shift; // shift right };

SMagic mBishopTbl[64]; SMagic mRookTbl[64];

U64 bishopAttacks(U64 occ, enumSquare sq) { U64* aptr = mBishopTbl[sq].ptr; occ     &= mBishopTbl[sq].mask; occ     *= mBishopTbl[sq].magic; occ    >>= mBishopTbl[sq].shift; return aptr[occ]; }

U64 rookAttacks(U64 occ, enumSquare sq) { U64* aptr = mRookTbl[sq].ptr; occ     &= mRookTbl[sq].mask; occ     *= mRookTbl[sq].magic; occ    >>= mRookTbl[sq].shift; return aptr[occ]; } code 

Plain
Plain Magic Bitboards, Lasse Hansen's initial approach, takes much more memory for homogeneous two-dimensional attack arrays. 32 KiB per rook square, 4 KiB  for each bishop square. It does not need a variable shift and therefor needs one processor register less to further reduce register pressure, despite shorter code. Unfortunately, so far, there are two rook corner squares (0 and 7), which prohibit dividing the rook table size by two. code format="cpp" U64 mBishopAttacks[64] [512]; // 256 K U64 mRookAttacks [64][4096]; // 2048K

struct SMagic { U64 mask; // to mask relevant squares of both lines (no outer squares) U64 magic; // magic 64-bit factor };

SMagic mBishopTbl[64]; SMagic mRookTbl [64];

U64 bishopAttacks(U64 occ, enumSquare sq) { occ &= mBishopTbl[sq].mask; occ *= mBishopTbl[sq].magic; occ >>= 64-9; return mBishopAttacks[sq][occ]; } code 

Fixed shift Fancy
A fancy fixed shift variation was introduced by Onno Garms, looking for magics producing factors with appropriate reduced value ranges. That is, leave 12 bit rook indices for all squares, but with upper bit(s) clear for all possible occupancies for all the squares with less than 12 relevant bits, so that it efficiently becomes a less 12 bit index. Onno believes that it is possible to find magic factors with that property for most of those squares, to come close to the fancy table size with the advantage of a fixed shift. Even an "overlapping" of square arrays is feasible, if they contain accordant unused slot gaps on both ends. Volker Annuss came up with a fixed shift solution of only 800 KiB and below , and more recently even innovated with Black Magic Bitboards. code format="cpp" U64 attack_table[...]; // > 800 KiB - 1.x MiB  - depending on the effort on looking for magics

struct SMagic { U64* ptr; // pointer to attack_table for each particular square U64 mask; // to mask relevant squares of both lines (no outer squares) U64 magic; // magic 64-bit factor };

SMagic mBishopTbl[64]; SMagic mRookTbl[64];

U64 rookAttacks(U64 occ, enumSquare sq) { U64* aptr = mRookTbl[sq].ptr; occ     &= mRookTbl[sq].mask; occ     *= mRookTbl[sq].magic; occ    >>= 64-12; return aptr[occ]; } code 

Black Magic Bitboards
Volker Annuss' most recent innovation and further improvement of his already dense 700 KiB fixed shift fancy reference implementation is called **Black Magic Bitboards**, introduced in CCC in August 2017. Instead of intersection with a mask to clear the not relevant occupancy bits, he used the union with the complement of mask to set all not relevant occupancy bits.

code and mask[d4] vs.   or ~mask[d4] . . . . . . . .    1 1 1 1 1 1 1 1 . . . 1 . . . .     1 1 1 . 1 1 1 1 . . . 1 . . . .     1 1 1 . 1 1 1 1 . . . 1 . . . .     1 1 1 . 1 1 1 1 . 1 1 . 1 1 1 .     1 . . 1 . . . 1 . . . 1 . . . .     1 1 1 . 1 1 1 1 . . . 1 . . . .     1 1 1 . 1 1 1 1 . . . . . . . .     1 1 1 1 1 1 1 1 code

This trick, which basically adds a huge constant to the occupancy factor, enables to find **Black Magics** yielding in even smaller array sizes per square, determined by maximum and minimum index - the latter no longer zero for the empty occupancy! Unfortunately so far, the individual savings with denser square arrays don't completely advance to the size of the combined overlapping table, since overlapping became harder due to smaller gaps. However, 692 KiB for the complete rook and bishop attack table is the new fixed shift reference. Black Magic Bitboards are now used in Volker's engine Arminius : code format="cpp" U64 attack_table[88507]; // 692 KiB for published black magics by Volker Annuss (new size 88316 aka 690 KiB)

struct SBlackMagic { U64* ptr; // pointer to attack_table for each particular square U64 notmask; // to mask relevant squares of both lines (no outer squares) U64 blackmagic; // black magic 64-bit factor };

SBlackMagic mBishopTbl[64]; SBlackMagic mRookTbl[64];

U64 rookAttacks(U64 occ, enumSquare sq) { U64* aptr = mRookTbl[sq].ptr; occ     |= mRookTbl[sq].notmask; occ     *= mRookTbl[sq].blackmagic; occ    >>= 64-12; return aptr[occ]; } code 

Byte Lookup
Since there are only up to 144 different attack sets per square, Robert Houdart proposed to lookup bytes by square and hashed occupancy for a minimal perfect hashing with an additional indirection via an offset per square, which is successfully used in his engine Houdini. Following sample code uses the Plain fixed shift method, while it may also applied for the dense Fancy one. code format="cpp" byte mBishopAttacks[64] [512]; // 32 K byte mRookAttacks [64][4096]; // 256 K

U64 mRookLookup[4900]; // 39 K U64 mBishopLookup[1428];  // 11 K

struct SMagic { U64 mask; // to mask relevant squares of both lines (no outer squares) U64 magic; // magic 64-bit factor int offset;  // offset into lookup table };

SMagic mBishopTbl[64]; SMagic mRookTbl [64];

U64 bishopAttacks(U64 occ, enumSquare sq) { occ &= mBishopTbl[sq].mask; occ *= mBishopTbl[sq].magic; occ >>= 64-9; return mBishopLookup[mBishopTbl[sq].offset + mBishopAttack[sq][occ]]; } code 

32-bit Magics
Optimizations are possible for 32-bit systems, as proposed by Grant Osborne. Instead of letting the compiler split up a 64-bit multiply, two 32-bit multiplications can be manually implemented. Multiply the low bits of the magic with the low bits of the key and add it with the product of the high bits of the magic and the high bits of the key and use the upper bits of the sum to index the move database. For the bishops one may try only one 32-bit multiplication after xoring the masked high and low half. code format="cpp" struct { unsigned int rookMaskLo; unsigned int rookMaskHi; unsigned int rookMagicLo; unsigned int rookMagicHi; unsigned int rookShift; U64 *pRookAttacks; } sm[64];

U64 rookAttacks(U64 occ, enumSquare sq) { unsigned int lo = (int) occ; unsigned int hi = (int)(occ >> 32); lo &= sm[sq].rookMaskLo; hi &= sm[sq].rookMaskHi; lo *= sm[sq].rookMagicLo; hi *= sm[sq].rookMagicHi; lo = (lo ^ hi) >> sm[sq].rookShift; return sm[sq].pRookAttacks[lo]; } code 

Sharing Attacks
The development has not finished. Lasse Hansen came up with another stunning idea, to use a final intersection by attacks on an empty board, to share tables by two (rooks) or even four (bishops) squares. Bishop sharing is simple to understand, since there are light and dark colored bishops with disjoint attacks-sets, able to share the pre-calculated union of two square-attacks.

The trick is to share even equal colored bishops and rooks where both attack-sets are not disjoint - but all members of the intersection are always set, since they are direct neighbors of both sliders. This is how rooks and bishops may share union-attack-sets by two resp. four squares: code int rookSharing[64] = { 0, 1,  2,  3,  4,  5,  6,  7,   1,  0,  3,  2,  5,  4,  7,  6,   8,  9, 10, 11, 12, 13, 14, 15,   9,  8, 11, 10, 13, 12, 15, 14,  16, 17, 18, 19, 20, 21, 22, 23,  17, 16, 19, 18, 21, 20, 23, 22,  24, 25, 26, 27, 28, 29, 30, 31,  25, 24, 27, 26, 29, 28, 31, 30, };

int bishopSharing[64] = { 0, 2,  4,  4,  4,  4, 12, 14,   0,  2,  5,  5,  5,  5, 12, 14,   0,  2,  6,  6,  6,  6, 12, 14,   0,  2,  7,  7,  7,  7, 12, 14,   1,  3,  8,  8,  8,  8, 13, 15,   1,  3,  9,  9,  9,  9, 13, 15,   1,  3, 10, 10, 10, 10, 13, 15,   1,  3, 11, 11, 11, 11, 13, 15, }; code On the cost of one additional and-instruction, post-masking has the potential to almost halve the rook tables - and even better for the already small bishop tables. Additionally, four union attack bitboards can be shared by eight rook- and six bishop-squares. code R 1 1 1 1 1 1 1   1 R 1 1 1 1 1 1    1 1 R 1 1 1 1 1    1 1 1 R 1 1 1 1 B R 1 1 1 1 1 1   R B 1 1 1 1 1 1    1 1 B R 1 1 1 1    1 1 R B 1 1 1 1 1 1 1 . . . . .   1 1 1 . . . . .    . 1 1 1 1 . . .    . . 1 1 1 . . . 1 1 1 1 . . . 1    1 1 . 1 . . . .    1 . 1 1 1 1 . .    . 1 1 1 . 1 . . 1 1 . 1 1 . 1 .    1 1 . . 1 . . .    . . 1 1 . 1 1 .    1 . 1 1 . . 1 . 1 1 . . 1 B. .   1 1 . . . 1 . .    . . 1 1 . . 1 B. . 1 1 . . . 1 1 1 . . 1 1 1 .   1 1 . . . . 1 .    . . 1 1 . . 1 1    . . 1 1 . . . . 1 1 . 1 . . 1 1    1 1 . . . . . 1    . . 1 1 . 1 . .    . . 1 1 . . . . code If the additional post-mask becomes member of a structure, it exceeds 32 byte in 64-bit mode. A structure of arrays may be preferable than an array of structures. code format="cpp" U64 attack_table[...]; // ~400K Byte for all rook and (shared) bishop attacks

struct SMagic { U64* ptr;     // pointer to attack_table for each particular square U64 magic;    // magic 64-bit factor U64 premask;  // to mask relevant squares of both lines (no outer squares) U64 postmask; // attacks on the otherwise empty board int shift;    // shift right };

SMagic mBishopTbl[64]; SMagic mRookTbl[64];

U64 bishopAttacks(U64 occ, enumSquare sq) { U64* aptr = mBishopTbl[sq].ptr; occ     &= mBishopTbl[sq].premask; occ     *= mBishopTbl[sq].magic; occ    >>= mBishopTbl[sq].shift; return     mBishopTbl[sq].postmask & aptr[occ]; }

U64 rookAttacks(U64 occ, enumSquare sq) { U64* aptr = mRookTbl[sq].ptr; occ     &= mRookTbl[sq].premask; occ     *= mRookTbl[sq].magic; occ    >>= mRookTbl[sq].shift; return     mRookTbl[sq].postmask & aptr[occ]; } code 

Incorporating the Shift
Another possible improvement was introduced by Grant Osborne - to don't store the variable right shift inside an own array or structure element, but to use the almost redundant upper six bit of the magic factor instead - incorporating the shift into the magic number. While using a structure for pre-, postmask, magic and pointer, it's size is 4*8 = 32-byte or - if properly aligned - the half of one cache line. The additional effort for the right shift 58 to extract the variable shift is likely hidden by the latency of the independent multiplication, improving ipc. code format="cpp" U64 attack_table[...]; // ~400K Byte for all rook and (shared) bishop attacks

struct SMagic { U64* ptr;     // pointer to attack_table for each particular square U64 magic;    // magic 64-bit factor, including upper six bits as shift amount U64 premask;  // to mask relevant squares of both lines (no outer squares) U64 postmask; // attacks on the otherwise empty board }; // 32 Byte with 64-bit pointer!

SMagic mBishopTbl[64]; SMagic mRookTbl[64];

U64 bishopAttacks(U64 occ, enumSquare sq) { U64* aptr = mBishopTbl[sq].ptr; int shift = mBishopTbl[sq].magic >> 58; occ     &= mBishopTbl[sq].premask; occ     *= mBishopTbl[sq].magic; return     mBishopTbl[sq].postmask & aptr[occ >> shift]; }

U64 rookAttacks(U64 occ, enumSquare sq) { U64* aptr = mRookTbl[sq].ptr; int shift = mRookTbl[sq].magic >> 58; occ     &= mRookTbl[sq].premask; occ     *= mRookTbl[sq].magic; return     mRookTbl[sq].postmask & aptr[occ >> shift]; } code 

Incorporating Offset
A similar idea as in Incorporating the Shift in the domain fixed shift magics was proposed by Eugene Kotlov concerning the indirection to the stored attack array, using the 16 lower bits of the magic factor as offset.

Looking for Magics
The main issue is to find 64 magic factors for each rook and bishop square, to ensure they are free of collisions.
 * Looking for Magics

Magic Records
You may try to find your own magics - to possibly contribute to the record table for most dense fancy or plain tables. So far, there were no magics found for the expensive rook squares a1 and h1 (0, 7) with less than 12 bits, which would allow to half the plain table size. Is there any prove they don't exist - even with Black Magic?
 * Best Magics so far

=See also=
 * BMI2 - PEXT Bitboards
 * Kindergarten Bitboards

=Publications=
 * Pradu Kannan (**2007**). //Magic Move-Bitboard Generation in Computer Chess//, as [|pdf]
 * Fritz Reul (**2009**). //New Architectures in Computer Chess//, Ph.D. Thesis, Chapter 3, Magic Hash Functions for Bitboards, [|pdf]
 * Maurizio Monge (**2011**). //[|On perfect hashing of numbers with sparse digit representation via multiplication by a constant]//. [|Discrete Applied Mathematics, Vol. 159, No. 11]

=Forum Posts=

2006

 * [|Fast(er) bitboard move generator] by Lasse Hansen, Winboard Forum, June 14, 2006
 * [|List of magics for bitboard move generation] by Pradu Kannan, Winboard Forum, August 23, 2006
 * [|Fastest Magic Move Bitboard Generator ready to use] by Pradu Kannan, Winboard Forum, August 25, 2006
 * [|A Faster Magic Move Bitboard Generator?] by Grant Osborne, Winboard Forum, December 15, 2006

2007

 * [|SEE with magic bitboards] by Pradu Kannan, Winboard Forum, January 24, 2007 » Static Exchange Evaluation
 * [|Magic bitboards, Java] by Sargon, CCC, August 19, 2007 » Java
 * [|BitBoard Tests Magic v Non-Rotated 32 Bits v 64 Bits] by Brian Richardson, CCC, August 24, 2007
 * [|Magic and precomputation] by Onno Garms, Winboard Forum, September 23, 2007

2008

 * [|Magic Move Generation] by Colin, CCC, February 18, 2008
 * [|How to reduce the "bits" used in a magic number] by Charlie Brune, CCC, May 24, 2008
 * [|magic move - bitboard orientation] by Frank Phillips, CCC, June 01, 2008
 * [|Incorporating the shift into the magic number] by Grant Osborne, CCC, June 18, 2008

2009

 * [|Magics revisited] by Richard Pijl, CCC, January 04, 2009
 * [|Magic with fixed shift] by Onno Garms, Winboard Forum, March 18, 2009
 * [|Paradigm shifts] by Gian-Carlo Pascutto, CCC, November 14, 2009
 * [|32-bit Magic experiments] by Andrew Fan, Winboard Forum, December 03, 2009 » Kindergarten Bitboards

2010 ...

 * [|Plain and fancy magic on modern hardware] by Robert Purves, CCC, August 22, 2010
 * [|Fixed shift magics with 800KB lookup table] by Volker Annuss, Winboard Forum, September, 05, 2010
 * [|Build magics on the fly] by Marco Costalba, CCC, June 07, 2011
 * [|Magic Multiplier Fundamentals] by Cheney Nattress, CCC, January 03, 2013
 * [|Occupancy Variations] by CDaley11, OpenChess Forum, January 25, 2013
 * [|Magic bitboards: Cache issues and state of the art?] by Samuel Siltanen, CCC, March 22, 2013
 * [|Understanding Magic Bitboards] by Jaco van Niekerk, Winboard Forum, April 03, 2013

2015 ...
> [|Re: understanding fixed shift fancy magic bitboard generation] by Volker Annuss, CCC, May 06, 2016
 * [|question about magic keys generation time] by Fermin Serrano, CCC, March 19, 2015 » Looking for Magics
 * [|Fancy magic bitboards question] by Eric VanderHelm, CCC, September 22, 2015
 * [|M42 - A C++ library for Bitboard attack mask generation] by Syed Fahad, CCC, April 30, 2016
 * [|understanding fixed shift fancy magic bitboard generation] by Kalyankumar Ramaseshan, CCC, May 05, 2016
 * [|Why should magic bitboards be sparse?] by Alessandro Power, CCC, December 21, 2016
 * [|Looking for dense magics] by Lucas Braesch, CCC, July 11, 2017
 * [|Black magic bitboards] by Volker Annuss, CCC, August 03, 2017 » Black Magic Bitboards
 * [|Disproving the existence of some magics] by Niklas Fiekas, CCC, September 16, 2017 » Looking for Magics
 * [|magic number comprising offset] by Eugene Kotlov, CCC, February 07, 2018
 * [|No bishop magics with fixed shift 8] by Niklas Fiekas, CCC, April 09, 2018 » Looking for Magics

=External Links=
 * [|Buzz - A Winboard Chess Playing Program by Pradu Kannan] - Source of Pradu Kannan's magic Move Generator
 * [|Rival Chess Engine - Magic Bitboards] by Russell Newman and Chris Moreton » Rival
 * [|Chess Programming | Generating Magic Multipliers] by Steve Maughan » Looking for Magics
 * [|chess - Sliding move generation using magic bitboard] - [|Stack Overflow], June 4, 2013
 * [|M42] by Syed Fahad
 * [|GitHub - goutham/magic-bits: Magic-bitboards for Chess] by Goutham Bhat

=Other Magic Stuff= > line-up: Nguyên Lê, [|Claudio Puntin], [|Steffen Schorn], [|Kristjan Randalu], [|Bodek Janke] > media type="youtube" key="oUDnf2fSIRQ" width="560" > Youn Sun Nah, Ulf Wakenius, Lars Danielsson, Vincent Peirani > media type="youtube" key="bOvB6gUpsuk"
 * [|Magic from Wikipedia]
 * [|Magic (illusion) from Wikipedia]
 * [|Magic (paranormal) from Wikipedia]
 * [|Magic (cryptography) from Wikipedia]
 * [|Magic number from Wikipedia]
 * [|Magic constant from Wikipedia]
 * [|Magic square from Wikipedia]
 * [|Magic hypercube from Wikipedia]
 * Five Senses - Magic Constant, [|Christmas Jazz Festival], [|Tallinn], December 10, 2011, [|YouTube] Video
 * Youn Sun Nah - [|Momento Magico] (2013), [|YouTube] Video

=References= =What links here?= include page="Magic Bitboards" component="backlinks" limit="200"
 * Up one Level**