Congruent Modulo Bitboards



**Congruent Modulo Bitboards** were introduced by Trevor Fenner and Mark Levene in the ICGA Journal, Vol. 31, No. 1, in 2008. While their Perfect Hashing approach provides great mathematical insight into [|Congruent Modulo] arithmetic, their final conclusion in comparison with Hashing Dictionaries, Rotated Bitboards and Magic Bitboards was criticized in light of the obvious comparison with Kindergarten Bitboards.

=Modulo vs. Multiplication=
BitScan broaches the issue of Perfect Hashing with Modulo versus Multiplication as well:
 * Bitscan by Modulo
 * De Bruijn Multiplication

So does SWAR-Popcount, in its final step of adding up the byte-wise populations:
 * Casting out
 * Multiplication
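The two bitscan variants listed above can be sketched side by side as follows. This is a minimal illustration, not taken from the paper: the modulus 67 and the De Bruijn constant `DEBRUIJN64` are assumptions (values commonly used for 64-bit bitscans), the function names are hypothetical, and both lookup tables are filled at startup by enumerating the 64 single-bit boards rather than being hard-coded.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t U64;

/* Assumed constant: a 64-bit De Bruijn sequence often used for bitscans. */
static const U64 DEBRUIJN64 = 0x03f79d71b4cb0a89ULL;

static int modTable[67]; /* indexed by (isolated bit) % 67           */
static int mulTable[64]; /* indexed by (isolated bit * magic) >> 58  */

/* Fill both perfect-hash tables from the 64 single-bit boards. */
void initBitscanTables(void) {
    for (int i = 0; i < 67; ++i) modTable[i] = -1;
    for (int i = 0; i < 64; ++i) mulTable[i] = -1;
    for (int sq = 0; sq < 64; ++sq) {
        U64 bit = 1ULL << sq;
        modTable[bit % 67] = sq;
        mulTable[(bit * DEBRUIJN64) >> 58] = sq;
    }
}

/* Index of the least significant one bit, hashed by modulo. */
int bitScanModulo(U64 bb) {
    assert(bb != 0);
    return modTable[(bb & (0 - bb)) % 67];
}

/* Index of the least significant one bit, hashed by multiplication. */
int bitScanDeBruijn(U64 bb) {
    assert(bb != 0);
    return mulTable[((bb & (0 - bb)) * DEBRUIJN64) >> 58];
}
```

After calling `initBitscanTables()` once, both functions should return the same square for any non-empty bitboard; the modulo version pays for a division (or its reciprocal multiplication), the De Bruijn version for one multiplication and a shift.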

Congruence relation
Fenner and Levene use masked lines (not necessarily excluding the sliding piece), that is, bitboards with N=8 active bits that are k={7,8,9} bits apart, starting with bit zero:
> math A_{kN} = \{0, k, 2k, \ldots, (N-1)k\} math

Based on the [|Congruence relation]
> math b \equiv c \pmod{m} math
or equivalently
> math b \bmod m = c \bmod m math

they deduced two general perfect hashing functions. The case N <= k with
> math h_{1}(a) = a \bmod (2^k + 2) math

and the case N <= k + 1 with
> math h_{2}(a) = a \bmod (2^{k+1} + 1) math

This results in modulo 514 for diagonals (k=9), modulo 257 for anti-diagonals (k=7), and modulo 258 for files (k=8) to calculate the occupied index. The tables could be made denser by storing indices, but that would require a second indirection. While Fenner and Levene used a 32-bit [|Matlab] implementation to conclude that their approach might be competitive, this is how it may be implemented in C by looking up pre-calculated attack bitboards:
code
U64 arrCmodDiaAttacks [514][64]; // 257 K
U64 arrCmodAntiAttacks[257][64]; // 128.5 K
U64 arrCmodFileAttacks[258][64]; // 129 K

// diashift[] and antishift[] hold per-square shift amounts that move the
// line of sq onto the masked line; they are assumed to be defined elsewhere.

U64 diagonalAttacks(U64 occ, enumSquare sq) {
   const U64 aDia = C64(0x8040201008040201); // bits 9 apart
   occ = ( (occ >> diashift[sq]) & aDia) % 514;
   return arrCmodDiaAttacks[occ][sq];
}

U64 antiDiagAttacks(U64 occ, enumSquare sq) {
   const U64 aAntiDiaShr7 = C64(0x0002040810204081); // bits 7 apart
   occ = ( (occ >> antishift[sq]) & aAntiDiaShr7 ) % 257;
   return arrCmodAntiAttacks[occ][sq];
}

U64 fileAttacks(U64 occ, enumSquare sq) {
   const U64 aFile = C64(0x0101010101010101); // bits 8 apart
   occ = ( (occ >> (sq&7)) & aFile) % 258;
   return arrCmodFileAttacks[occ][sq];
}
code
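That the three moduli really hash all 256 occupancies of a line without collisions can be checked by brute force. The following is a small sketch under stated assumptions: `spreadLine` and `isPerfectHash` are hypothetical helper names, and the check only covers lines that start at bit zero, as in the definition of A_{kN} with N=8.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t U64;

/* Spread the 8 bits of an occupancy byte onto a line whose bits are
   k apart: bit i of occ8 goes to bit i*k (a subset of A_{kN}, N = 8). */
U64 spreadLine(unsigned occ8, int k) {
    U64 line = 0;
    for (int i = 0; i < 8; ++i)
        if (occ8 & (1u << i))
            line |= 1ULL << (i * k);
    return line;
}

/* Returns 1 if line % m is distinct for all 256 subsets, else 0.
   Only moduli below 1024 are supported by this sketch. */
int isPerfectHash(int k, unsigned m) {
    unsigned char seen[1024];
    if (m == 0 || m > sizeof seen) return 0;
    memset(seen, 0, sizeof seen);
    for (unsigned occ8 = 0; occ8 < 256; ++occ8) {
        unsigned idx = (unsigned)(spreadLine(occ8, k) % m);
        if (seen[idx]) return 0; /* two occupancies share an index */
        seen[idx] = 1;
    }
    return 1;
}
```

`isPerfectHash(9, 514)`, `isPerfectHash(7, 257)` and `isPerfectHash(8, 258)` should all return 1. The reason is that 2^k is congruent to -2 modulo 2^k + 2, so the bit weights become 1, -2, 4, -8, ... and every subset sum is distinct; modulo 257 the weights for k=7 are a mix of positive and negative powers of two with the same effect.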

Casting out 255
For ranks, diagonals or anti-diagonals, where the occupancy mask excludes the sliding piece and the rank- or byte-wise sum of disjoint bits is therefore less than 255, casting out 256-1 works as well. It requires no shifts and offers more space-saving options for the lookup table, i.e. similar to Kindergarten Bitboards with shared multiples of first rank attacks and a trailing post-mask with the same line.
code
masked occupancy     %  256-1            =  A-H
. . . . . . . H         . . . . . . . .     . . . . . . . .
. . . . . . G .         . . . . . . . .     . . . . . . . .
. . . . . F . .         . . . . . . . .     . . . . . . . .
. . . . E . . .         . . . . . . . .     . . . . . . . .
. . . . . . . .      %  . . . . . . . .  =  . . . . . . . .
. . C . . . . .         . . . . . . . .     . . . . . . . .
. B . . . . . .         . . . . . . . .     . . . . . . . .
A . . . . . . .         1 1 1 1 1 1 1 1     A B C . E F G H
code
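Casting out 255 can be sketched in a few lines of C. The names `castOut255`, `DIAG_A1H8` and `RANK_4` are hypothetical and only serve this illustration; the mask passed in is assumed to be the line of the slider with the slider's own square removed, which is what keeps the byte sum below 255.

```c
#include <stdint.h>

typedef uint64_t U64;

/* Two example lines with one bit per file. */
static const U64 DIAG_A1H8 = 0x8040201008040201ULL; /* a1-h8 diagonal */
static const U64 RANK_4    = 0x00000000ff000000ULL; /* fourth rank    */

/* Occupancy index by casting out 255. Since 256 is congruent to 1
   modulo 255, the remainder equals the sum of the eight bytes, which
   is the file-wise occupancy as long as that sum stays below 255. */
unsigned castOut255(U64 occ, U64 lineMaskExSq) {
    return (unsigned)((occ & lineMaskExSq) % 255);
}
```

With a slider on d4 (square 27) and blockers on a1, c3 and f6, `castOut255(occ, DIAG_A1H8 & ~(1ULL << 27))` yields 1 + 4 + 32 = 37, that is bits A, C and F of the index; no shift is needed regardless of which rank or diagonal the slider is on.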

Reciprocal Multiplication
The 64-bit modulo by a constant can be done most efficiently by reciprocal fixed-point multiplication. This is how the Microsoft [|Visual C++] 2005 compiler implements the modulo by a constant for x86-64 processors: one 64*64=128-bit multiplication, one shift, one further 32-bit multiplication, and one subtraction. Of course, using 64-bit division to get the remainder burns even more cycles.
code
; % 514
mov   r11d, r10d ; masked diagonal
mov   rax, ff00ff00ff00ff01H
mul   r10
shr   rdx, 9
imul  edx, 514 ; 00000202H
sub   r11d, edx

; % 257
mov   r11d, r10d ; masked anti-diagonal
mov   rax, ff00ff00ff00ff01H
mul   r10
shr   rdx, 8
imul  edx, 257 ; 00000101H
sub   r11d, edx
code
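The same sequence can be written in portable C to see what the compiler output computes. This is a sketch, not the compiler's code: `mulHigh64`, `mod514` and `mod257` are hypothetical names, and the high half of the 128-bit product is assembled from 32-bit pieces so that no compiler-specific 128-bit type is assumed. The constant 0xff00ff00ff00ff01 is 2^72 / 257 rounded up, so the upper product half shifted right by 8 gives the quotient by 257, and shifted right by 9 the quotient by 514.

```c
#include <stdint.h>

typedef uint64_t U64;

/* Upper 64 bits of the 128-bit product a * b. */
U64 mulHigh64(U64 a, U64 b) {
    U64 aLo = a & 0xffffffffULL, aHi = a >> 32;
    U64 bLo = b & 0xffffffffULL, bHi = b >> 32;
    U64 loLo = aLo * bLo;
    U64 hiLo = aHi * bLo;
    U64 loHi = aLo * bHi;
    U64 hiHi = aHi * bHi;
    /* sum of the three terms that overlap bits 32..63, to get the carry */
    U64 mid = (loLo >> 32) + (hiLo & 0xffffffffULL) + (loHi & 0xffffffffULL);
    return hiHi + (hiLo >> 32) + (loHi >> 32) + (mid >> 32);
}

static const U64 RECIPROCAL_257 = 0xff00ff00ff00ff01ULL; /* ceil(2^72 / 257) */

/* x % 514: quotient by multiply and shift, then multiply back and subtract. */
unsigned mod514(U64 x) {
    U64 q = mulHigh64(x, RECIPROCAL_257) >> 9;
    return (unsigned)(x - q * 514);
}

/* x % 257, same scheme with a shift by 8. */
unsigned mod257(U64 x) {
    U64 q = mulHigh64(x, RECIPROCAL_257) >> 8;
    return (unsigned)(x - q * 257);
}
```

Both functions should agree with the built-in `%` operator for every 64-bit input; the multiply-back and subtract steps only need the low 32 bits, which is why the assembly above uses `imul edx` and `sub r11d`.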

Multiplication
A Kindergarten-like approach might look like this (not considering the inner six bits):
code
U64 arrDiagonalAttacks[256][64]; // 128 K

U64 diagonalAttacks(U64 occ, enumSquare sq) {
   occ = (diagonalMask[sq] & occ) * C64(0x0101010101010101) >> 56;
   return arrDiagonalAttacks[occ][sq];
}
code
and uses one 64*64=64-bit multiplication, with this x86-64 assembly for calculating an eight-bit occupied index:
code
mov   rax, 0101010101010101H
imul  rdx, rax
shr   rdx, 56
code
Even Kindergarten File-Attacks are cheaper and faster, not to mention Magic Bitboards, which cover two lines of a rook or bishop in one run.
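To compare the multiplication with casting out 255 directly, both index calculations can be put next to each other. This is only an illustrative sketch with hypothetical function names; it assumes the input is already masked to a single rank, diagonal or anti-diagonal, so that each file holds at most one bit.

```c
#include <stdint.h>

typedef uint64_t U64;

/* Kindergarten-style index: the multiplication by the a-file adds up
   all eight bytes in the top byte, which is then shifted down. */
unsigned indexByMultiply(U64 maskedLine) {
    return (unsigned)((maskedLine * 0x0101010101010101ULL) >> 56);
}

/* Same byte sum taken modulo 255 (casting out 255). */
unsigned indexByMod255(U64 maskedLine) {
    return (unsigned)(maskedLine % 255);
}
```

The two functions agree whenever at least one square of the line is empty or masked out. They differ for a fully occupied line, where the byte sum is 255: the multiplication returns 255, the modulo wraps to 0, which is why the casting-out variant relies on the mask excluding the sliding piece.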

=Fenner's and Levene's conclusion=
Quote from their paper, p. 11:
>

Their conclusion was based on the following statement of Robert Hyatt **...**
>

**...** and this claim of Sam Tannous:

=Publications=
 * [|Zbigniew J. Czech], [|George Havas], [|Bohdan S. Majewski] (**1997**). //Perfect Hashing//. Theoretical Computer Science, Vol. 182, Nos. 1-2, pp. 1-143
 * Trevor Fenner, Mark Levene (**2008**). //Move Generation with Perfect Hashing Functions.// ICGA Journal, Vol. 31, No. 1, pp. 3-12. [|pdf]

=Forum Posts=
 * [|Nice Math - Strange Conclusions] by Gerd Isenberg, CCC, April 29, 2008
 * [|Low memory usage attack bitboard generation] by crystalclear, Winboard Forum, October 06, 2011

=External Links=
 * [|Congruence relation from Wikipedia]
 * [|Linear congruence theorem from Wikipedia]
 * [|Modular arithmetic from Wikipedia]
 * [|Modulo operation from Wikipedia]
