Older Version Newer Version

GerdIsenberg GerdIsenberg Jun 26, 2013

[[toc]]
**[[Home]] * [[Board Representation]] * [[Bitboards]] * [[Sliding Piece Attacks]] * Congruent Modulo Bitboards**

**Congruent Modulo Bitboards** was introduced by [[Trevor Fenner]] and [[Mark Levene]] in the [[ICGA Journal#31_1|ICGA Journal, Vol. 31, No. 1]] in 2008 <ref>[[Trevor Fenner]], [[Mark Levene]] (**2008**). //Move Generation with Perfect Hashing Functions.// [[ICGA Journal#31_1|ICGA Journal, Vol. 31, No. 1]], pp. 3-12. [[http://www.dcs.bbk.ac.uk/~mark/download/bitboard_sliding_icga_final.pdf|pdf]]</ref>. While their [[Hash Table|Perfect Hashing]] approach provides great mathematical insights in [[http://en.wikipedia.org/wiki/Modulo|Congruent Modulo]] arithmetic, their final conclusion in comparison with [[Hashing Dictionaries]], [[Rotated Bitboards]] and [[Magic Bitboards]] was criticized by the obvious comparison with [[Kindergarten Bitboards]] <ref>[[http://www.talkchess.com/forum/viewtopic.php?t=20913|Nice Math - Strange Conclusions]] by [[Gerd Isenberg]], [[CCC]], April 29, 2008</ref>.

=Modulo vs. Multiplication= 
[[BitScan]] broaches the issue of [[Hash Table|Perfect Hashing]] with [[General Setwise Operations#Modulo|Modulo]] versus [[General Setwise Operations#Multiplication|Multiplication]] as well:
* [[BitScan#BitscanByModulo|Bitscan by Modulo]]
* [[BitScan#DeBruijnMultiplation|De Bruijn Multiplication]]

So does the [[Population Count#SWARPopcount|SWAR-Popcount]], when it is about to finally add byte-wise populations:
* [[Population Count#Castingout|Casting out]]
* [[Population Count#Multiplication|Multiplication]]

==Modulo==
===Congruence relation=== 
Fenner and Levene use masked lines (not necessarily excluding the sliding piece), that is bitboards with N=8 active bits with k={7,8,9} bits apart, starting with bit zero
> [[math]]
A_{kN} = \{0, k, 2k, ...,(N-1)k\}
[[math]]

Based on [[http://en.wikipedia.org/wiki/Congruence_relation|Congruence relation]] 
> [[math]]
b \equiv c \pmod{m}
[[math]]
or equivalently
> [[math]]
b \mod{m} = c \mod{m}
[[math]]

they deduced two general perfect hashing functions. The case N <= k with
> [[math]]
h_{1}(a) = a \mod (2^k + 2)
[[math]]

and the case N <= k + 1
> [[math]]
h_{2}(a) = a \mod (2^{k+1} + 1)
[[math]]

This results in modulo 514 for [[Diagonals|diagonals]], modulo 257 for [[Anti-Diagonals|anti-diagonals]], and modulo 258 for [[Files|files]], to calculate the occupied index. Tables could made denser by storing indices, but that would require a second indirection. While Fenner and Levene used a [[http://en.wikipedia.org/wiki/MATLAB|Matlab]] 32-bit implementation to conclude their approach might be competitive, this is how it may be implemented in [[C]] by looking up pre-calculated attack-bitboards:
[[code]]
U64 arrCmodDiaAttacks [514][64];  // 257 K
U64 arrCmodAntiAttacks[257][64];
U64 arrCmodFileAttacks[258][64];

U64 diagonalAttacks(U64 occ, enumSquare sq) {
   const U64 aDia = C64(0x8040201008040201);
   occ = ( (occ >> diashift[sq]) & aDia) % 514;
   return arrCmodDiaAttacks[occ][sq];
}

U64 antiDiagAttacks(U64 occ, enumSquare sq) {
   const U64 aAntiDiaShr7 = C64(0x0002040810204081);
   occ = ( (occ >> antishift[sq]) & aAntiDiaShr7 ) % 257;
   return arrCmodAntiAttacks[occ][sq];
}

U64 fileAttacks(U64 occ, enumSquare sq) {
   const U64 aFile = C64(0x0101010101010101);
   occ = ( (occ >> (sq&7)) & aFile) % 258;
   return arrCmodFileAttacks[occ][sq];
}
[[code]]
[[#Castingout255]]
===Casting out 255===
For ranks, diagonals or anti-diagonals, where the occupancy mask excludes the sliding piece, and the rank-or byte-wise sum of disjoint bits is therefor less than 255, [[Population Count#Castingout|Casting out]] 256-1 works as well, without any shifts required, and with more space saving options for the lookup table, i. e. similar to Kindergarten Bitboards with shared multiples of first rank attacks and an trailing post-mask with the same line <ref>[[http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=51996|Low memory usage attack bitboard generation]] by crystalclear, [[Computer Chess Forums|Winboard Forum]], October 06, 2011</ref>.
[[code]]
masked occupany  %  256-1            =  A-H    
. . . . . . . H     . . . . . . . .     . . . . . . . .
. . . . . . G .     . . . . . . . .     . . . . . . . . 
. . . . . F . .     . . . . . . . .     . . . . . . . . 
. . . . E . . .     . . . . . . . .     . . . . . . . . 
. . . . . . . .  %  . . . . . . . .  =  . . . . . . . . 
. . C . . . . .     . . . . . . . .     . . . . . . . . 
. B . . . . . .     . . . . . . . .     . . . . . . . . 
A . . . . . . .     1 1 1 1 1 1 1 1     A B C . E F G H 
[[code]]
===Reciprocal Multiplication===
The 64-bit modulo by a constant can be done most efficiently by [[General Setwise Operations#Modulo|reciprocal fixed point multiplication]], this is how [[Microsoft]] [[http://en.wikipedia.org/wiki/Visual_C%2B%2B|Visual C++]] 2005 compiler implements the mod constant for [[x86-64]] processors. One 64*64=128 bit multiplication, one shift, one further 32-bit multiplication , one subtraction. Of course using 64-bit [[General Setwise Operations#Division|division]] to get the remainder burns even more cycles.
[[code]]
Code:
% 514
 mov    r11d, r10 ; masked diagonal
 mov    rax, ff00ff00ff00ff01H
 mul    r10
 shr    rdx, 9
 imul   edx, 514 ; 00000202H
 sub    r11d, edx

% 257
 mov    r11d, r10 ; masked diagonal
 mov    rax, ff00ff00ff00ff01H
 mul    r10
 shr    rdx, 8
 imul   edx, 257 ; 00000101H
 sub    r11d, edx
[[code]]
==Multiplication== 
A Kindergarten like approach might look like this (not considering [[First Rank Attacks#TheOuterSquares|inner six bits]]):
[[code]]
U64 arrDiagonalAttacks[256][64];  // 128 K

U64 diagonalAttacks(U64 occ, enumSquare sq) {
   occ = (diagonalMask[sq] & occ) * C64(0x0101010101010101) >> 56;
   return arrDiagonalAttacks[occ][sq];
}
[[code]]
and uses one 64*64=64-bit multiplication, with this [[x86-64]] assembly for calculating an eight-bit occupied index:
[[code]]
 mov    rax, 0101010101010101H
 imul   rdx, rax
 shr    rdx, 56
[[code]]
Even [[Kindergarten Bitboards#FileAttacks|Kindergarten File-Attacks]] are cheaper and faster, not to mention [[Magic Bitboards]], which covers two lines of a rook or bishop in one run.

=Fenner's and Levene's conclusion= 
Quote from their paper pp 11
**{{3.5. Comparison with other Methods}}**
> {{As reported in Hyatt (2007), the rotated and magic bitboard methods are of comparable performance, and Tannous (2007) claims just a small improvement of the direct lookup method over rotated bitboards. It is easy to see that, in terms of the number of computer operations, the efficiency of our method will be similar to that of direct lookup. Thus we are justified in claiming that the computational efficiency of our method is comparable to the others.}}

Their conclusion was based on following statement of [[Robert Hyatt]] <ref>[[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=140141&t=16002|Re: BitBoard Tests Magic v Non-Rotated 32 Bits v 64 Bits]] by [[Robert Hyatt]], [[CCC]] August 25, 2007</ref> **...**
> {{I had reported this earlier. Magic was no faster than rotated. I switched because of two things...}}

> {{1. Magic is simpler, and simpler is better as I get older.}}
> {{2. Magic gives you the opportunity to update the occupied_squares and then generate moves easily.}}

> {{To do this with rotated bitboards first requires that all rotated bitboards be updated in addition to the normal occupied_squares bitboard. This is faster, if you use the feature (I don't yet, but well might at times).}}

**...** and this claim of [[Sam Tannous]] <ref>[[Sam Tannous]] (**2007**). //Avoiding Rotated Bitboards with Direct Lookup//. [[ICGA Journal#30_2|ICGA Journal, Vol. 30, No. 2]], pp. 85-91, [[http://arxiv.org/PS_cache/arxiv/pdf/0704/0704.3773v2.pdf|pdf]]</ref>:
> {{The results shown indicate that directly looking up the attacking moves for sliding pieces in hash tables improves the move generation speeds from 10% to 15% depending on computer architecture. Further efficiencies can be expected in a full implementation where the overhead of maintaining rotated bitboards is eliminated. The implementation and test code is made available in an Open-Source, interactive, chess programming module called “Shatranj” (Tannous, 2006).}}

=Publications=
* [[http://sun.aei.polsl.pl/~zjc/|Zbigniew J. Czech]], [[http://itee.uq.edu.au/~havas/|George Havas]], [[http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/m/Majewski:Bohdan_S=.html|Bohdan S. Majewski]] (**1997**). //Perfect Hashing//. Theoretical Computer Science, Vol. 182, Nos. 1-2, pp. 1-143
* [[Trevor Fenner]], [[Mark Levene]] (**2008**). //Move Generation with Perfect Hashing Functions.// [[ICGA Journal#31_1|ICGA Journal, Vol. 31, No. 1]], pp. 3-12. [[http://www.dcs.bbk.ac.uk/~mark/download/bitboard_sliding_icga_final.pdf|pdf]]

=Forum Posts=
* [[http://www.talkchess.com/forum/viewtopic.php?t=20913|Nice Math - Strange Conclusions]] by [[Gerd Isenberg]], [[CCC]], April 29, 2008
* [[http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=51996|Low memory usage attack bitboard generation]] by crystalclear, [[Computer Chess Forums|Winboard Forum]], October 06, 2011

=External Links=
* [[http://en.wikipedia.org/wiki/Congruence_relation|Congruence relation from Wikipedia]]
* [[http://en.wikipedia.org/wiki/Linear_congruence_theorem|Linear congruence theorem from Wikipedia]]
* [[http://en.wikipedia.org/wiki/Modular_arithmetic|Modular arithmetic from Wikipedia]]
* [[http://en.wikipedia.org/wiki/Modulo_operation|Modulo operation from Wikipedia]]

=References= 
<references />
=What links here?= 
[[include page="Congruent Modulo Bitboards" component="backlinks" limit="10" ]]
**[[Sliding Piece Attacks|Up one Level]]**