SIMD techniques


The x86 MMX and SSE2 SIMD instruction sets provide a **Packed Move Mask Byte** instruction, pmovmskb, available in C/C++ for XMM registers as the //_mm_movemask_epi8// intrinsic. It copies the most significant bit of each byte of an MMX or XMM register to the lowest 8 or 16 bits of a general-purpose register. The instruction can therefore be used to map file or diagonal occupancies to consecutive bits.
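As an illustration of the file case, the following is a minimal sketch rather than code from this page. It assumes an x86-64 target, the usual little-endian rank-file mapping (a1 = 0, h8 = 63), and a hypothetical helper name `fileOccupancy`. Shifting the board left by `7 - file` moves every square of the chosen file into the sign bit of its rank byte, so pmovmskb can gather the eight squares into consecutive bits.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper (not part of the original page):
// returns the occupancy of one file as 8 consecutive bits, bit i = rank i.
// Assumes square = 8*rank + file and an x86-64 build.
unsigned fileOccupancy(uint64_t occ, unsigned file) {
    // Move the file's bit of every rank byte into that byte's sign bit (bit 7).
    // Bits from other files land in lower bit positions and are ignored.
    uint64_t shifted = occ << (7 - file);
    __m128i x = _mm_cvtsi64_si128(static_cast<long long>(shifted)); // 0:shifted
    // pmovmskb: collect the 16 sign bits; the upper 8 are zero here.
    return static_cast<unsigned>(_mm_movemask_epi8(x)) & 0xFFu;
}
```

For example, with pieces on a1, a3 and a8, `fileOccupancy(occ, 0)` sets bits 0, 2 and 7 of the result, regardless of what stands on the other files.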

=Bishop Attacks=
For diagonals, one may mask and compare byte-wise to move the occupancy into the sign bits. With SSE2 and 128-bit XMM registers, both the diagonal and anti-diagonal occupancies can be processed in one run:
code format="cpp"
U64 fillRightAttacks[8][64]; // [file][occupiedIndex]
__m128i xmmBmask[64];        // antidiagonal::diagonal - masks

U64 bishopAttacksSSE2(U64 occ, unsigned int sq) {
   __m128i mocc;                              // integer vector type, not __m128
   mocc = _mm_cvtsi64x_si128(occ);            // gp to xmm, 0:occ
   mocc = _mm_unpacklo_epi64(mocc, mocc);     // occupancy to both xmm-halves, occ:occ
   mocc = _mm_and_si128 (mocc, xmmBmask[sq]); // mask diagonal and antidiagonal
   mocc = _mm_cmpeq_epi8(mocc, xmmBmask[sq]); // cmp bytewise equal, FF if set, 00 otherwise
   unsigned int o = _mm_movemask_epi8(mocc);  // get the 16 sign bits
   // m128i_u64 is the MSVC-specific accessor for the two 64-bit halves
   return (xmmBmask[sq].m128i_u64[0] & fillRightAttacks[sq>>3][(o>>1)&63])
        | (xmmBmask[sq].m128i_u64[1] & fillRightAttacks[sq>>3][(o>>9)&63]);
}
code
This sample code uses a shared 4 KByte fill-right lookup, similar to fillUpAttacks of kindergarten bitboards. Alternatively, one may use distinct lookup tables, similar to rotated bitboards, indexed by square and occupied state, which makes the final mask ANDs unnecessary.
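The two tables used by the routine are not defined on this page. The following is a minimal sketch of one way they could be initialized, together with a portable variant of the routine and a slow ray-walking reference for cross-checking. It assumes an x86-64 target and the mapping a1 = 0, h8 = 63, takes the first table index to be the slider's rank (matching the `sq>>3` above), and excludes the slider's own square from the masks. The names `initBishopTables`, `diagMask`, `antiMask` and `bishopAttacksSlow` are hypothetical, and the scalar mask arrays stand in for the MSVC-specific `m128i_u64` access.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

typedef uint64_t U64;

static U64 fillRightAttacks[8][64]; // [slider rank][inner six line bits]
static U64 diagMask[64], antiMask[64]; // hypothetical scalar copies of the masks
static __m128i xmmBmask[64];        // high half: antidiagonal, low half: diagonal

void initBishopTables() {
    for (int pos = 0; pos < 8; ++pos) {
        for (int occ6 = 0; occ6 < 64; ++occ6) {
            unsigned line = static_cast<unsigned>(occ6) << 1; // bits 1..6 of the line
            U64 att = 0;
            for (int i = pos + 1; i < 8; ++i) {   // walk towards rank 8
                att |= 0xFFULL << (8 * i);        // fill the whole rank byte
                if (line & (1u << i)) break;      // stop at the first blocker
            }
            for (int i = pos - 1; i >= 0; --i) {  // walk towards rank 1
                att |= 0xFFULL << (8 * i);
                if (line & (1u << i)) break;
            }
            fillRightAttacks[pos][occ6] = att;
        }
    }
    for (int sq = 0; sq < 64; ++sq) {
        int r = sq >> 3, f = sq & 7;
        U64 d = 0, a = 0;
        for (int t = 0; t < 64; ++t) {
            if (t == sq) continue;                // masks exclude the slider square
            int tr = t >> 3, tf = t & 7;
            if (tr - tf == r - f) d |= 1ULL << t; // a1-h8 direction
            if (tr + tf == r + f) a |= 1ULL << t; // h1-a8 direction
        }
        diagMask[sq] = d;
        antiMask[sq] = a;
        xmmBmask[sq] = _mm_set_epi64x(static_cast<long long>(a),
                                      static_cast<long long>(d));
    }
}

U64 bishopAttacksSSE2(U64 occ, unsigned sq) {
    __m128i m = xmmBmask[sq];
    __m128i x = _mm_set1_epi64x(static_cast<long long>(occ)); // occ:occ
    x = _mm_and_si128(x, m);                       // keep both lines only
    x = _mm_cmpeq_epi8(x, m);                      // FF where the line square is occupied
    unsigned o = static_cast<unsigned>(_mm_movemask_epi8(x)); // 16 sign bits
    unsigned r = sq >> 3;
    return (diagMask[sq] & fillRightAttacks[r][(o >> 1) & 63])
         | (antiMask[sq] & fillRightAttacks[r][(o >> 9) & 63]);
}

// Slow reference: walk the four diagonal rays square by square.
U64 bishopAttacksSlow(U64 occ, unsigned sq) {
    static const int dr[4] = {1, 1, -1, -1};
    static const int df[4] = {1, -1, 1, -1};
    U64 att = 0;
    for (int k = 0; k < 4; ++k) {
        int r = static_cast<int>(sq >> 3) + dr[k];
        int f = static_cast<int>(sq & 7) + df[k];
        while (r >= 0 && r < 8 && f >= 0 && f < 8) {
            U64 b = 1ULL << (8 * r + f);
            att |= b;
            if (occ & b) break;                   // blocker is attacked, then stop
            r += dr[k];
            f += df[k];
        }
    }
    return att;
}
```

Rank bytes that contain no square of the line compare equal (both zero) and therefore show up as set bits in the mask. They lie beyond the ends of the line, so they only act as harmless outer blockers, and the final AND with the line mask removes any squares filled there. `initBishopTables()` has to be called once before the first lookup.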

=See also=
 * SSSE3 Version of Hyperbola Quintessence
 * Fill right with SSE2-instructions
 * SSE2-Wrapper in C++

=Forum Posts=
 * Re: Kindergarten bitboards without multiplying by Wylie Garvin, CCC, August 08, 2009

=External Links=
 * SIMD from Wikipedia
 * MSDN _mm_movemask_epi8
