Zobrist Hashing

[[image:King_Wen_(I_Ching).svg.png width="280" link="https://en.wikipedia.org/wiki/John_Cage#Chance"]]
[|King Wen sequence]

**Zobrist Hashing** is a technique to transform a board position of arbitrary size into a number of a set length, with an equal distribution over all possible numbers, invented by Albert Zobrist. In an early Usenet post in 1982, Tom Truscott mentioned Jim Gillogly's n-bit hashing technique. Gillogly apparently read Zobrist's paper early, and credits Zobrist in a 1997 rgcc post. Zobrist Hashing is an instance of [|tabulation hashing], a method for constructing [|universal families of hash functions] by combining [|table lookup] with exclusive or operations. Zobrist Hashing was rediscovered by J. Lawrence Carter and Mark N. Wegman in 1977 and studied in more detail by Mihai Pătrașcu and Mikkel Thorup in 2011.

The main purpose of Zobrist hash codes in chess programming is to get an almost unique index number for any chess position, with the very important requirement that two similar positions generate entirely different indices. These index numbers are used for faster and more space efficient hash tables or databases, e.g. transposition tables and opening books.

=Metamorphosis=
[[image:Metamorphosis_II_1.jpg width="640" link="http://www.mcescher.com/Gallery/gallery-recogn.htm"]]
M. C. Escher, [|Metamorphosis] III, 1967-1968

=Initialization=
At program initialization, we generate an array of pseudorandom numbers:
 * One number for each piece at each square
 * One number to indicate the side to move is black
 * Four numbers to indicate the castling rights, though usually 16 (2^4) are used for speed
 * Eight numbers to indicate the file of a valid En passant square, if any

This leaves us with an array of 781 (12*64 + 1 + 4 + 8) random numbers. Since pawns never stand on the first or eighth rank, their entries for those squares are unused, so one might be fine with 12*64 numbers by reusing those entries for the remaining keys. There are even proposals and implementations that use overlapping keys from unaligned access, down to an array of only 12 numbers, one for every piece, where that number is rotated by square.
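The 781-entry layout described above can be sketched as one flat array with index helpers. This is only an illustration: the ordering of the entries, the names `piece_index`, `SIDE_OFFSET`, `CASTLE_OFFSET` and `EP_OFFSET`, the a1 = 0 square numbering, and the use of Python's standard `random.Random` with a fixed seed are all assumptions, not something the text prescribes.

```python
import random

PIECES = "PNBRQKpnbrqk"   # 12 piece kinds: white first, then black (assumed order)
NUM_SQUARES = 64          # squares numbered 0..63, a1 = 0 (assumed convention)

# Offsets into one flat key array (hypothetical layout).
SIDE_OFFSET = 12 * NUM_SQUARES    # 768: one key for "black to move"
CASTLE_OFFSET = SIDE_OFFSET + 1   # 769..772: four castling rights
EP_OFFSET = CASTLE_OFFSET + 4     # 773..780: eight en passant files
NUM_KEYS = EP_OFFSET + 8          # 781 keys in total


def piece_index(piece, square):
    """Index of the key for a given piece letter on a given square."""
    return PIECES.index(piece) * NUM_SQUARES + square


def make_keys(seed=2024):
    """Fill the whole array with 64-bit pseudorandom numbers."""
    rng = random.Random(seed)   # fixed seed, so the keys are reproducible
    return [rng.getrandbits(64) for _ in range(NUM_KEYS)]


ZOBRIST = make_keys()
```

With this layout, `ZOBRIST[piece_index("N", 1)]` would be the key for a White Knight on b1, and `ZOBRIST[SIDE_OFFSET]` the key for Black to move.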

Programs usually implement their own pseudorandom number generator (PRNG), both for better quality random numbers than standard library functions provide, and for reproducibility. This means that whatever platform the program is run on, it will use the exact same set of Zobrist keys. This is also useful for things like opening books, where the positions in the book can be stored by hash key and be used portably across machines, as long as endianness is taken into account.
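As a minimal sketch of such a self-contained generator, the following uses SplitMix64. That choice is an assumption made for illustration; the text does not name a particular PRNG, and engines use many different ones. The helper names `generate_keys` and `key_to_bytes` are hypothetical.

```python
MASK64 = (1 << 64) - 1   # keep every intermediate result within 64 bits


class SplitMix64:
    """SplitMix64 generator: same seed, same sequence on every platform."""

    def __init__(self, seed):
        self.state = seed & MASK64

    def next(self):
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)


def generate_keys(seed, count=781):
    """Produce `count` reproducible 64-bit Zobrist keys from one seed."""
    rng = SplitMix64(seed)
    return [rng.next() for _ in range(count)]


def key_to_bytes(key):
    """Serialize a key with a fixed byte order (little-endian chosen here),
    so that a book written on one machine reads back identically on another."""
    return key.to_bytes(8, "little")
```

Fixing the byte order in the file format, rather than dumping the in-memory representation, is one way to handle the endianness concern mentioned above.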

=Runtime=
If we now want to get the Zobrist hash code of a certain position, we initialize the hash key by xoring all random numbers linked to the features present in that position, e.g. for the starting position:
code
[Hash for White Rook on a1] xor [White Knight on b1] xor [White Bishop on c1] xor ... ( all pieces ) ... xor [White castling short] xor [White castling long] xor ... ( all castling rights )
code
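Computing a key from scratch might look like the sketch below. The board representation (a dict from square number to piece letter, a1 = 0), the castling string `"KQkq"`, and all function and table names are assumptions made for the example, and the keys come from Python's `random` module purely for brevity.

```python
import random

rng = random.Random(1)   # fixed seed so the example is reproducible
PIECES = "PNBRQKpnbrqk"
PIECE_KEYS = {(p, sq): rng.getrandbits(64) for p in PIECES for sq in range(64)}
BLACK_TO_MOVE = rng.getrandbits(64)
CASTLE_KEYS = {right: rng.getrandbits(64) for right in "KQkq"}
EP_FILE_KEYS = [rng.getrandbits(64) for _ in range(8)]


def zobrist_hash(board, black_to_move, castling, ep_file=None):
    """Xor together the keys of every feature present in the position."""
    h = 0
    for sq, piece in board.items():
        h ^= PIECE_KEYS[(piece, sq)]
    if black_to_move:
        h ^= BLACK_TO_MOVE
    for right in castling:            # e.g. "KQkq", one key per right
        h ^= CASTLE_KEYS[right]
    if ep_file is not None:           # file 0..7 of a valid en passant square
        h ^= EP_FILE_KEYS[ep_file]
    return h


def starting_position():
    """Standard initial setup as a square -> piece dict."""
    back_rank = "RNBQKBNR"
    board = {}
    for f in range(8):
        board[f] = back_rank[f]               # white pieces on rank 1
        board[8 + f] = "P"                    # white pawns on rank 2
        board[48 + f] = "p"                   # black pawns on rank 7
        board[56 + f] = back_rank[f].lower()  # black pieces on rank 8
    return board


START_HASH = zobrist_hash(starting_position(), False, "KQkq")
```

Because xor is commutative and associative, the order in which the features are visited does not affect the result.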

Because the xor operation is its [|own inverse], it can be undone by applying the same xor operation again. Chess engines often exploit this property: it allows a fast incremental update of the hash key during make or unmake of moves.

E.g., for a White Knight that jumps from b1 to c3 capturing a Black Bishop, these operations are performed:
code
[Original Hash of position]
xor [Hash for White Knight on b1] ( removing the knight from b1 )
xor [Hash for Black Bishop on c3] ( removing the captured bishop from c3 )
xor [Hash for White Knight on c3] ( placing the knight on the new square )
xor [Hash for Black to move] ( change sides )
code
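The knight capture above can be checked against a full recomputation with a small sketch. It deliberately ignores castling rights and en passant, and the helper names (`square`, `full_hash`, `update_hash`) and the tiny four-piece test position are assumptions made for the illustration.

```python
import random

rng = random.Random(7)   # fixed seed so the example is reproducible
PIECES = "PNBRQKpnbrqk"
PIECE_KEYS = {(p, sq): rng.getrandbits(64) for p in PIECES for sq in range(64)}
BLACK_TO_MOVE = rng.getrandbits(64)


def square(name):
    """Convert a name such as 'c3' to a square number with a1 = 0."""
    return (int(name[1]) - 1) * 8 + (ord(name[0]) - ord("a"))


def full_hash(board, black_to_move):
    """Hash from scratch: pieces plus side to move only."""
    h = BLACK_TO_MOVE if black_to_move else 0
    for sq, piece in board.items():
        h ^= PIECE_KEYS[(piece, sq)]
    return h


def update_hash(h, piece, src, dst, captured=None):
    """Incremental update for a simple move or capture."""
    h ^= PIECE_KEYS[(piece, src)]         # remove the piece from its origin
    if captured is not None:
        h ^= PIECE_KEYS[(captured, dst)]  # remove the captured piece
    h ^= PIECE_KEYS[(piece, dst)]         # place the piece on the target
    h ^= BLACK_TO_MOVE                    # change sides
    return h


before = {square("b1"): "N", square("c3"): "b", square("e1"): "K", square("e8"): "k"}
after = {square("c3"): "N", square("e1"): "K", square("e8"): "k"}

h0 = full_hash(before, False)
h1 = update_hash(h0, "N", square("b1"), square("c3"), captured="b")
```

Calling `update_hash` a second time with the same arguments restores `h0`, which is how unmake can reuse the same code path as make.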

=Collisions=
Key collisions or type-1 errors are inherent in using Zobrist keys with far fewer bits than required to encode all reachable chess positions.

Theory
An important issue is the question of what size the hash keys should have. Smaller hash keys are faster and more space efficient, while larger ones reduce the risk of a hash collision. A collision occurs if two positions map to the same key. The dangers of collisions were assessed by Robert Hyatt and Anthony Cozzie in their paper //The Effect of Hash Signature Collisions in a Chess Program//. Modern chess programs usually use 64 bits as the standard key size.

Hash collisions demonstrate the [|birthday "paradox"], which is to say that a collision becomes likely once the number of hashed positions reaches around the **square root** of the number of possible keys, contrary to some people's expectations. With a 32 bit hash, the chance of at least one collision is already roughly 39% after sqrt(2 ^ 32) == 2 ^ 16 or around 65 thousand positions, and it rises quickly beyond that point. With a 64 bit hash, the same holds after about 2 ^ 32 or 4 billion positions.
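The figures above can be reproduced with the usual birthday approximation, p ≈ 1 - exp(-n(n-1) / 2N) for n positions and N possible keys. The sketch assumes ideally distributed keys; the function name `collision_probability` is made up for the example.

```python
import math


def collision_probability(num_positions, key_bits):
    """Approximate chance that at least two of `num_positions` uniformly
    random keys of `key_bits` bits coincide (birthday approximation)."""
    n = num_positions
    space = 2.0 ** key_bits
    # 1 - exp(-x), computed with expm1 so tiny probabilities stay accurate
    return -math.expm1(-n * (n - 1) / (2.0 * space))


p32 = collision_probability(2 ** 16, 32)   # 32 bit keys, ~65 thousand positions
p64 = collision_probability(2 ** 32, 64)   # 64 bit keys, ~4 billion positions
```

Both values come out at about 0.39, and doubling the number of positions to twice the square root already pushes the estimate above 0.86.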

Praxis
Post by Jonathan Schaeffer :

Lack a True Integer Type
Some languages (such as JavaScript and [|Lua]) only have a 64-bit floating point "Number" type. In JavaScript, values of this type are converted to 32 bit integers when bitwise operators are applied. One way to get a 64 bit hash is to use two 32 bit numbers in parallel, as Garbochess-JS does. Another, which p4wn used at one stage, is to use 47 or 48 bit **additive** hashes. 64 bit floating point numbers represent integers exactly up to 53 bits, so it is possible to sum at least 32 (and on average close to 64) random 48 bit numbers without losing precision, which was enough for p4wn's purposes. For additive Zobrist hashing, you add the number when placing a piece and subtract it when removing it, rather than using xor both ways. There is no difference in accuracy or speed, and 48 bit hashes give you collisions at around the 2 ^ 24 or 16 million point.
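The additive scheme can be imitated with Python floats, which are the same IEEE 754 doubles that JavaScript uses for its Number type. This is only a sketch of the idea, not p4wn's actual code; the names `add_piece` and `remove_piece` and the choice of exactly 32 keys are assumptions.

```python
import random

KEY_BITS = 48
MAX_EXACT = 2.0 ** 53    # doubles represent every integer below this exactly

rng = random.Random(3)   # fixed seed so the example is reproducible
# 32 keys of 48 bits each: their sum stays below 32 * 2**48 == 2**53.
KEYS = [float(rng.getrandbits(KEY_BITS)) for _ in range(32)]


def add_piece(h, key):
    """Place a piece: add its key (instead of xor)."""
    return h + key


def remove_piece(h, key):
    """Remove a piece: subtract its key (instead of xor)."""
    return h - key


h = 0.0
for k in KEYS:
    h = add_piece(h, k)
full = h

# Removing and re-adding a piece restores the hash exactly,
# because every intermediate value is an integer below 2**53.
h = remove_piece(h, KEYS[5])
h = add_piece(h, KEYS[5])
```

The worst-case bound of 32 keys follows from 32 * 2^48 = 2^53; with random keys the typical sum is about half of that, which is why roughly twice as many keys fit on average.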

Linear Independence
The minimum and average Hamming distance over all Zobrist keys was often considered a "quality" measure of the keys. However, maximizing the minimal Hamming distance leads to very poor Zobrist keys. As long as the minimum Hamming distance is greater than zero, [|linear independence] (that is, no small subset of the keys xors to zero) is much more important than Hamming distance, as explained by Sven Reichard :



code
x1^x2^...^xm = y1^y2^...^yn for codes xi, yi and small number m and n, and xi not equal to yj
code

code
x1 + x2 + ... + xm = y1 + y2 + ... + yn
code

code
x1 + x2 + ... + xm + y1 + y2 + ... + yn = 0
code

code
(1 1)
(0 1)
code

code
11110000
11001100
00111100
code
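One way to read the three 8-bit codes above: every pair of them has Hamming distance 4, yet all three xor to zero, so a good Hamming distance does not rule out a small dependent subset. The sketch below checks this and computes the rank of a key set over GF(2) by Gaussian elimination on the leading bit. The function names are made up for the example, and the interpretation of the codes is an assumption drawn from the surrounding text.

```python
from itertools import combinations


def hamming(a, b):
    """Number of bit positions in which two keys differ."""
    return bin(a ^ b).count("1")


def gf2_rank(keys):
    """Rank of the keys viewed as vectors over GF(2), where xor is addition.
    A rank below len(keys) means some subset of the keys xors to zero."""
    basis = {}                       # leading bit position -> basis vector
    for k in keys:
        while k:
            top = k.bit_length() - 1
            if top not in basis:
                basis[top] = k       # new independent vector
                break
            k ^= basis[top]          # eliminate the leading bit and retry
    return len(basis)


codes = [0b11110000, 0b11001100, 0b00111100]
distances = [hamming(a, b) for a, b in combinations(codes, 2)]
xor_of_all = codes[0] ^ codes[1] ^ codes[2]
rank = gf2_rank(codes)
```

Here `distances` is `[4, 4, 4]` while `xor_of_all` is 0 and `rank` is 2, so two positions that differ exactly in these three features would receive the same hash key.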


=See also=
 * CPW-Engine_transposition
 * BCH Hashing

=Publications=
 * Albert Zobrist (**1970**). //A New Hashing Method with Application for Game Playing//. Technical Report #88, Computer Science Department, The University of Wisconsin, Madison, WI, USA. Reprinted (1990) in ICCA Journal, Vol. 13, No. 2, [|pdf]
 * J. Lawrence Carter, Mark N. Wegman (**1977**). //[|Universal classes of hash functions]//. [|STOC '77]
 * Robert Hyatt, Anthony Cozzie (**2005**). //[|The Effect of Hash Signature Collisions in a Chess Program]//. ICGA Journal, Vol. 28, No. 3
 * Borko Bošković, Sašo Greiner, Janez Brest, Viljem Žumer (**2005**). //[|The Representation of Chess Game]//. Proceedings of the 27th International Conference on Information Technology Interfaces
 * Mihai Pătrașcu, Mikkel Thorup (**2011**). //The Power of Simple Tabulation Hashing//. [|arXiv:1011.5200v2]

=Forum Posts=

1982 ...

 * [|compact representation of chess positions] by Tom Truscott, net.chess, January 7, 1982

1990 ...
 * [|Hash tables - Clash!!! What happens next?] by Valavan Manohararajah, rgc, March 15, 1994
> [|Re: Hash tables - Clash!!! What happens next?] by Jonathan Schaeffer, March 17, 1994
 * [|Collision probability] by Dennis Breuker, rgcc, April 15, 1996
 * [|Re: Berliner vs. Botvinnik Some interesting points] by Bradley C. Kuszmaul, rgcc, November 6, 1996
 * [|Re: Hashing function for board positions] by Jim Gillogly, rgcc, May 12, 1997
 * [|Fast hash algorithm] by John Scalo, CCC, January 08, 1998
 * [|Fast hash key method - Revisited!] by John Scalo, CCC, January 14, 1998
 * [|How to create a set of random integers for hashing?] by Ed Schröder, CCC, October 18, 1998

2000 ...

 * [|Why Random Number Needed In HashFunction[piece[position]]] by Cheok Yan Cheng, rgcc, June 12, 2001
 * [|About random numbers and hashing] by Severi Salminen, CCC, December 04, 2001
 * [|Random keys and hamming distance] by James Swafford, CCC, August 16, 2002
 * [|Hamming distance and lower hash table indexing] by Tom Likens, CCC, September 02, 2003
 * [|64-Bit random numbers] by Martin Schreiber, CCC, October 28, 2003
 * [|Is it necessary to include empty fields in the hash key of a position?] by Frank Hablizel, rgcc, December 25, 2003
 * [|Hashkey collisions (typical numbers)] by Renze Steenhuisen, CCC, April 07, 2004

2005 ...

 * [|Zobrist key random numbers] by Robert Hyatt, CCC, January 21, 2009
 * [|Incremental Zobrist - slow?] by Vlad Stamate, CCC, June 20, 2009 » Incremental Updates
 * [|On Zobrist keys] by Lasse Hansen, CCC, June 21, 2009
 * [|Overlapped Zobrist keys array] by Stefano Gemma, CCC, October 06, 2009

2010 ...

 * [|Transposition table random numbers] by Justin Madru, CCC, July 13, 2010
 * [|TT Key Collisions, Workarounds?] by Clemens Pruell, CCC, August 16, 2011
 * [|Key collision handling] by Jonatan Pettersson, CCC, October 21, 2011
 * [|Using a Transposition Table with Zobrist Keys] by Miyagi403, OpenChess Forum, February 21, 2012
 * [|MT or KISS ?] by Dan Honeycutt, CCC, June 02, 2012
 * [|Zobrist alternative?] by Harm Geert Muller, CCC, June 12, 2012
 * [|Zobrist Number Statistics and WHat to Look For] by Andrew Templeton, CCC, October 16, 2012
 * [|Question about Zobrist code] by Hamfer, OpenChess Forum, December 19, 2012

2015 ...
 * [|Zobrist keys - measure of quality?] by Martin Sedlak, CCC, February 24, 2015
 * [|On-the fly hash key generation?] by Evert Glebbeek, CCC, January 12, 2016
> [|Re: On-the fly hash key generation?] by Aleks Peshkov, CCC, January 13, 2016
 * [|Rotated hash] by J. Wesley Cleveland, CCC, September 13, 2016
 * [|No Zobrist key] by Henk van den Belt, CCC, September 26, 2016
 * [|Enpass + Castling for Zorbist hashes] by Andrew Grant, CCC, January 06, 2017 » Castling rights, En passant
 * [|Zobrist hashing for text] by Alvaro Cardoso, CCC, January 20, 2018

=External Links=
 * [|Zobrist hashing from Wikipedia]
 * [|Tabulation hashing from Wikipedia]
 * [|Zobrist keys] from Bruce Moreland's [|Programming Topics]
 * [|Zobrist keys] from [|Mediocre Chess] by Jonatan Pettersson
 * [|Gödel numbering from Wikipedia]
 * John Cage - [|Music of Changes], Book 1 (1951), performed by [|Vicky Chow], [|DiMenna Center], [|NYC], June 09, 2012, [|YouTube] Video
