FPGA, (Field-programmable gate array)
a field-programmableintegrated circuit consisting of a two-dimensional array of logic blocks interconnected by a hierarchy of reconfigurable routing channels. The behavior of a FPGA is defined by a schematic design or by a hardware description language (HDL), most notably VHDL and Verilog. FPGA cards of their main suppliers Xilinx[1] and Altera[2] can be plugged into a PC with communication over the PCI or PCI Express bus. IBM'sPOWER8 processor, introduced in August 2013, features a CAPI port (Coherent Accelerator Processor Interface) is layered on top of PCI Express 3.0 suited to connect custom hardware such as FPGAs [3][4].
The structure is a two-dimensional array of logic blocks and reconfigurable routing channels, which all have the same width (number of wires). I/O pads can connect to any one of the wiring segments in the channel adjacent to it [6].
Each logic block (configurable logic block CLB, or logic array block LAB) consists of one or more logical cells (LC, adaptive logic module ALM, logic element LE, Slice etc.), each with a n-input bits (4-6) to one-output bit programmable lookup table (LUT) - the combinatorial logic, and a D-Flip-Flop, which synchronizes and stores the output by the edge of a clock signal to implement a sequential logic. A configurable multiplexer either switches the direct or latched LUT output outwards.
Inputs and outputs of a cell can connect to any one of the routing wires in the channel adjacent to it. Whenever a vertical and a horizontal channel intersect there is a switch box with programmable switches that allow it to connect to other wires in adjacent channel segments. Xilinx Virtex devices further provide BlockRAM, a 4096-bit synchronous memory which can be configured for single or dual port usage with variable widths of 1, 2, 4, 8 or 16 bits.
In his Masters thesis [11], Marc Boulé proposed a FPGA move generator, as used by his chess program MBChess. His approach performs a Belle like move masking method with find victim and find aggressor cycles in MVV-LVA manner. A 1-bit, 64-deep synchronous memory in each square is used to memorize masked bits. The move generator includes a PCI interface to connect it to the PC running MBChess. Communication is done via different commands, such as to instruct the move generator to undo the currently stored move, generate and return the next move and execute that move on its internal FPGA board representation. In total, 10,804 out of 18,816 logic cells of a Xilinx XCV800 [12] were used, 10,104 as LUT, 700 as RAM [13].
A block diagram of a chess square with transmitter (TX) and the receiver (RX) [14]
Donninger
Brutus[15] and its successor Hydra by Chrilly Donninger et al. [16] perform the last 3 plies of an n-ply search on the FPGA side, inclusively the quiescence search and evaluations. It uses 67 out of 96 BlockRAMs, 534 of 24,576 Flip-Flops, and 18,403 of 24,576 LUTs. An upper bound for the number of cycles per search node is 9. Hydra essentially contains a big piece of combinatorial logic, controlled by a finite state machine (FSM) with 54 states for the search. The move generator consists of the generate aggressor module and the generate victim module, both instantiate 64 square modules, one for each square.
The squares send piece-signals if any, respectively forwarding the signals of sliding pieces. Each square can output the signal ’victim found’ to indicate the ’victim’ is target square of a pseudo-legal move. The collection of all ’victim found’ signals is the input for a comparator tree, an arbiter, which selects the most attractive, not yet examined victim. The Generate Aggressor module takes the arbiter’s output as input and sends the signal of a super-piece from the target to find one or more origin squares. Selection criteria are the values of attacked pieces and whether or not a move is a killer move.
Youhei Hori, Masashi Sonoyama, Tsutomu Maruyama (2002). An FPGA-Based Processor for Shogi Mating Problems. 2002 IEEE International Conference on Field-Programmable Technology, 2002, pdf
Marc Boulé, Zeljko Zilic (2003). FPGA Hardware Acceleration: From Chess Playing to Automated Theorem Proving. poster presentation, Micronet Sept. 2003
a field-programmable integrated circuit consisting of a two-dimensional array of logic blocks interconnected by a hierarchy of reconfigurable routing channels. The behavior of a FPGA is defined by a schematic design or by a hardware description language (HDL), most notably VHDL and Verilog. FPGA cards of their main suppliers Xilinx [1] and Altera [2] can be plugged into a PC with communication over the PCI or PCI Express bus. IBM's POWER8 processor, introduced in August 2013, features a CAPI port (Coherent Accelerator Processor Interface) is layered on top of PCI Express 3.0 suited to connect custom hardware such as FPGAs [3] [4].
Table of Contents
Architecture
Structure
Blocks and Cells
Routing
FPGA in Computer Chess
FPGAs are suited to implement a Belle like move generator in hardware. While Marc Boulé proposed a pure generation approach as used by his program MBChess, Chrilly Donninger, with PCI-communication overhead in mind, went some steps further in Brutus [10] and Hydra, using a complete 3-ply iterative search including quiescence and evaluation, controlled by a finite state machine (FSM).Boulé
In his Masters thesis [11], Marc Boulé proposed a FPGA move generator, as used by his chess program MBChess. His approach performs a Belle like move masking method with find victim and find aggressor cycles in MVV-LVA manner. A 1-bit, 64-deep synchronous memory in each square is used to memorize masked bits. The move generator includes a PCI interface to connect it to the PC running MBChess. Communication is done via different commands, such as to instruct the move generator to undo the currently stored move, generate and return the next move and execute that move on its internal FPGA board representation. In total, 10,804 out of 18,816 logic cells of a Xilinx XCV800 [12] were used, 10,104 as LUT, 700 as RAM [13].Donninger
Brutus [15] and its successor Hydra by Chrilly Donninger et al. [16] perform the last 3 plies of an n-ply search on the FPGA side, inclusively the quiescence search and evaluations. It uses 67 out of 96 BlockRAMs, 534 of 24,576 Flip-Flops, and 18,403 of 24,576 LUTs. An upper bound for the number of cycles per search node is 9. Hydra essentially contains a big piece of combinatorial logic, controlled by a finite state machine (FSM) with 54 states for the search. The move generator consists of the generate aggressor module and the generate victim module, both instantiate 64 square modules, one for each square.The squares send piece-signals if any, respectively forwarding the signals of sliding pieces. Each square can output the signal ’victim found’ to indicate the ’victim’ is target square of a pseudo-legal move. The collection of all ’victim found’ signals is the input for a comparator tree, an arbiter, which selects the most attractive, not yet examined victim. The Generate Aggressor module takes the arbiter’s output as input and sends the signal of a super-piece from the target to find one or more origin squares. Selection criteria are the values of attacked pieces and whether or not a move is a killer move.
Publications
Forum Posts
2000 ...
2005 ...
2010 ...
External Links
Field-programmable analog array from Wikipedia
FPGA Architecture for the Challenge
VHDL Checkers Implementation
Vendors
Misc
References
What links here?
Up one Level