The ChessMachine was a chess computer sold between 1991 and 1995 by TASC (The Advanced Software Company). It was unique at the time for incorporating both an ARM2 coprocessor for the chess engine on an ISA card which plugged into an IBM PC and a software interface running on the PC to display a chess board and control the engine.
The ISA card was sold with a CPU running at either 16 MHz or 32 MHz, and 128 KB, 512 KB, or 1 MB of onboard memory for transposition tables. This made economic sense at the time of introduction because mainstream PCs were only running from 10 MHz to 25 MHz. Two engines were sold with the card: The King by Johann de Koning and Gideon by Ed Schröder. Gideon was famed for winning two World Computer Chess Championships on this hardware. The King later became
A $\mathrm{Lookup}(\mathrm{key}, \text{command})$ wrapper such that each element in the bucket gets rehashed; its procedure is as follows: Linear hashing is an implementation of the hash table which enables dynamic growth or shrinkage of the table one bucket at
A dynamic array, found to be more cache-friendly, is used in place of the linked list or self-balancing binary search tree that is usually deployed, since the contiguous allocation pattern of the array can be exploited by hardware cache prefetchers (such as the translation lookaside buffer), resulting in reduced access time and memory consumption. Open addressing is another collision resolution technique in which every entry record
A hash table is a data structure that implements an associative array, also called a dictionary or simply map; an associative array is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and
A set of (key, value) pairs and allows insertion, deletion, and lookup (search), with the constraint of unique keys. In the hash table implementation of associative arrays, an array $A$ of length $m$ is partially filled with $n$ elements, where $m \geq n$. A value $x$ gets stored at an index location $A[h(x)]$, where $h$
A hash function works, one can then focus on finding the fastest possible such hash function. A search algorithm that uses hashing consists of two parts. The first part is computing a hash function which transforms the search key into an array index. The ideal case is such that no two search keys hash to the same array index. However, this is not always the case and is impossible to guarantee for unseen data. Hence
A property that the cost of finding the desired item from any given bucket within the neighbourhood is very close to the cost of finding it in the bucket itself; the algorithm attempts to place an item into its neighbourhood, with a possible cost involved in displacing other items. Each bucket within the hash table includes an additional "hop-information", an $H$-bit bit array for indicating the relative distance of
A solution is to perform the resizing gradually to avoid a storage blip (typically at 50% of the new table's size) during rehashing and to avoid memory fragmentation that triggers heap compaction due to deallocation of large memory blocks caused by the old hash table. In such a case, the rehashing operation is done incrementally through extending the prior memory block allocated for the old hash table such that
A table is a hash table of each of the positions analyzed so far up to a certain depth. On encountering a new position, the program checks the table to see whether the position has already been analyzed; this can be done quickly, in amortized constant time. If so, the table contains the value that was previously assigned to this position; this value is used directly. If not, the value is computed, and
Transposition table
A transposition table is a cache of previously seen positions, and associated evaluations, in a game tree generated by a computer game playing program. If a position recurs via a different sequence of moves, the value of the position is retrieved from the table, avoiding re-searching the game tree below that position. Transposition tables are primarily useful in perfect-information games (where
Is a common method of implementation of hash tables. Let $T$ and $x$ be the hash table and the node respectively; the operations are as follows: If the element is comparable either numerically or lexically, and inserted into the list by maintaining the total order, it results in faster termination of
Is a hash function, and $h(x) < m$. Under reasonable assumptions, hash tables have better time complexity bounds on search, delete, and insert operations in comparison to self-balancing binary search trees. Hash tables are also commonly used to implement sets, by omitting the stored value for each key and merely tracking whether
Is a non-integer real-valued constant and $m$ is the size of the table. An advantage of hashing by multiplication is that the value of $m$ is not critical. Although any value of $A$ produces a hash function, Donald Knuth suggests using the golden ratio. Uniform distribution of
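As a rough illustration of hashing by multiplication, the following Python sketch takes $A$ to be the fractional part of the golden ratio, $(\sqrt{5}-1)/2$, which is one common reading of Knuth's suggestion. The function name `mult_hash` is hypothetical, and the sketch assumes keys small enough that floating-point rounding does not matter.

```python
import math

# Assumption: A is the fractional part of the golden ratio, about 0.618.
A = (math.sqrt(5) - 1) / 2


def mult_hash(x: int, m: int) -> int:
    """Hashing by multiplication: floor(m * ((x * A) mod 1))."""
    frac = (x * A) % 1.0          # fractional part of x * A, in [0, 1)
    # min() guards against a floating-point product rounding up to m.
    return min(int(m * frac), m - 1)


if __name__ == "__main__":
    # The table size need not be prime or a power of two.
    for m in (8, 10, 13):
        print(m, [mult_hash(x, m) for x in range(10)])
```

Because the result depends only on the fractional part of $xA$, the same function can be reused for any table size $m$, which is the sense in which the choice of $m$ is not critical.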
Is an open addressing based collision resolution algorithm; collisions are resolved by favouring the displacement of the element that is farthest from its "home location" (that is, has the longest probe sequence length, PSL), where the home location is the bucket into which the item was hashed. Although Robin Hood hashing does not change the theoretical search cost, it significantly affects the variance of the distribution of
Is available, values can be stored without regard for their keys, and a binary search or linear search can be used to retrieve the element. In many situations, hash tables turn out to be on average more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets. The idea of hashing arose independently in different places. In January 1953, Hans Peter Luhn wrote an internal IBM memorandum that used hashing with chaining. The first example of open addressing
Is empty, the element is inserted, and the leftmost bit of the bitmap is set to 1; if not empty, linear probing is used to find an empty slot in the table, and the bitmap of the bucket is updated, followed by the insertion; if the empty slot is not within the range of the neighbourhood, i.e. $H-1$, subsequent swaps and hop-info bit array manipulations of each bucket are performed in accordance with its neighbourhood invariant properties. Robin Hood hashing
Is found, which indicates an unsuccessful search. Well-known probe sequences include linear probing, quadratic probing, and double hashing. The performance of open addressing may be slower compared to separate chaining since the probe sequence lengthens when the load factor $\alpha$ approaches 1. The probing results in an infinite loop if the load factor reaches 1, in the case of a completely filled table. The average cost of linear probing depends on
Is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key–value pairs, at amortized constant average cost per operation. Hashing is an example of a space-time tradeoff. If memory is infinite, the entire key can be used directly as an index to locate its value with a single memory access. On the other hand, if infinite time
Is inefficient and rarely done in practice. A transposition table is a cache whose maximum size is limited by available system memory, and it may overflow at any time. In fact, it is expected to overflow, and the number of positions cacheable at any time may be only a small fraction (even orders of magnitude smaller) of the number of nodes in the game tree. The vast majority of nodes are not transposition nodes, i.e. positions that will recur, so effective replacement strategies that retain potential transposition nodes and replace other nodes can result in significantly reduced tree size. Replacement
Is not just the evaluation of a single position. Instead, the evaluation of an entire subtree is avoided. Thus, transposition table entries for nodes at a shallower depth in the game tree are more valuable (since the size of the subtree rooted at such a node is larger) and are therefore given more importance when the table fills up and some entries must be discarded. The hash table implementing
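The preference for entries that stand for larger subtrees can be sketched as a fixed-size table with a depth-preferred replacement rule. This is only an illustration under stated assumptions: the class and field names are hypothetical, positions are assumed to be already reduced to integer keys, and `remaining_depth` (how many plies were searched below the node) is used as a stand-in for subtree size, so nodes closer to the root carry larger values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    key: int              # full position key, kept to detect index collisions
    remaining_depth: int  # plies searched below this node (proxy for subtree size)
    value: float          # evaluation obtained from that search


class TranspositionTable:
    """Fixed-size cache; on a collision the deeper-searched entry is kept."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.slots: List[Optional[Entry]] = [None] * size

    def probe(self, key: int, remaining_depth: int) -> Optional[float]:
        entry = self.slots[key % self.size]
        # A stored value is reused only if it came from an equal or deeper search.
        if entry is not None and entry.key == key \
                and entry.remaining_depth >= remaining_depth:
            return entry.value
        return None

    def store(self, key: int, remaining_depth: int, value: float) -> None:
        idx = key % self.size
        entry = self.slots[idx]
        # Depth-preferred replacement: overwrite only with an entry whose
        # subtree is at least as large as the one already stored.
        if entry is None or remaining_depth >= entry.remaining_depth:
            self.slots[idx] = Entry(key, remaining_depth, value)
```

A search routine would call `probe` before expanding a node and `store` after evaluating it. Real programs usually combine this rule with aging, which the sketch leaves out.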
Is resolved by maintaining two hash tables, each having its own hash function; the collided slot is replaced with the given item, and the element previously occupying the slot is displaced into the other hash table. The process continues until every key has its own spot in the empty buckets of the tables; if the procedure enters an infinite loop (which is identified by maintaining a threshold loop counter), both hash tables get rehashed with newer hash functions and
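The displacement loop of cuckoo hashing can be sketched as follows. The class name, the salted use of Python's built-in `hash`, and the `max_loop` threshold are all assumptions made for illustration. The sketch also doubles the table size when it rehashes, which goes beyond the description above but keeps the example from cycling when the tables are too full. `None` is not supported as a key.

```python
import random
from typing import Hashable, List, Optional


class CuckooTable:
    def __init__(self, m: int = 8, max_loop: int = 16, seed: int = 0) -> None:
        self.m = m
        self.max_loop = max_loop            # threshold loop counter
        self.rng = random.Random(seed)
        self.tables: List[List[Optional[Hashable]]] = [[None] * m, [None] * m]
        self._new_hash_functions()

    def _new_hash_functions(self) -> None:
        # Two salts stand in for two independent hash functions (assumption).
        self.salts = [self.rng.getrandbits(32), self.rng.getrandbits(32)]

    def _index(self, which: int, key: Hashable) -> int:
        return hash((self.salts[which], key)) % self.m

    def contains(self, key: Hashable) -> bool:
        # A lookup inspects at most two slots, one per table.
        return any(self.tables[w][self._index(w, key)] == key for w in (0, 1))

    def insert(self, key: Hashable) -> None:
        if self.contains(key):
            return
        which = 0
        for _ in range(self.max_loop):
            idx = self._index(which, key)
            displaced = self.tables[which][idx]
            self.tables[which][idx] = key   # the new item takes the slot
            if displaced is None:
                return
            key = displaced                 # the evicted item moves on
            which = 1 - which               # ... to the other table
        self._rehash(key)                   # loop threshold reached

    def _rehash(self, pending: Hashable) -> None:
        items = [k for t in self.tables for k in t if k is not None]
        items.append(pending)
        self.m *= 2                         # assumption: grow while rehashing
        self.tables = [[None] * self.m, [None] * self.m]
        self._new_hash_functions()
        for k in items:
            self.insert(k)
```

With integer keys the behaviour is deterministic for a given `seed`. String keys depend on Python's per-process hash randomization.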
Is said to be perfect for a given set $S$ if it is injective on $S$, that is, if each element $x \in S$ maps to a different value in $\{0, \dots, m-1\}$. A perfect hash function can be created if all
Is stored in the bucket array itself, and the hash resolution is performed through probing. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found. When searching for an entry, the buckets are scanned in the same sequence, until either the target record is found, or an unused array slot
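A minimal sketch of insertion and search under open addressing is given below, assuming linear probing as the probe sequence and Python's built-in `hash` as the hash function. The class name is hypothetical, and deletion and resizing are deliberately left out.

```python
from typing import Any, Hashable, Iterator, List, Optional, Tuple


class LinearProbingTable:
    def __init__(self, m: int = 8) -> None:
        # Each slot holds a (key, value) record or None when unoccupied.
        self.slots: List[Optional[Tuple[Hashable, Any]]] = [None] * m

    def _probe(self, key: Hashable) -> Iterator[int]:
        """Yield slot indices starting at the hashed-to slot, wrapping around."""
        m = len(self.slots)
        start = hash(key) % m
        for i in range(m):
            yield (start + i) % m

    def insert(self, key: Hashable, value: Any) -> None:
        for idx in self._probe(key):
            entry = self.slots[idx]
            if entry is None or entry[0] == key:
                self.slots[idx] = (key, value)
                return
        # Every slot was examined: the load factor has reached 1.
        raise RuntimeError("table is full")

    def search(self, key: Hashable) -> Optional[Any]:
        for idx in self._probe(key):
            entry = self.slots[idx]
            if entry is None:
                return None       # unused slot: unsuccessful search
            if entry[0] == key:
                return entry[1]
        return None               # scanned a completely filled table
```

Bounding the probe loop by the table size is what keeps a completely filled table from causing an endless scan in this sketch.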
Is the hash value of $x \in S$ and $m$ is the size of the table. The scheme in hashing by multiplication is as follows: $h(x) = \lfloor m((xA) \bmod 1) \rfloor$, where $A$
Is tried first. For storing the best child of a node, the entry corresponding to that node in the transposition table is used. Use of a transposition table can lead to incorrect results if the graph-history interaction problem is not studiously avoided. This problem arises in certain games because the history of a position may be important. For example, in chess a player may not castle if the king or
1664-435: Is usually based on tree depth and aging: nodes higher in the tree (closer to the root) are favored, because the subtrees below them are larger and result in greater savings; and more recent nodes are favored because older nodes are no longer similar to the current position, so transpositions to them are less likely. Other strategies are to retain nodes in the principal variation, nodes with larger subtrees regardless of depth in
1728-411: The integer universe assumption that all elements of the table stem from the universe U = { 0 , . . . , u − 1 } {\displaystyle U=\{0,...,u-1\}} , where the bit length of u {\displaystyle u} is confined within the word size of a computer architecture . A hash function h {\displaystyle h}
1792-458: The memory footprint of game playing programs. Game-playing programs work by analyzing millions of positions that could arise in the next few moves of the game. Typically, these programs employ strategies resembling depth-first search , which means that they do not keep track of all the positions analyzed so far. In many games, it is possible to reach a given position in more than one way. These are called transpositions . In chess , for example,
1856-525: The R30 held its own against its contemporary programs running a Pentium -90 MHz and won against other dedicated units. This chess-related article is a stub . You can help Misplaced Pages by expanding it . This video game -related article on computer hardware is a stub . You can help Misplaced Pages by expanding it . This digital board game-related article is a stub . You can help Misplaced Pages by expanding it . This artificial intelligence -related article
1920-443: The application. In particular, if one uses dynamic resizing with exact doubling and halving of the table size, then the hash function needs to be uniform only when the size is a power of two . Here the index can be computed as some range of bits of the hash function. On the other hand, some hashing algorithms prefer to have the size be a prime number . For open addressing schemes, the hash function should also avoid clustering ,
1984-754: The bucket array holds exactly one item. Therefore an open-addressed hash table cannot have a load factor greater than 1. The performance of open addressing becomes very bad when the load factor approaches 1. Therefore a hash table that uses open addressing must be resized or rehashed if the load factor α {\displaystyle \alpha } approaches 1. With open addressing, acceptable figures of max load factor α max {\displaystyle \alpha _{\max }} should range around 0.6 to 0.75. A hash function h : U → { 0 , . . . , m − 1 } {\displaystyle h:U\rightarrow \{0,...,m-1\}} maps
ChessMachine - Misplaced Pages Continue
2048-441: The bucket array stores a pointer to a list or array of data. Separate chaining hash tables suffer gradually declining performance as the load factor grows, and no fixed point beyond which resizing is absolutely needed. With separate chaining, the value of α max {\displaystyle \alpha _{\max }} that gives best performance is typically between 1 and 3. With open addressing, each slot of
2112-454: The buckets of the hash table remain unaltered. A common approach for amortized rehashing involves maintaining two hash functions h old {\displaystyle h_{\text{old}}} and h new {\displaystyle h_{\text{new}}} . The process of rehashing a bucket's items in accordance with the new hash function is termed as cleaning , which is implemented through command pattern by encapsulating
2176-621: The buckets or nodes link within the table. The algorithm is ideally suited for fixed memory allocation . The collision in coalesced hashing is resolved by identifying the largest-indexed empty slot on the hash table, then the colliding value is inserted into that slot. The bucket is also linked to the inserted node's slot which contains its colliding hash address. Cuckoo hashing is a form of open addressing collision resolution technique which guarantees O ( 1 ) {\displaystyle O(1)} worst-case lookup complexity and constant amortized time for insertions. The collision
2240-476: The engine used in the popular Chessmaster series of chess programs. TASC later incorporated the technology into a dedicated unit, sold from 1993 to 1997. There were two models, the R30 and R40 , running at 30 MHz and 40 MHz respectively, and having 512 KB and 1 MB of transposition tables, respectively. The SmartBoard, a wooden sensory board, was connected to the units, which were in tiny boxes approximately
2304-586: The entire state of the game is known to all players at all times). The usage of transposition tables is essentially memoization applied to the tree search and is a form of dynamic programming . Transposition tables are typically implemented as hash tables encoding the current board position as the hash index. The number of possible positions that may occur in a game tree is an exponential function of depth of search, and can be thousands to millions or even much greater. Transposition tables may therefore consume most of available system memory and are usually most of
2368-460: The hash function's ability to distribute the elements uniformly throughout the table to avoid clustering , since formation of clusters would result in increased search time. Since the slots are located in successive locations, linear probing could lead to better utilization of CPU cache due to locality of references resulting in reduced memory latency . Coalesced hashing is a hybrid of both separate chaining and open addressing in which
2432-424: The hash table and j {\displaystyle j} be the index, the insertion procedure is as follows: Repeated insertions cause the number of entries in a hash table to grow, which consequently increases the load factor; to maintain the amortized O ( 1 ) {\displaystyle O(1)} performance of the lookup and insertion operations, a hash table is dynamically resized and
2496-414: The hash table whenever the load factor α {\displaystyle \alpha } reaches α max {\displaystyle \alpha _{\max }} . Similarly the table may also be resized if the load factor drops below α max / 4 {\displaystyle \alpha _{\max }/4} . With separate chaining hash tables, each slot of
2560-420: The hash values is a fundamental requirement of a hash function. A non-uniform distribution increases the number of collisions and the cost of resolving them. Uniformity is sometimes difficult to ensure by design, but may be evaluated empirically using statistical tests, e.g., a Pearson's chi-squared test for discrete uniform distributions. The distribution needs to be uniform only for table sizes that occur in
2624-448: The item which was originally hashed into the current virtual bucket within H -1 entries. Let k {\displaystyle k} and B k {\displaystyle Bk} be the key to be inserted and bucket to which the key is hashed into respectively; several cases are involved in the insertion procedure such that the neighbourhood property of the algorithm is vowed: if B k {\displaystyle Bk}
2688-454: The items of the tables are rehashed into the buckets of the new hash table, since the items cannot be copied over as varying table sizes results in different hash value due to modulo operation . If a hash table becomes "too empty" after deleting some elements, resizing may be performed to avoid excessive memory usage . Generally, a new hash table with a size double that of the original hash table gets allocated privately and every item in
2752-474: The items on the buckets, i.e. dealing with cluster formation in the hash table. Each node within the hash table that uses Robin Hood hashing should be augmented to store an extra PSL value. Let x {\displaystyle x} be the key to be inserted, x . p s l {\displaystyle x.psl} be the (incremental) PSL length of x {\displaystyle x} , T {\displaystyle T} be
2816-404: The key is present. A load factor α {\displaystyle \alpha } is a critical statistic of a hash table, and is defined as follows: load factor ( α ) = n m , {\displaystyle {\text{load factor}}\ (\alpha )={\frac {n}{m}},} where The performance of the hash table deteriorates in relation to
2880-548: The keys are known ahead of time. The schemes of hashing used in integer universe assumption include hashing by division, hashing by multiplication, universal hashing , dynamic perfect hashing , and static perfect hashing . However, hashing by division is the commonly used scheme. The scheme in hashing by division is as follows: h ( x ) = x mod m {\displaystyle h(x)\ =\ x\,{\bmod {\,}}m} where h ( x ) {\displaystyle h(x)}
2944-402: The load factor α {\displaystyle \alpha } . The software typically ensures that the load factor α {\displaystyle \alpha } remains below a certain constant, α max {\displaystyle \alpha _{\max }} . This helps maintain good performance. Therefore, a common approach is to resize or "rehash"
3008-494: The look-up complexity to be a guaranteed O ( 1 ) {\displaystyle O(1)} in the worst case. In this technique, the buckets of k {\displaystyle k} entries are organized as perfect hash tables with k 2 {\displaystyle k^{2}} slots providing constant worst-case lookup time, and low amortized time for insertion. A study shows array-based separate chaining to be 97% more performant when compared to
3072-544: The mapping of two or more keys to consecutive slots. Such clustering may cause the lookup cost to skyrocket, even if the load factor is low and collisions are infrequent. The popular multiplicative hash is claimed to have particularly poor clustering behavior. K-independent hashing offers a way to prove a certain hash function does not have bad keysets for a given type of hashtable. A number of K-independence results are known for collision resolution schemes such as linear probing and cuckoo hashing. Since K-independence can prove
3136-400: The new position is entered into the hash table. The number of positions searched by a computer often greatly exceeds the memory constraints of the system it runs on; thus not all positions can be stored. When the table fills up, less-used positions are removed to make room for new ones; this makes the transposition table a kind of cache . The computation saved by a transposition table lookup
3200-407: The operations such as A d d ( k e y ) {\displaystyle \mathrm {Add} (\mathrm {key} )} , G e t ( k e y ) {\displaystyle \mathrm {Get} (\mathrm {key} )} and D e l e t e ( k e y ) {\displaystyle \mathrm {Delete} (\mathrm {key} )} through
3264-412: The original hash table gets moved to the newly allocated one by computing the hash values of the items followed by the insertion operation. Rehashing is simple, but computationally expensive. Some hash table implementations, notably in real-time systems , cannot pay the price of enlarging the hash table all at once, because it may interrupt time-critical operations. If one cannot avoid dynamic resizing,
#17330861754913328-400: The problem of search in large files. The first published work on hashing with chaining is credited to Arnold Dumey , who discussed the idea of using remainder modulo a prime as a hash function. The word "hashing" was first published in an article by Robert Morris. A theoretical analysis of linear probing was submitted originally by Konheim and Weiss. An associative array stores
3392-615: The procedure continues. Hopscotch hashing is an open addressing based algorithm which combines the elements of cuckoo hashing , linear probing and chaining through the notion of a neighbourhood of buckets—the subsequent buckets around any given occupied bucket, also called a "virtual" bucket. The algorithm is designed to deliver better performance when the load factor of the hash table grows beyond 90%; it also provides high throughput in concurrent settings , thus well suited for implementing resizable concurrent hash table . The neighbourhood characteristic of hopscotch hashing guarantees
3456-415: The resulting hash indicates where the corresponding value is stored. A map implemented by a hash table is called a hash map . Most hash table designs employ an imperfect hash function . Hash collisions , where the hash function generates the same index for more than one key, therefore typically must be accommodated in some way. In a well-dimensioned hash table, the average time complexity for each lookup
3520-463: The rook to be castled with has moved during the course of the game. A common solution to this problem is to add the castling rights as part of the Zobrist hashing key. Another example is draw by repetition : given a position, it may not be possible to determine whether it has already occurred. A solution to the general problem is to store history information in each node of the transposition table, but this
3584-466: The second part of the algorithm is collision resolution. The two common methods for collision resolution are separate chaining and open addressing. In separate chaining, the process involves building a linked list with key–value pair for each search array index. The collided items are chained together through a single linked list, which can be traversed to access the item with a unique search key. Collision resolution through chaining with linked list
3648-447: The sequence of moves 1. d4 Nf6 2. c4 g6 (see algebraic chess notation ) has 4 possible transpositions, since either player may swap their move order. In general, after n moves, an upper limit on the possible transpositions is ( n !) . Although many of these are illegal move sequences, it is still likely that the program will end up analyzing the same position several times. To avoid this problem, transposition tables are used. Such
3712-459: The size of chess clocks. They were only sold with The King chess engine. This was the end of the era of strong dedicated chess computers, and these two models are acknowledged as the strongest dedicated chess computers that were ever sold. At the height of its strength, the R30 attained a rating over 2350 on computer rating lists, higher than any other dedicated unit. According to the SSDF rating list,
3776-550: The standard linked list method under heavy load. Techniques such as using fusion tree for each buckets also result in constant time for all operations with high probability. The linked list of separate chaining implementation may not be cache-conscious due to spatial locality — locality of reference —when the nodes of the linked list are scattered across memory, thus the list traversal during insert and search may entail CPU cache inefficiencies. In cache-conscious variants of collision resolution through separate chaining,
3840-432: The transposition table can have other uses than finding transpositions. In alpha–beta pruning , the search is fastest (in fact, optimal) when the child of a node corresponding to the best move is always considered first. Of course, there is no way of knowing the best move beforehand, but when iterative deepening is used, the move that was found to be the best in a shallower search is a good approximation. Therefore this move
3904-410: The tree, and nodes that caused cutoffs. Though the fraction of nodes that will be transpositions is small, the game tree is an exponential structure, so caching a very small number of such nodes can make a significant difference. In chess, search time reductions of 0-50% in complex middle game positions and up to a factor of 5 in the end game have been reported. Hash table In computing ,
#17330861754913968-452: The universe U {\displaystyle U} of keys to indices or slots within the table, that is, h ( x ) ∈ { 0 , . . . , m − 1 } {\displaystyle h(x)\in \{0,...,m-1\}} for x ∈ U {\displaystyle x\in U} . The conventional implementations of hash functions are based on
4032-455: The unsuccessful searches. If the keys are ordered , it could be efficient to use " self-organizing " concepts such as using a self-balancing binary search tree , through which the theoretical worst case could be brought down to O ( log n ) {\displaystyle O(\log {n})} , although it introduces additional complexities. In dynamic perfect hashing , two-level hash tables are used to reduce
Was proposed by A. D. Linh, building on Luhn's memorandum. Around the same time, Gene Amdahl, Elaine M. McGraw, Nathaniel Rochester, and Arthur Samuel of IBM Research implemented hashing for the IBM 701 assembler. Open addressing with linear probing is credited to Amdahl, although Andrey Ershov independently had the same idea. The term "open addressing" was coined by W. Wesley Peterson in his article which discusses