Wednesday, May 13, 2015

Match-three puzzle games algorithm


I am trying to write a match-three puzzle game like 'call of Atlantis' myself. The most important algorithm is to find out all possible match-three possibilities. Is there any open source projects that can be referenced? Or any keywords to the algorithm? I am trying to look for a faster algorithm to calculate all possibilities.


Rules:



  • Diagonals don't count.

  • The size is 8x8


Thanks.



Answer




I helped port one of these games to a handheld platform. Looking back at their AI code for finding potentials: yipe, it's complicated, brutish (quadruply nested loop, calls itself recursively occasionally, etc), and it doesn't appear at all cache-friendly at first glance.


(Some of that complexity comes from trying to evaluate the strength of the move in context: valuing longer chains higher, looking for combos, etc...)


But it doesn't really need to be "optimal"; we didn't even touch the code when we ported it. It never showed up on the profiler.


Looking at it now, even at one 32 bit word per cell (and I think they actually used a byte per cell), the entire board will fit in a minuscule L1 cache, and you can do many excess reads on stuff that's cached without impacting framerate too much. Especially since you only need to do this whole process once each time the board configuration changes. (A big-theta hovering around n^2 isn't that horribly bad with a very low n, not to mention the small multiplier given the cached memory.)


That having been said: for amusement, let's attempt to parallelize the problem. Starting with bitwise operations.


Assume you have a bitmask representing all of the pieces (we'll call them stones) in a row that are of one particular type (we'll use colors to distinguish types). We'll start only looking at red stones, and worry about the cost of calculating the bitmask later.


// Let's assume top right indexing.
// (The assumption is not necessary, --
// it just makes the left-shift and right-shift operators
// look like they're pointing in the correct direction.)


// this is for row 2
col index 76543210
color BRRGYRBR // blue, red, red, green, yellow, ...
"red" bits 01100101

We're looking for the series that need just one swap to become a series of 3. Courtesy Kaj, this is one of three combinations, basically: XoX, oXX, or XXo where X is a matching stone and o is something else.


(This idea is borrowed from the marvelous Hacker's Delight book; see also the fxtbook if such things delight you.)


// using c-style bitwise operators:
// & is "and"

// ^ is "xor"
// | is "or"
// << and >> are arithmetic (non-sign-extending) shifts

redBitsThisRow = redBitsRows[2]

// find the start of an XoX sequence
startOfXoXSequence = redBitsThisRow & (redBitsThisRow << 2);
// for our example, this will be 00000100


// find any two stones together in a row
startOfXXSequence = redBitsThisRow & (redBitsThisRow << 1);
// for our example, this will be 01000000

It's more useful to know the positions of the missing stones, not the start of the XX or XoX sequence:


// give us any sequences that might want a stone from the left
missingLeftStone = startOfXXSequence << 1;
// for our example, this will be 10000000

// give us any sequences that might want a stone from the right

missingRightStone = startOfXXSequence >> 2;
// for our example, this will be 00010000

// give us any sequences that might want a stone from the top or bottom
missingTopOrBottomStone = missingRightStone | missingLeftStone | (startOfXoXSequence >> 1)
// for our example, this will be 10010010

(About 1 load and 9 ALU instructions -- 5 shifts, 2 ors, 2 ands -- with a dreadful CPU that doesn't include an inline shifter. On many architectures these shifts might be free.)


Can we fill these missing places?


// look to the left, current row

leftMatches = redBitsThisRow & (missingLeftStone << 1)

// look to the right, current row
rightMatches = redBitsThisRow & (missingRightStone >> 1)

// look on the row above
topMatches = redBitsRow[1] & missingTopOrBottomStone

// look on the row below
bottomMatches = redBitsRow[3] & missingTopOrBottomStone


(Another 2 loads and 6 ALU instructions -- 4 ands, 2 shifts -- with a bad CPU. Note that row 0 and row 7 may give you trouble -- you can choose to branch around these calculations, or avoid the branch by allocating space for two extra rows, one before 0 and one after 7, and leave them zeroed out.)


Now we have several "matches" vars that indicate the position of a stone that can be swapped in to make a match.


This assumes a "count trailing zeros" intrinsic or very cheap inline method:


swapType = RIGHT_TO_LEFT;
matches = leftMatches;
while ( (colIdx = ctz(matches)) < WORD_BITS ) {
// rowIdx is 2 in our examples above
workingSwaps.insert( SwapInfo(rowIdx, colIdx, swapType) );
// note that this SwapInfo construction could do some more advanced logic:

// run the swap on a temporary board and see how much score it accumulates
// assign some sort of value based on preferring one type of match to another, etc

matches = matches ^ (1<}
// repeat for LEFT_TO_RIGHT with rightMatches
// repeat for TOP_TO_BOTTOM with topMatches
// repeat for BOTTOM_TO_TOP with bottomMatches

Note that none of this bit logic should break down in little-endian vs big-endian environments. It gets much more tricky for boards wider than your machine word size. (You could experiment with something like std::bitset for this.)



What about columns? It may be easiest just to have two copies of the table, one stored in row-order and one stored in column-order. If you have getters and setters wrapping board access, this should be trivial. I don't mind keeping two arrays up to date, after all a set becomes rowArray[y][x] = newType; colArray[x][y] = newType; and that's simple.


...but managing rowBits[color][row] and colBits[color][col] becomes obnoxious.


However, as a quick aside, if you do have rowBits and colBits, you can run the same code with rowBits pointing at colBits instead. Pseudocode, assuming board width = board height = 8 in this case...


foreach color in colors {
foreach bits in rowBits, colBits {
foreach row in 0..7 { // row is actually col the second time through
// find starts, as above but in bits[row]
// find missings, as above
// generate matches, as above but in bits[row-1], bits[row], and bits[row+1]
// loop across bits in each matches var,

// evaluate and/or collect them, again as above
}
}
}

What if we don't want the bother with converting a nice 2D array into bits? With an 8x8 board, 8 bits per cell, and a 64-bit-capable processor, we might be able to get away with it: 8 cells = 8 bytes = 64 bits. We're locked to our board width, now, but this seems promising.


Assume "0" is reserved, stones begin at 1 and go to NUM_STONE_TYPES inclusive.


startOfXXSequence = rowBytes ^ (rowBytes << (8*1))
// now all bytes that are 0x00 are the start of a XX sequence


startOfXoXSequence = rowBytes ^ (rowBytes << (8*2))
// all bytes that are 0x00 are the start of a XoX sequence

Note this doesn't need one pass per color. In BRBRBRGY we'll get a startOfXoXSequence that might be something like 0x00 00 00 00 aa bb cc dd -- the top four bytes will be zero, indicating a possible sequence starts there.


It's getting late, so I'll leave off here and possibly come back later -- you can continue along this vein with xors and "detect first zero byte" tricks, or you could look into some integer SIMD extensions.


No comments:

Post a Comment

Simple past, Present perfect Past perfect

Can you tell me which form of the following sentences is the correct one please? Imagine two friends discussing the gym... I was in a good s...