Lecture: Artificial Intelligence, by
Richard Belew
Final Project:
Playing Rummikub
using a "human based" approach
by Hendrik Lübben
Introduction
The following text describes an implementation of the game "Rummikub",
in which it has been tried to simulate human strategies of approaching
this game. The basic idea is outlined first, referring also to
existing and planned implementations by other people. An overview
of the development of this approach is then presented and finally
some results and problems are discussed.
For a characterisation of Rummikub and an explanation of the rules of
the game, please look at the section "Rummikub", subsections
"Characterisation" and "Terminology and rules" of the my
Game Playing Asignment.
Basic idea and other implementations
When searching the web for other implementations of Rummikub, I was
only able to find one single
working program
by
N. van Rooij (Netherlands).
The author did not comment in detail on his implementation, but
claims to use simple exhaustive search in his program.
I also communicated with the
Lemada Light Industries Ltd.
in Tel Aviv, Israel, who own the international rights on the game
Rummikub. They currently have two computer implementations of this
game under development. Even though their first answer, giving me this
information plus making sure that I know, that I am not allowed to use
any Rummikub implementation commercially, came very fast, they did not
comment at all on queries about the general approach used in the two
developments currently under way.
The basic idea of using a "goal-directed" search originated from two
thoughts. First I expected, that an exhaustive search would become
to slow to be a viable solution in the later stages of the game,
because the search space grows roughly exponentially with the number of
cards available for modifications. Secondly, judging from many years
of experience in playing Rummikub, implementing a human-style search
seemed rather harmless to implement.
My first implementation, which also used exhaustive search, needed
reaction times far above any tolerable limit, even though I tried
three different versions and reached a net speed-up factor of more than
a thousand in comparison to the first try. I did not have the possibility to
try out the program by N. van Rooij, but if it does indeed have tolerable
answering times even in the later stages of the game, when there are lots
of cards on the table, he must either have used some very tricky methods
to prune the search space, or his algorithm is not truely exhaustive.
Since the Lemada company seems to have at least two groups of programmers
working on this problem, it seems rather likely, that the slow reaction
times of my program are a true problem and not due to some great pruning
possibility which I just did not see.
First implementation
The first implementation
works fine for roughly the first 25 to 30 ply, after that, though, the
number of cards on the table passes the critical limit and within
just a few ply, the answering time jumps from milliseconds to several
days per ply.
This version of the program in each turn first computes the set of all
possible melds. This is an easy task and only takes milliseconds even
for a lot of cards to be considered. To compute all possible rows,
for each card the algorithm adds the one point higher one to a temporary
set, as long as that card is in the hand. Every set that is
encountered this way that contains three or more cards is added to
the collection. E.g. if a hand contains 3,4,5,6 of one color, starting
with 3 the algorithm will first reach three cards for the meld 3,4,5,
it will then find 6 and add 3,4,5,6 to the collection, then it will
restart with the next higher card (i.e. 4), and will next find
4,5,6. Starting with 5 and 6 it will not reach any rows with three or
more cards, so the collection will be {(3,4,5), (3,4,5,6), (4,5,6)}.
Finding all possible set is even easier, the algorithm simply checks
for all card values from ace to king, if the hand contains three or
four different ones of these and if so adds the corresponding sets
to the collection.
Obviously, many of these melds are mutually exclusive, so the main
task for the algorithm is then to find that subset of non mutually
exclusive melds that has the greatest value, minimizing the amount
of cards on the players hand. To achieve this, a recursive routine
(GetBestCombination), originally called for the complete set of all
possible melds and an empty suggested move, in each recursive step
does the following
- (If not the lowest recursion level:) Remove all melds form the
collection of all possible melds which overlap with the move built
up so far
- Remove one meld from the collection
- Once for this meld added to the suggested move and once
without adding it, do a recursion for the reduced collection.
As soon as the collection is empty, the current suggested move is
compared with the best one found so far and replaces the latter if
it is better.
In each turn each player now simply takes up the entire table (if he
has his first 30 points already, otherwise the search is restricted
to the cards on the hand) and computes the best combination.
If that combination leaves less cards on his hand than he had before,
he does the corresponding move, else he picks up a card from the stack.
A simple trick which reduces computation time was already implemented
in this version. If a player makes a move and until it is his turn the
next time, all other players only take up cards, nothing on the table
will have changed. In that case, he does not even start to search
again but takes a card from the stack directly. This of course reduces
the average, the main problem of too long answering times is not
addressed by this trick, though.
A different approach, that might indeed be helpfull, but which I did
not have the time to test, would be to try to divide the amount of
search to be done more evenly between the calculation of all possible
melds and the calculation of the best subset, thereby reducing the
number in the exponent (as far as complexity is concerned). This could
be done, by calculating a separate collection for all of those melds,
which obviously exclude each other and therefore will not occur
together in a suggested move, anyway. An example for that are the
three melds derived from 3,4,5,6 above.
Since the main part of this approach, the GetBestCombination-Routine,
is still contained in the "human based" version of this program -
there it is used to find melds on the hand, which is very fast due
to the extremely small search space - this version of the program is
not in the net. If you are interested in this source code, too, please
send me a short email.
The "human based" implementation
In order to ensure, that the
implementation
using the human based approach does not become one
which is only based on the way I usually play Rummikub, I contacted
several friends of mine, of whom I knew that they also play Rummikub.
I furthermore tried to get in contact with other Rummikub players
via the internet (e.g. via the internets Rummikub club), this did
not work, though, I did not receive any answers.
It turned out, that most of the people I asked did at first not even
understand the question, and did not comment on how they figure out
which cards they can get rid of in one move, but on long-term
strategies. The number of such strategies is very small and I will
comment on them first, before then describing the short-term
"tactics" used within one move.
Long-term strategies
In the early stages of the game, some players tend to keep some
complete melds on their hand for an additional ply, if they have
already made a change on the table in the current ply. Thus they
prevent that they might have to pick up a card from the stack in
the next ply, if they cannot do any other changes then. This is
no longer practised in or near the end game, since then the
chances that another player might finsh, before the meld can
be placed on the table, become to big.
It is very unsure, though, if this strategy really is a good one.
(I never practised it and did fine.) Since this applies only for
the early game, a card that is taken up will be in the hand for
many ply and generally increase its quality by increasing its
flexibility (the number of possible combinations). When playing
and paying attention to the cards one has to take from the stack,
roughly half of them will be regarded as an improvement to the
hand rather than just as additional points.
It is virtually impossible to compute a relieable estimate on
this question though, the background noise is just too big.
One thing, I hoped to be able to do with this implememtation,
was, to program two players, one with each strategy, and have
them play several thousand games. To compare that, also have
two players with the same strategy play against each other for
the same number of games and this way measure, if one of the
strategies leads to a statistically significant increase in the
quality of playing (i.e. in a statistically significant
decrease in the number of points left on average on the hand
of the corresponding player at the end of a game).
The second long-term strategy refers to the end game, and this
is, where to computer could get a real advantage. None of the
players I asked even try to remember, what cards their opponents
took up from the table - with the exception of some very
immediate an obvious cases, e.g. if the player right before
me took up a King of Spades, I will try to avoid putting down
a joker for a King of Spades in my next move, but look for other
possibilities. In general, in the end game, with lots of cards
on the table, it is - for humans - too computationally expensive,
to determine which of their possible moves not only minimizes the
number of points/cards on their hand but also try to take into
account the possibilities created that way for the other players.
For this implementation, as a simplifying assumption, trying
to achieve the optimal move it defined merely as trying to find
the move which leaves the smallest amount of points on the hand.
Once the algorithm works reliably, it is always possible to integrate
such preferences into it, later.
Short-term tactics
To reduce the number of cards for which a more complicated search is
neccessary, just like most human players the program first tries to
get rid of as many cards as possible in the simple ways. After having
figured out if there are complete melds on the hand (and if so, having
placed them on the table), the algorithm first looks for cards which
can be directly added to the ends of rows or inserted into them (e.g.
a 6 can be inserted into 4,5,6,7,8, giving 4,5,6 and 6,7,8; this is
referred to as row-cutting).
This stage is quite straight forward. While doing the implementation,
some cases showed up which had to be treated separately, but they were
all comparably easily integrated into the algorithm. E.g. while the
first attempt to implement adding cards to rows was to go over all
cards and find out, if there is a place where they fit, I changed this
to going over all ends of melds and figuring out, if the card that can
be placed there is on the hand. This is computationally favorable, but
it brought up a problem, which I found later when analysing games
played by the program: Melds on the table are not sorted. Assume the
table (for some arbitrary color) contains ace,2,3 and later 5,6,7.
If a hand has the cards 3,4 of the same color, the program will first
find the possibility to add the 4 to ace,2,3 and will no longer be
able to place the 3 when encountering the second row, since the 4
is missing. Again looking towards the human approach, the best way
to solve this problem was to try to place couples of cards first and
only after that do the search described above, which works fine.
Another trick that saves computation time is to remove all melds that
can be placed on the table from the hand before looking for cards
to add to rows, but only put them onto the table after having done
that search. This reduces the number of searches, but will not
cut off any possible solutions, since any card fitting to a meld from
the hand would have been added to it straight away.
The hard part then is to figure out those moves, which are only
possible by moving cards that already are on the table from one meld
to another. The way an advanced human player approaches this problem
involves something comparable to a breadth-first recursive search.
When trying to get rid of a card, he will look iteratively for all
the cards that fulfill this purpose. Most of these will only be
available with leaving cards behind that no longer form a legal
meld. He will then basically in parallel try to find other places
on the table where he can put these cards, which again might involve
needing other cards ...
In this algorithm, the first step - and that is implemented and works fine -
restricts the search to such cards that leave a legal meld when
removed. The GetRid-routine is iteratively called for each card on
the hand and uses the LookFor-routine to systematically look for all
cards that could be used to place that card on the table, by
integrating it into a new row or set or by filling in a gap between
this card and an already existing row. Like this, the algorithm
already plays at a higher level than that of a novice, but it is
still easily beaten by an experienced player - unless the cards are
very much in favor of the computer player.
I have not been able to implement the last and hardest part so far,
the one which involves temporarily taking up several cards while
trying to restructure the table. The main problem with that is the
following: When trying to get rid of a card, this has to be achieved
either in a row or in a set, which makes is neccessary to sequentially
search these two possibilities. This algorithm tries a row first and
only resorts to a set if a row is not possible. Now assume the following
situation: Trying to get rid of Jack of Spades, the algorithm finds
the Queen of Spades "for free" (e.g. from a set of four queens) but
only gets the 10 of Spades at the price of the 10's of Diamonds and
Hearts (e.g. because it had to be taken from a set of three 10's).
Now assume further that there are possibilities to get rid of both those
10's, but only by adding it to a row for the 10 of Diamonds and only
by adding it to a set for the 10 of Hearts - of course this scenario
can easily be made arbitrarily more complex, this description is
just to describe the main problem. Trying to get rid of the two 10's,
the algorithm will now only find the solution for the 10 of Hearts,
after is has already given up on trying rows - and therefore at a
time at which it will already have forgotten the only solution for the
10 of Diamonds.
Even though the deadline for this project is reached already, I will
continue to work on this program in my spare time, since the idea
still intrigues me and I have by far not yet given up the hope to
complete the "human based" approach implementation. Additionally,
I will look for and try other ways of pruning the search space for
the exhaustive search to find out if it is possible to get a working
solution out of that direction.
Since this article will be published on the Web: Anybody who is
interested to be informed on future successes (or failures) of
this algorithm, please write me a short email, so I can tell
you of new versions.
Hendrik Lübben San Diego, December 10, 1997