The other day I annouced the Texas Hold 'em SQL Challenge.
I haven't gotten any feedback on it yet, so I have no idea if anyone is
working on it, but I thought I'd get the ball rolling and come up with
my own solution...

The first step I've decided to take in solving this problem is
to figure out how, given two or more 5-card hands, to figure out a
winner. What's necessary is a way of weighting each hand so that better
hands will have a higher value. Or at least, that's the tactic I took.

I first checked out this list of poker hands from strongest to weakest. After staring at it for a while, I realized that some patterns can be derived from the list:

- A two-pair is two pairs (wow, getting deep here, aren't we!)
- A full house is a pair plus a three of a kind
- A straight flush is a straight plus a flush
- A royal flush is a type of straight flush with the
*highest cards in the deck*

Obvious, yes; but the final point is important -- is there a way of
weighting the cards such that we can determine, using only a single
number, the weight of a given hand, independent of whether or not it
falls into one of these categories? That is, how can we know that a
straight flush with a high king is worth less than a straight flush
with a high ace? And how can we know that a hand with a pair of 10s and
a high jack beats a hand with a pair of 10s and a high 9?

The answer is to use our old friend, binary:

Card | Value |

2 | 2 |

3 | 4 |

4 | 8 |

5 | 16 |

6 | 32 |

7 | 64 |

8 | 128 |

9 | 256 |

10 | 512 |

Jack | 1024 |

Queen | 2048 |

King | 4096 |

Ace | 8192 (1) |

Note that the ace will be weighted as 1, should it happen to be part of a straight with a high 5.

So what does this binary weighting give us? The property that's
especially useful in this case is that for every power of 2, it's
impossible to reach that number by adding up any unique combination of
powers of 2 less than that power. Restated in terms of this project, if
a hand has only a high card, say a king, then that hand will be
weighted higher than any other high card hand that doesn't have either
a king or an ace.

But weighting alone isn't enough. This can be shown using the following two hands:

2 3 Q K A

2 3 Q K K

The weighted values of these hands are 14342 and 10246,
respectively (2 + 4 + 2048 + 4096 + 8192 and 2 + 4 + 2048 + 4096 +
4096). However, the second is a better hand. How to solve this issue?
Bonuses.

After conducting a completely non-scientific trial-and-error
analysis, I arrived at the following bonus structure, which seems to
work:

To weight a hand, take the base weight (i.e., the sum of the
weights of each card in the hand) and add the following bonuses per
condition:

- Pair: 2000000000 + (8192 * pair card weight)
- Three of a kind: 6000000000 + (131072 * three of a kind card weight)
- Straight: 5500000000
- Flush: 5800000000
- Four of a kind: 9500000000 + (131072 * four of a kind card weight)

Note that "pair card weight," for example, refers to the weight of
the card type that makes up the pair. So for a pair of twos, the
calculation would be (8192 * 2).

This formula can be expanded for each type of hand. Two pair?
Add 2000000000 + (8192 * the weight of the card for the first pair) +
(8192 * the weight of the card for the second pair). Full house?
Combine the pair with the three of a kind. Same with straight flush.
And since the base weights are being added, the hands will naturally be
weighted heavier for higher cards.

There may have been a more elegant solution, but this one does the trick for now.

But enough math, **on to the SQL**! What I wanted was a
table containing every possible hand and its weights. But first I
needed a table with all of the possible cards:

CREATE TABLE dbo.Cards

(

CardID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,

Suit CHAR(1) NOT NULL,

Card VARCHAR(2) NOT NULL,

Weight INT NOT NULL

)

INSERT dbo.Cards (Suit, Card, Weight)

SELECT S.Suit, C.Card, C.Weight

FROM

(

SELECT 'H'

UNION ALL

SELECT 'D'

UNION ALL

SELECT 'S'

UNION ALL

SELECT 'C'

) S (Suit)

CROSS JOIN

(

SELECT CONVERT(VARCHAR(2), number), POWER(2, number-1)

FROM master..spt_values

WHERE type='p'

AND number BETWEEN 2 AND 10

UNION ALL

SELECT 'J', 1024

UNION ALL

SELECT 'Q', 2048

UNION ALL

SELECT 'K', 4096

UNION ALL

SELECT 'A', 8192

) C (Card, Weight)

ORDER BY

Suit,

Weight

GO

Yes, I know that (Suit, Card) could have formed a natural primary
key, but the table of hands is going ta have 2598960 rows -- which you
can prove using the formulas from this website:

52 cards, 5 cards per hand:

n!

n_C_k = ----------

k!(n - k)!

52!

= -----------

5!(52 - 5)!

52!

= -----------

5!(47)!

52 * 51 * 50 * 49 * 48

= ----------------------

5 * 4 * 3 * 2 * 1

311875200

= -----------

120

= 2598960

...And to believe that my high school algebra teacher had the gall to flunk me!

Anyway, with 2.59 million rows, I'd rather just use a single-column
surrogate. It makes like a lot easier. So how do we generate all of
these rows?

First, we need every possible hand:

SELECT

C1.CardId AS Card1,

C2.CardId AS Card2,

C3.CardId AS Card3,

C4.CardId AS Card4,

C5.CardId AS Card5

FROM dbo.Cards C1

JOIN dbo.Cards C2 ON

C2.CardId > C1.CardId

JOIN dbo.Cards C3 ON

C3.CardId > C2.CardId

JOIN dbo.Cards C4 ON

C4.CardId > C3.CardID

JOIN dbo.Cards C5 ON

C5.CardId > C4.CardId

Easy enough. And notice how nice and simple that surrogate makes life... Laziness is a virtue!

Next step, apply the weighting logic. This is quite a bit more
complex, and I'm not going to spend a lot of time going into detail
since it's currently quite late here and I need to work in the morning.
But remember the following rules: If you have two or more of the same
index in a hand (pair, three of a kind, four of a kind), that hand
can't be a flush. And if you have only two of a given card, that hand
can't have a three of a kind of that card. And if you have only three
of a card, that hand can't have a four of a kind for that card!

Given these rules, I realized that you can test for all of these
conditions and add up the results, without doing any kind of
backtracking. This made the job quite a bit simpler. I did the
following: First, calculate the base weight and figure out whether we
have a straight, for any given row. Add up the results. Then "unpivot"
(using a cross-join to five rows of numbers) and use aggregates to
figure out what pairs, threes of a kind, fours of a kind, and flushes
exist in each hand... Re-pivot, and stuff everything into a table. Oh,
and I implemented some rudimentary batching so that 2.5 million rows
wouldn't be inserted in a single transaction.

The complete SQL is here:

CREATE TABLE dbo.PokerHands

(

Card1 INT NOT NULL,

Card2 INT NOT NULL,

Card3 INT NOT NULL,

Card4 INT NOT NULL,

Card5 INT NOT NULL,

Weight BIGINT NOT NULL,

CONSTRAINT PK_PokerHands PRIMARY KEY CLUSTERED

(Card1, Card2, Card3, Card4, Card5)

)

GO

DECLARE @CardId INT

SET @CardId = 1

WHILE @CardId <= 52

BEGIN

INSERT dbo.PokerHands

(Card1, Card2, Card3, Card4, Card5, Weight)

SELECT

y.Card1,

y.Card2,

y.Card3,

y.Card4,

y.Card5,

-- 2

CASE

SUM(

CASE y.CardWeight

WHEN 2 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 2)

WHEN 3 THEN 6000000000 + (131072 * 2)

WHEN 2 THEN 2000000000 + (8192 * 2)

ELSE 0

END +

-- 3

CASE

SUM(

CASE y.CardWeight

WHEN 4 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 4)

WHEN 3 THEN 6000000000 + (131072 * 4)

WHEN 2 THEN 2000000000 + (8192 * 4)

ELSE 0

END +

-- 4

CASE

SUM(

CASE y.CardWeight

WHEN 8 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 8)

WHEN 3 THEN 6000000000 + (131072 * 8)

WHEN 2 THEN 2000000000 + (8192 * 8)

ELSE 0

END +

-- 5

CASE

SUM(

CASE y.CardWeight

WHEN 16 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 16)

WHEN 3 THEN 6000000000 + (131072 * 16)

WHEN 2 THEN 2000000000 + (8192 * 16)

ELSE 0

END +

-- 6

CASE

SUM(

CASE y.CardWeight

WHEN 32 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 32)

WHEN 3 THEN 6000000000 + (131072 * 32)

WHEN 2 THEN 2000000000 + (8192 * 32)

ELSE 0

END +

-- 7

CASE

SUM(

CASE y.CardWeight

WHEN 64 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 64)

WHEN 3 THEN 6000000000 + (131072 * 64)

WHEN 2 THEN 2000000000 + (8192 * 64)

ELSE 0

END +

-- 8

CASE

SUM(

CASE y.CardWeight

WHEN 2 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 128)

WHEN 3 THEN 6000000000 + (131072 * 128)

WHEN 2 THEN 2000000000 + (8192 * 128)

ELSE 0

END +

-- 9

CASE

SUM(

CASE y.CardWeight

WHEN 256 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 256)

WHEN 3 THEN 6000000000 + (131072 * 256)

WHEN 2 THEN 2000000000 + (8192 * 256)

ELSE 0

END +

-- 10

CASE

SUM(

CASE y.CardWeight

WHEN 512 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 512)

WHEN 3 THEN 6000000000 + (131072 * 512)

WHEN 2 THEN 2000000000 + (8192 * 512)

ELSE 0

END +

-- J

CASE

SUM(

CASE y.CardWeight

WHEN 1024 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 1024)

WHEN 3 THEN 6000000000 + (131072 * 1024)

WHEN 2 THEN 2000000000 + (8192 * 1024)

ELSE 0

END +

-- Q

CASE

SUM(

CASE y.CardWeight

WHEN 2048 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 2048)

WHEN 3 THEN 6000000000 + (131072 * 2048)

WHEN 2 THEN 2000000000 + (8192 * 2048)

ELSE 0

END +

-- K

CASE

SUM(

CASE y.CardWeight

WHEN 4096 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 4096)

WHEN 3 THEN 6000000000 + (131072 * 4096)

WHEN 2 THEN 2000000000 + (8192 * 4096)

ELSE 0

END +

-- A

CASE

SUM(

CASE y.CardWeight

WHEN 8192 THEN 1

ELSE 0

END

)

WHEN 4 THEN 9500000000 + (131072 * 8192)

WHEN 3 THEN 6000000000 + (131072 * 8192)

WHEN 2 THEN 2000000000 + (8192 * 8192)

ELSE 0

END +

--Flush

CASE COUNT(DISTINCT y.CardSuit)

WHEN 1 THEN 5800000000

ELSE 0

END +

--Base Weight + Straight

MIN(y.TotalWeight) AS Weight

FROM

(

SELECT TOP 100 PERCENT

C1.CardId AS Card1,

C2.CardId AS Card2,

C3.CardId AS Card3,

C4.CardId AS Card4,

C5.CardId AS Card5,

--Straight

CASE WHEN

(C2.Weight = 2 * C1.Weight

AND C3.Weight = 2 * C2.Weight

AND C4.Weight = 2 * C3.Weight

AND (

(C5.Weight = 2 * C4.Weight)

OR

(C2.Card = '2' AND C5.Card = 'A')))

THEN

CASE

WHEN (C2.Card = '2' AND C5.Card = 'A')

THEN 5500000000 - 8191 --Make A = weight 1

ELSE 5500000000

END

ELSE 0

END +

--Base weight

C1.Weight + C2.Weight + C3.Weight + C4.Weight + C5.Weight

AS TotalWeight,

CASE x.number

WHEN 1 THEN C1.Weight

WHEN 2 THEN C2.Weight

WHEN 3 THEN C3.Weight

WHEN 4 THEN C4.Weight

WHEN 5 THEN C5.Weight

END AS CardWeight,

CASE x.number

WHEN 1 THEN C1.Suit

WHEN 2 THEN C2.Suit

WHEN 3 THEN C3.Suit

WHEN 4 THEN C4.Suit

WHEN 5 THEN C5.Suit

END AS CardSuit

FROM dbo.Cards C1

JOIN dbo.Cards C2 ON

C2.CardId > C1.CardId

JOIN dbo.Cards C3 ON

C3.CardId > C2.CardId

JOIN dbo.Cards C4 ON

C4.CardId > C3.CardID

JOIN dbo.Cards C5 ON

C5.CardId > C4.CardId

CROSS JOIN

(

SELECT number

FROM master..spt_values

WHERE type='p'

AND number BETWEEN 1 and 5

) x

WHERE C1.CardId = @CardId

ORDER BY

C1.CardId,

C2.CardId,

C3.CardId,

C4.CardId,

C5.CardId

) y

GROUP BY

y.Card1,

y.Card2,

y.Card3,

y.Card4,

y.Card5

SET @CardId = @CardId + 1

END

... And that's about it! Assuming I made no logic mistakes -- which
of course is a very safe assumption because it's me and I never do that
-- this is a relatively good start to the Challenge! So take it from
here... Or at least until I post the next piece of the puzzle.

In retrospect, I think this would be a good use for a CLR UDT.
If I have some time, I'll post that version in the coming days. Enjoy!

## Comments