THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Adam Machanic

Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.

Texas Hold 'em Hint #1

The other day I annouced the Texas Hold 'em SQL Challenge. I haven't gotten any feedback on it yet, so I have no idea if anyone is working on it, but I thought I'd get the ball rolling and come up with my own solution...

The first step I've decided to take in solving this problem is to figure out how, given two or more 5-card hands, to figure out a winner. What's necessary is a way of weighting each hand so that better hands will have a higher value. Or at least, that's the tactic I took.

I first checked out this list of poker hands from strongest to weakest. After staring at it for a while, I realized that some patterns can be derived from the list:

 

  • A two-pair is two pairs (wow, getting deep here, aren't we!)
  • A full house is a pair plus a three of a kind
  • A straight flush is a straight plus a flush
  • A royal flush is a type of straight flush with the highest cards in the deck

Obvious, yes; but the final point is important -- is there a way of weighting the cards such that we can determine, using only a single number, the weight of a given hand, independent of whether or not it falls into one of these categories? That is, how can we know that a straight flush with a high king is worth less than a straight flush with a high ace? And how can we know that a hand with a pair of 10s and a high jack beats a hand with a pair of 10s and a high 9?

The answer is to use our old friend, binary:

 

CardValue
22
34
48
516
632
764
8128
9256
10512
Jack1024
Queen2048
King4096
Ace8192 (1)

Note that the ace will be weighted as 1, should it happen to be part of a straight with a high 5.

So what does this binary weighting give us? The property that's especially useful in this case is that for every power of 2, it's impossible to reach that number by adding up any unique combination of powers of 2 less than that power. Restated in terms of this project, if a hand has only a high card, say a king, then that hand will be weighted higher than any other high card hand that doesn't have either a king or an ace.

But weighting alone isn't enough. This can be shown using the following two hands:

2 3 Q K A
2 3 Q K K

The weighted values of these hands are 14342 and 10246, respectively (2 + 4 + 2048 + 4096 + 8192 and 2 + 4 + 2048 + 4096 + 4096). However, the second is a better hand. How to solve this issue? Bonuses.

After conducting a completely non-scientific trial-and-error analysis, I arrived at the following bonus structure, which seems to work:

To weight a hand, take the base weight (i.e., the sum of the weights of each card in the hand) and add the following bonuses per condition:

 

  • Pair: 2000000000 + (8192 * pair card weight)
  • Three of a kind: 6000000000 + (131072 * three of a kind card weight)
  • Straight: 5500000000
  • Flush: 5800000000
  • Four of a kind: 9500000000 + (131072 * four of a kind card weight)

Note that "pair card weight," for example, refers to the weight of the card type that makes up the pair. So for a pair of twos, the calculation would be (8192 * 2).

This formula can be expanded for each type of hand. Two pair? Add 2000000000 + (8192 * the weight of the card for the first pair) + (8192 * the weight of the card for the second pair). Full house? Combine the pair with the three of a kind. Same with straight flush. And since the base weights are being added, the hands will naturally be weighted heavier for higher cards.

There may have been a more elegant solution, but this one does the trick for now.

But enough math, on to the SQL! What I wanted was a table containing every possible hand and its weights. But first I needed a table with all of the possible cards:

 

CREATE TABLE dbo.Cards
(
CardID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
Suit CHAR(1) NOT NULL,
Card VARCHAR(2) NOT NULL,
Weight INT NOT NULL
)

INSERT dbo.Cards (Suit, Card, Weight)
SELECT S.Suit, C.Card, C.Weight
FROM
(
SELECT 'H'
UNION ALL
SELECT 'D'
UNION ALL
SELECT 'S'
UNION ALL
SELECT 'C'
) S (Suit)
CROSS JOIN
(
SELECT CONVERT(VARCHAR(2), number), POWER(2, number-1)
FROM master..spt_values
WHERE type='p'
AND number BETWEEN 2 AND 10
UNION ALL
SELECT 'J', 1024
UNION ALL
SELECT 'Q', 2048
UNION ALL
SELECT 'K', 4096
UNION ALL
SELECT 'A', 8192
) C (Card, Weight)
ORDER BY
Suit,
Weight
GO

Yes, I know that (Suit, Card) could have formed a natural primary key, but the table of hands is going ta have 2598960 rows -- which you can prove using the formulas from this website:

 

52 cards, 5 cards per hand:

n!
n_C_k = ----------
k!(n - k)!

52!
= -----------
5!(52 - 5)!

52!
= -----------
5!(47)!

52 * 51 * 50 * 49 * 48
= ----------------------
5 * 4 * 3 * 2 * 1

311875200
= -----------
120

= 2598960

...And to believe that my high school algebra teacher had the gall to flunk me!

Anyway, with 2.59 million rows, I'd rather just use a single-column surrogate. It makes like a lot easier. So how do we generate all of these rows?

First, we need every possible hand:

 

SELECT
C1.CardId AS Card1,
C2.CardId AS Card2,
C3.CardId AS Card3,
C4.CardId AS Card4,
C5.CardId AS Card5
FROM dbo.Cards C1
JOIN dbo.Cards C2 ON
C2.CardId > C1.CardId
JOIN dbo.Cards C3 ON
C3.CardId > C2.CardId
JOIN dbo.Cards C4 ON
C4.CardId > C3.CardID
JOIN dbo.Cards C5 ON
C5.CardId > C4.CardId

Easy enough. And notice how nice and simple that surrogate makes life... Laziness is a virtue!

Next step, apply the weighting logic. This is quite a bit more complex, and I'm not going to spend a lot of time going into detail since it's currently quite late here and I need to work in the morning. But remember the following rules: If you have two or more of the same index in a hand (pair, three of a kind, four of a kind), that hand can't be a flush. And if you have only two of a given card, that hand can't have a three of a kind of that card. And if you have only three of a card, that hand can't have a four of a kind for that card!

Given these rules, I realized that you can test for all of these conditions and add up the results, without doing any kind of backtracking. This made the job quite a bit simpler. I did the following: First, calculate the base weight and figure out whether we have a straight, for any given row. Add up the results. Then "unpivot" (using a cross-join to five rows of numbers) and use aggregates to figure out what pairs, threes of a kind, fours of a kind, and flushes exist in each hand... Re-pivot, and stuff everything into a table. Oh, and I implemented some rudimentary batching so that 2.5 million rows wouldn't be inserted in a single transaction.

The complete SQL is here:

 

CREATE TABLE dbo.PokerHands
(
Card1 INT NOT NULL,
Card2 INT NOT NULL,
Card3 INT NOT NULL,
Card4 INT NOT NULL,
Card5 INT NOT NULL,
Weight BIGINT NOT NULL,
CONSTRAINT PK_PokerHands PRIMARY KEY CLUSTERED
(Card1, Card2, Card3, Card4, Card5)
)
GO

DECLARE @CardId INT
SET @CardId = 1

WHILE @CardId <= 52
BEGIN
INSERT dbo.PokerHands
(Card1, Card2, Card3, Card4, Card5, Weight)
SELECT
y.Card1,
y.Card2,
y.Card3,
y.Card4,
y.Card5,
-- 2
CASE
SUM(
CASE y.CardWeight
WHEN 2 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 2)
WHEN 3 THEN 6000000000 + (131072 * 2)
WHEN 2 THEN 2000000000 + (8192 * 2)
ELSE 0
END +
-- 3
CASE
SUM(
CASE y.CardWeight
WHEN 4 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 4)
WHEN 3 THEN 6000000000 + (131072 * 4)
WHEN 2 THEN 2000000000 + (8192 * 4)
ELSE 0
END +
-- 4
CASE
SUM(
CASE y.CardWeight
WHEN 8 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 8)
WHEN 3 THEN 6000000000 + (131072 * 8)
WHEN 2 THEN 2000000000 + (8192 * 8)
ELSE 0
END +
-- 5
CASE
SUM(
CASE y.CardWeight
WHEN 16 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 16)
WHEN 3 THEN 6000000000 + (131072 * 16)
WHEN 2 THEN 2000000000 + (8192 * 16)
ELSE 0
END +
-- 6
CASE
SUM(
CASE y.CardWeight
WHEN 32 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 32)
WHEN 3 THEN 6000000000 + (131072 * 32)
WHEN 2 THEN 2000000000 + (8192 * 32)
ELSE 0
END +
-- 7
CASE
SUM(
CASE y.CardWeight
WHEN 64 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 64)
WHEN 3 THEN 6000000000 + (131072 * 64)
WHEN 2 THEN 2000000000 + (8192 * 64)
ELSE 0
END +
-- 8
CASE
SUM(
CASE y.CardWeight
WHEN 2 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 128)
WHEN 3 THEN 6000000000 + (131072 * 128)
WHEN 2 THEN 2000000000 + (8192 * 128)
ELSE 0
END +
-- 9
CASE
SUM(
CASE y.CardWeight
WHEN 256 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 256)
WHEN 3 THEN 6000000000 + (131072 * 256)
WHEN 2 THEN 2000000000 + (8192 * 256)
ELSE 0
END +
-- 10
CASE
SUM(
CASE y.CardWeight
WHEN 512 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 512)
WHEN 3 THEN 6000000000 + (131072 * 512)
WHEN 2 THEN 2000000000 + (8192 * 512)
ELSE 0
END +
-- J
CASE
SUM(
CASE y.CardWeight
WHEN 1024 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 1024)
WHEN 3 THEN 6000000000 + (131072 * 1024)
WHEN 2 THEN 2000000000 + (8192 * 1024)
ELSE 0
END +
-- Q
CASE
SUM(
CASE y.CardWeight
WHEN 2048 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 2048)
WHEN 3 THEN 6000000000 + (131072 * 2048)
WHEN 2 THEN 2000000000 + (8192 * 2048)
ELSE 0
END +
-- K
CASE
SUM(
CASE y.CardWeight
WHEN 4096 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 4096)
WHEN 3 THEN 6000000000 + (131072 * 4096)
WHEN 2 THEN 2000000000 + (8192 * 4096)
ELSE 0
END +
-- A
CASE
SUM(
CASE y.CardWeight
WHEN 8192 THEN 1
ELSE 0
END
)
WHEN 4 THEN 9500000000 + (131072 * 8192)
WHEN 3 THEN 6000000000 + (131072 * 8192)
WHEN 2 THEN 2000000000 + (8192 * 8192)
ELSE 0
END +
--Flush
CASE COUNT(DISTINCT y.CardSuit)
WHEN 1 THEN 5800000000
ELSE 0
END +
--Base Weight + Straight
MIN(y.TotalWeight) AS Weight
FROM
(
SELECT TOP 100 PERCENT
C1.CardId AS Card1,
C2.CardId AS Card2,
C3.CardId AS Card3,
C4.CardId AS Card4,
C5.CardId AS Card5,
--Straight
CASE WHEN
(C2.Weight = 2 * C1.Weight
AND C3.Weight = 2 * C2.Weight
AND C4.Weight = 2 * C3.Weight
AND (
(C5.Weight = 2 * C4.Weight)
OR
(C2.Card = '2' AND C5.Card = 'A')))
THEN
CASE
WHEN (C2.Card = '2' AND C5.Card = 'A')
THEN 5500000000 - 8191 --Make A = weight 1
ELSE 5500000000
END
ELSE 0
END +
--Base weight
C1.Weight + C2.Weight + C3.Weight + C4.Weight + C5.Weight
AS TotalWeight,
CASE x.number
WHEN 1 THEN C1.Weight
WHEN 2 THEN C2.Weight
WHEN 3 THEN C3.Weight
WHEN 4 THEN C4.Weight
WHEN 5 THEN C5.Weight
END AS CardWeight,
CASE x.number
WHEN 1 THEN C1.Suit
WHEN 2 THEN C2.Suit
WHEN 3 THEN C3.Suit
WHEN 4 THEN C4.Suit
WHEN 5 THEN C5.Suit
END AS CardSuit
FROM dbo.Cards C1
JOIN dbo.Cards C2 ON
C2.CardId > C1.CardId
JOIN dbo.Cards C3 ON
C3.CardId > C2.CardId
JOIN dbo.Cards C4 ON
C4.CardId > C3.CardID
JOIN dbo.Cards C5 ON
C5.CardId > C4.CardId
CROSS JOIN
(
SELECT number
FROM master..spt_values
WHERE type='p'
AND number BETWEEN 1 and 5
) x
WHERE C1.CardId = @CardId
ORDER BY
C1.CardId,
C2.CardId,
C3.CardId,
C4.CardId,
C5.CardId
) y
GROUP BY
y.Card1,
y.Card2,
y.Card3,
y.Card4,
y.Card5

SET @CardId = @CardId + 1
END

... And that's about it! Assuming I made no logic mistakes -- which of course is a very safe assumption because it's me and I never do that -- this is a relatively good start to the Challenge! So take it from here... Or at least until I post the next piece of the puzzle.

In retrospect, I think this would be a good use for a CLR UDT. If I have some time, I'll post that version in the coming days. Enjoy!


Published Wednesday, July 12, 2006 10:46 PM by Adam Machanic
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

About Adam Machanic

Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development, and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at conferences and training events on a variety of SQL Server topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, a Microsoft Certified IT Professional (MCITP), and an alumnus of the INETA North American Speakers Bureau.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement