<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://www2.sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Adam Machanic : T-SQL</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx</link><description>Tags: T-SQL</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>What Happened Today? DATE and Date Ranges Over DATETIME</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2009/10/20/what-happened-today-date-and-date-ranges-over-datetime.aspx</link><pubDate>Tue, 20 Oct 2009 18:21:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:18023</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>14</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/18023.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=18023</wfw:commentRss><description>A few days ago Aaron posted yet another fantastic entry in his Bad Habits series, this one discussing mishandling of date ranges in queries . This is a topic near and dear to me, having had to clean up a lot of poorly thought out code in the past few...(&lt;a href="http://www2.sqlblog.com/blogs/adam_machanic/archive/2009/10/20/what-happened-today-date-and-date-ranges-over-datetime.aspx"&gt;read more&lt;/a&gt;)&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=18023" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/date/default.aspx">date</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/datetime/default.aspx">datetime</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Optimization/default.aspx">Optimization</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Performance/default.aspx">Performance</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Grouped String Concatenation: ... The Winner Is ...</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2009/05/31/grouped-string-concatenation-the-winner-is.aspx</link><pubDate>Sun, 31 May 2009 20:07:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:14360</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>18</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/14360.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=14360</wfw:commentRss><description>&lt;p&gt;After weeks of putting it off, I finally found the time and spent the last day and a half judging the &lt;a href="http://sqlblog.com/blogs/adam_machanic/archive/2009/02/27/t-sql-challenge-grouped-string-concatenation.aspx"&gt;Grouped String Concatenation Challenge&lt;/a&gt;. I would like to congratulate the winner, &lt;a href="http://weblogs.sqlteam.com/peterl/"&gt;&lt;b&gt;Peter Larsson&lt;/b&gt;&lt;/a&gt;, who submitted a great query and walks away with a shiny new MSDN Premium subscription.&lt;/p&gt;&lt;p&gt;For those who are interested, following is a breakdown of the judging process, along with some commentary: &lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;u&gt;&lt;b&gt;Submission Process&lt;/b&gt;&lt;/u&gt; &lt;br&gt;&lt;/p&gt;&lt;p&gt;To begin with, e-mails. As I mentioned in the first post, I ignored all e-mails that didn't follow the directions. Luckily this was only a few submissions. I felt it rather odd that people would spend a not insignificant amount of time working up a solution only to not bother to read the guidelines thoroughly. But that's human nature, I suppose.&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;u&gt;&lt;b&gt;Round 1&lt;/b&gt;&lt;/u&gt; &lt;br&gt;&lt;/p&gt;&lt;p&gt;Once I collected all of the queries that followed the e-mail rules (all of which are included in the attached ZIP file), I began testing against an expanded version of AdventureWorks (the script for that is also included). I decided to &lt;a href="http://sqlblog.com/blogs/adam_machanic/archive/2009/03/15/the-grouped-string-concatenation-challenge-is-closed.aspx"&gt;eliminate any queries that did not produce the correct output data&lt;/a&gt; based on my sample set, or which took longer than 30 seconds to complete. The majority of queries did complete in a reasonable amount of time, and many were eliminated because the output simply wasn't correct. The biggest issue was ordering of the elements in the comma-delimited sets. I also deducted points from one person's entry because of invalid column names, but I decided to let the entry ride to the next round.&lt;/p&gt;&lt;p&gt;An important side note is that I created this competition with the sole intention of discovering new and different ways to do grouped string concatenation, and my hope was that someone would come up with a clever, fast solution. Unfortunately, that didn't happen, and every submission that used any technique except FOR XML PATH was eliminated in the first round of testing. I received some extremely creative solutions from a couple of people and I would like to mention them here:&lt;br&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Alejandro Mesa's submissions made use of various XQuery techniques, and are very interesting to look at, although fairly slow&lt;/li&gt;&lt;li&gt;Dean Cochrane's submission used an interesting idea of doing a MAX(CASE ...) pivot for the lists. Alas, the product names lists were not correct, so the submission didn't make it to the stress testing phase&lt;/li&gt;&lt;li&gt;Scott Coleman tried a similar technique, actually using the PIVOT keyword. Unfortunately, this ran for over 200 seconds, so it was eliminated&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Also interesting to note is that a few people tried recursive CTE solutions. These were all cancelled at the 300 second mark. Recursive CTEs, &lt;a href="http://sqlblog.com/blogs/linchi_shea/archive/2009/04/16/recursive-sql-queries-how-do-they-work.aspx"&gt;as mentioned before on here on SQLblog&lt;/a&gt;, simply do not scale in their current implementation.&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;u&gt;&lt;b&gt;Round 2&lt;/b&gt;&lt;/u&gt;&lt;/p&gt;&lt;p&gt;After tabulating the Round 1 results I was left with 18 queries, and some obvious contenders.&amp;nbsp; I ran each query through a SQLQueryStress session with 10 threads running 5 iterations each. In this phase the queries were separated into fairly distinct groups: Those that ran for around 5 minutes, those that ran for around 7-8 minutes, and those that ran longer. These groups were based, not surprisingly, on how much attention was paid by the query writers to the little details. For example, Peter Larsson's winning query cut down on logical reads dramatically by doing some of the grouping in a derived table, rather than in the outermost query as some of the other submissions did. &lt;/p&gt;&lt;p&gt;Lesson learned: When doing aggregations, especially when joining a lot of tables, think about what you're &lt;i&gt;really&lt;/i&gt; aggregating, and do the aggregation as early as possible. For example, if you need to aggregate sales per customer and get customer names, do the aggregation of the sales numbers first, &lt;i&gt;then&lt;/i&gt; join out to get the customer names. Otherwise the query processor is forced to do more work than it has to do, and your query won't be as fast. Peter and a few other contestants understood this distinction and wrote queries that were much faster as a result.&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;u&gt;&lt;b&gt;Round 3&lt;/b&gt;&lt;/u&gt;&lt;/p&gt;&lt;p&gt;Round 2 eliminated 4 queries, leaving me with 14 to judge based on query style. In order to judge consistently, I came up with 10 factors. A query was allotted 500 points to start, and failure to meet each factor resulted in a 50 point penalty. These factors were:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Consistent Indentation&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Does the query use the same rules for indentation in all parts? This is huge for readability and helps people understand where each section of the query starts and ends.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Consistent Capitalization&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Does the query use the same rules for capitalization throughout? For example, keywords should be either all capitalized, or all lowercase.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Capitalize Keywords&lt;/li&gt;&lt;ul&gt;&lt;li&gt;I like to see keywords capitalized.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Use AS for Alias Names&lt;/li&gt;&lt;ul&gt;&lt;li&gt;AS is optional, and I've left it out in many queries I've written. But the more of other peoples' code I read, the more I realize that it really does help on the readability front. Use it. Always.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Follow Capitalization of Base Tables/Columns&lt;/li&gt;&lt;ul&gt;&lt;li&gt;If the base table is called OrderHeader, I want to see it used as OrderHeader when referenced in your query, rather than orderheader. A trainer I know found this out the hard way, when he reinstalled SQL Server on his laptop shortly before a training session, and used a case-sensitive collation rather than his previously-installed case-insensitive collation. He had been careless in adhering to capitalization for his training materials, and discovered the issue in front of the class. Oops.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Long Horizontal Lists&lt;/li&gt;&lt;ul&gt;&lt;li&gt;I don't like horizontal scrolling, and I find long lists difficult to read.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Consistent Vertical Lists&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Put either a comma after each element or before each element, not both. Indent your lists the same way throughout. If you indent some items below the SELECT, don't put other items on the same line as the SELECT (or GROUP BY, or ORDER BY, etc)&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Alignment of Delimiters&lt;/li&gt;&lt;ul&gt;&lt;li&gt;I follow a .NET-inspired style where I put delimiters on their own lines, and line them up vertically. This gives my code what I feel is an airy, easy-to-read feel. When reading others' code I look for some kind of alignment. Failure to align delimiters makes it very difficult to understand, again, where one section begins and another ends. By the way, common delimiters for this challenge included both parens and CASE...END.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Comments&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Does the query have comments? Are the comments useful in understanding the logic?&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Aesthetics&lt;/li&gt;&lt;ul&gt;&lt;li&gt;This is perhaps the most subjective. My general feeling on how I enjoyed reading the code. &lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;p&gt;All in all, the queries were pretty good. I would like to call out Rick Halliday, who had the highest score in this round with some very well formatted and highly readable code.&lt;br&gt;&lt;br&gt;&lt;u&gt;&lt;b&gt;&lt;/b&gt;&lt;/u&gt;&lt;/p&gt;&lt;p&gt;&lt;u&gt;&lt;b&gt;Round 4&lt;/b&gt;&lt;/u&gt;&lt;br&gt;&lt;br&gt;After judging Round 3 I tallied all of the scores and was left with a tie for top 3:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Rick Halliday&lt;/li&gt;&lt;li&gt;Leonid Koyfman&lt;/li&gt;&lt;li&gt;Peter Larsson's query #4&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;All three of these queries were well thought out, but only one could win, so I took another pass through each. Rick's query, though extremely well written and readable, was eliminetad first due to the fact that it performed worse than the other two. This left Leonid and Peter. It was a tough choice, but I had to give the prize to Peter for taking the time to really think through the problem and figure out exactly how best to do the aggregations. Leonid was a very, &lt;i&gt;very &lt;/i&gt;close second, and I really wish I had a consolation prize for him.&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;u&gt;&lt;b&gt;The End&lt;/b&gt;&lt;/u&gt;&lt;/p&gt;&lt;p&gt;And that's that. Thank you to everyone who participated in the challenge. I hope it was as much a learning experience for you as it was for me. Congratulations again to Peter. All of the materials are attached in the ZIP file; please let me know if you have any questions, comments, etc. &lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=14360" width="1" height="1"&gt;</description><enclosure url="http://www2.sqlblog.com/blogs/adam_machanic/attachment/14360.ashx" length="52865" type="application/x-zip-compressed" /><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/challenge/default.aspx">challenge</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/concatenation/default.aspx">concatenation</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>The Grouped String Concatenation Challenge is Closed</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2009/03/15/the-grouped-string-concatenation-challenge-is-closed.aspx</link><pubDate>Mon, 16 Mar 2009 00:22:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:12641</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>35</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/12641.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=12641</wfw:commentRss><description>&lt;p&gt;Just over two weeks ago I posted the &lt;a href="http://sqlblog.com/blogs/adam_machanic/archive/2009/02/27/t-sql-challenge-grouped-string-concatenation.aspx"&gt;Grouped String Concatenation Challenge&lt;/a&gt;. A more difficult challenge than the last one I posted, partially in hopes that not as many people would submit solutions and it would be easier for me to judge. But alas, my readers are obviously really into this stuff, and I received submissions from 39 people, some of whom sent me up to four different solutions. I didn't expect such an amazing response, and now it's up to me to sort through all of these e-mails and declare a winner--no easy task.&lt;/p&gt;&lt;p&gt;The submissions I've looked at so far have been of very high quality, and most of them use FOR XML PATH('') in some way. A few people experimented with other techniques--which is exactly what I was hoping for--and I'll be especially interested in seeing how their solutions stack up. I also spent a lot of time this last two weeks working on alternative solutions, and most of my attempts didn't scale well.&amp;nbsp; I'm hoping that some reader managed to come up with a trick I haven't thought of yet.&lt;/p&gt;&lt;p&gt;The judging will be done as follows:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;I'll run each submission and test it for correctness against my own benchmark query (the results of which were attached to the original post). If any of the rows don't match exactly, that submission will be disqualified.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Next I will run each query twice, and will eliminate any query that doesn't run in less than four seconds on my notebook (the fastest ones I've already seen run in two).&lt;br&gt;&lt;/li&gt;&lt;li&gt;The remaining queries will be run through SQLQueryStress sessions on a version of AdventureWorks modified to have twice as many customers and eight times as many orders. Queries in this stage will be scored on a bell curve where the top 15% will receive a score of 10, the upper-middle 35% a score of 6, the lower-middle 35% a score of 2, and the rest a score of 0.&amp;nbsp; The 0s will be eliminated.&lt;/li&gt;&lt;li&gt;The remaining queries will then be evaluated for readability. This is purely subjective and since I'm the judge my idea of what constitutes readability wins. So I hope you've been reading my blog for a while and have absorbed some of my best practices in this area! Queries will be scored based on the same curve as before.&lt;/li&gt;&lt;li&gt;At this stage the queries with the highest combined scores will be weighted based on innovation and the ability to apply their techniques to a general pattern, in order to find the number one query.&lt;br&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Thanks again to everyone who submitted! I'm looking forward to judging the submissions. Watch this space for a comprehensive breakdown of the results, to be posted sometime in the next couple of weeks.&lt;br&gt;&lt;/p&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=12641" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/challenge/default.aspx">challenge</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/concatenation/default.aspx">concatenation</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>T-SQL Challenge: Grouped String Concatenation</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2009/02/27/t-sql-challenge-grouped-string-concatenation.aspx</link><pubDate>Fri, 27 Feb 2009 18:22:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:12307</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>42</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/12307.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=12307</wfw:commentRss><description>&lt;p&gt;It's been quite a while since the &lt;a href="http://sqlblog.com/blogs/adam_machanic/archive/2008/04/22/sql-server-query-processing-puzzle-like-vs.aspx"&gt;LIKE vs ? Puzzle&lt;/a&gt;,
and I feel like it's time for another one. Response was overwhelming
last time, and I'm back with a much tougher puzzle and a much bigger
prize. So get ready, because I'm going to really make you stretch your
brain and your T-SQL skills for this one.&lt;br&gt;
&lt;/p&gt;&lt;p&gt;But first, a bit of background. String concatenation is &lt;a href="http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/rowset-string-concatenation-which-method-is-best.aspx"&gt;something I've talked about on this blog before&lt;/a&gt;,
and it is an incredibly popular topic; my post on the subject has
gotten more hits than any other single post I've ever done. TechNet
blogger Ward Pond also understands the popularity of the topic, having &lt;a href="http://blogs.technet.com/wardpond/archive/2008/03/13/database-programming-the-string-concatenation-xml-trick.aspx"&gt;discussed&lt;/a&gt; &lt;a href="http://blogs.technet.com/wardpond/archive/2008/03/15/database-programming-the-string-concatenation-xml-trick-revisited-or-adam-is-right-but-we-can-fix-it.aspx"&gt;concatenation&lt;/a&gt; at &lt;a href="http://blogs.technet.com/wardpond/archive/2008/03/18/database-programming-the-string-concatenation-xml-trick-sans-entitization.aspx"&gt;least&lt;/a&gt; &lt;a href="http://blogs.technet.com/wardpond/archive/2008/03/21/database-programming-the-string-concatenation-xml-trick-finalized.aspx"&gt;five&lt;/a&gt; &lt;a href="http://blogs.technet.com/wardpond/archive/2009/02/26/database-programming-the-string-concatenation-xml-trick-revisited.aspx"&gt;times&lt;/a&gt; on his blog. And as illustrated &lt;a href="http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/"&gt;in this excellent article by Anith Sen&lt;/a&gt;, there are a number of methods available to the intrepid SQL Server explorer; the techniques Ward and I show are just the tip of the iceberg. &lt;/p&gt;&lt;p&gt;And
while all of that is great, there is a deep and troublesome hidden
problem: Nowhere in my post, nor Anith's article, nor Ward's series,
will you find a technique that completely solves the concatenation
problem. The FOR XML PATH('') method--by far the most popular SQL
Server 2005 "trick" I see repeated over and over in these articles and
on forums--is a bit limiting. It doesn't help when we need to "group"
our string concatenation, i.e. concatenate strings for a number of key
values and return multiple rows in the result. And FOR XML PATH('')
also leaves us a bit flat if we need to return aggregated data with the
concatenated strings or--even more interesting--concatenate multiple
different columns in the same output. &lt;/p&gt;&lt;p&gt;Sure, there are ways to
solve this problem, but they usually require temp tables, user-defined
functions (CLR or otherwise), tables of numbers, cursors, or other
adjunct objects. And while some of these solutions are certainly
workable they lack the beauty of a single, self-contained solution. In
the interest of solving this problem I recently created a challenge for
myself: Figure out how to do "grouped" concatenation using nothing more
than a single T-SQL statement. No temp tables. No UDFs. No procedural
logic. &lt;/p&gt;&lt;p&gt;But rather than do the work all alone and simply post my
solution, I've decided to invite you to join me in the quest. Are you
up for it?&lt;br&gt;&lt;/p&gt;&lt;p&gt;Here are the rules of the game:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;You are
to create a single T-SQL statement that concatenates values from the
AdventureWorks (note: not AdventureWorks2008) Sales.SalesOrderHeader,
Sales.SalesOrderDetail, Production.Product, and Person.Contact tables. &lt;br&gt;
&lt;/li&gt;&lt;li&gt;The output should have the following columns, in the following order, and no other columns:&lt;br&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;CustomerID: The customer's CustomerID (this is the unique key in the output)&lt;br&gt;&lt;/li&gt;&lt;li&gt;FirstName: The customer's first name&lt;/li&gt;&lt;li&gt;LastName: The customer's last name&lt;br&gt;&lt;/li&gt;&lt;li&gt;OrderCount: Number of orders placed by the customer&lt;/li&gt;&lt;li&gt;TotalDollarAmount: Total dollar amount of all orders placed by the customer (based on the SalesOrderHeader.SubTotal column)&lt;/li&gt;&lt;li&gt;TotalProductQuantity: Total number of items purchased by the customer in all orders (based on SalesOrderDetail.OrderQty)&lt;br&gt;&lt;/li&gt;&lt;li&gt;OrderNumbers:
Comma-delimited list containing the order numbers
(SalesOrderHeader.SalesOrderNumber) for each of the orders placed by
the customer&lt;/li&gt;&lt;ul&gt;&lt;li&gt;The numbers within the list should be alphabetized. The list
should have neither leading nor trailing commas, and each element in
the list should be separated by a single comma with no spaces or other
white space beforeor after the comma &lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;ProductNames: Comma-delimited list containing the unique names of all products ordered by the customer in all orders&lt;/li&gt;&lt;ul&gt;&lt;li&gt;The names within the list should be alphabetized. The list should have neither leading nor trailing commas, and each
element in the list should be separated by a single comma with no
spaces or other white space beforeor after the comma&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;li&gt;No tables--permanent, temporary, or variable--are to be
created. No dynamic SQL is to be used. No user-defined functions,
views, or stored procedures are allowed. No variables may be declared.
To put it simply, no permanent or temporary objects of any kind, at any
scope, are to be explicitly created. No procedural statements of any
kind--cursors or control-of-flow--are allowed. This must be a
standalone statement in the AdventureWorks database; nothing more and
nothing less.&lt;br&gt;&lt;/li&gt;&lt;li&gt;Aside from the previous stipulation, any SQL
Server 2005 or 2008 feature is fair game. Documented or not, if it
ships with the product and can be used in a standalone T-SQL statement,
you can use it. If you do use a version-specific feature, please let me
know (especially if it's a SQL Server 2005 feature that's gone in
2008). Bonus points may be given for solutions that work on either
version, but I'll make that decision after reviewing the submissions.&lt;/li&gt;&lt;li&gt;Entries
will be judged first and foremost on correctness, then on a combination
of performance, readability, and ability to apply your technique as a
general pattern. &lt;br&gt;
&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Just to be absolutely clear: If your submission violates the
rules, outputs the wrong data, or does not precisely follow the output
guidelines listed above, it will be ignored. Last time I spent a lot of
energy going back and forth with people helping them get there, and I
just don't have the bandwidth to do that again. So double-check your
submission before you send it to me.&lt;/li&gt;&lt;li&gt;Make your submission readable or you will lose credit even if
performance is amazing. I don't appreciate looking at a mess, and it's
good for your career to learn how to write code that others can
maintain. Hint: Learn to indent your code properly; lack of indentation
is the biggest mistake I see people make with regard to readability.&lt;/li&gt;&lt;li&gt;Take your time. You have two weeks to work on this. I've
already come up with three different solutions that have vastly
different performance characteristics. Perhaps your first shot isn't
the best choice?&lt;br&gt;
    &lt;/li&gt;&lt;/ul&gt;&lt;li&gt;The entry deadline is March 16, 2009, midnight GMT. No exceptions.&lt;/li&gt;&lt;li&gt;Submissions should be e-mailed to me, using a .SQL file attachment. &lt;br&gt;
&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Do not paste your solution into the body of your e-mail&lt;/li&gt;&lt;li&gt;The subject of your e-mail should be "Grouped String Challenge Submission". &lt;br&gt;
    &lt;/li&gt;&lt;li&gt;E-mail your submissions to [my first name] [at] [this site]. &lt;br&gt;
    &lt;/li&gt;&lt;li&gt;Again, be careful and don't violate these guidelines or your submission will be ignored.&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;... What's that? You want a prize? Fine, fine ...&lt;/li&gt;&lt;ul&gt;&lt;li&gt;The prize, for the best submission, is a full MSDN subscription, valued at around $10,000. How's that for inspiration?&lt;br&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;p&gt;... and that's that! One final note: Please &lt;b&gt;do not post your solution&lt;/b&gt;
in the comments here or on another blog, before the deadline has been
reached! Last time several people did that and it was incredibly
annoying both for me and those contestants trying to think through the
problem. You won't be doing yourself any favors by trying to mess up
the competition.&lt;/p&gt;&lt;p&gt;Once the deadline is reached I will test all of
the submissions, tabulate the results, and post back here in early
April. I promise, I won't let the thing stagnate for months like I did
last time. &lt;/p&gt;&lt;p&gt;Have fun with it, be creative, and feel free to post comments here with any questions you might have. I found this to be a fairly difficult but very interesting exercise and I hope you agree. Enjoy, and I'm looking forward to seeing what you can do!&lt;br&gt;
&lt;/p&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=12307" width="1" height="1"&gt;</description><enclosure url="http://www2.sqlblog.com/blogs/adam_machanic/attachment/12307.ashx" length="1015924" type="application/zip" /><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/challenge/default.aspx">challenge</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/concatenation/default.aspx">concatenation</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/contest/default.aspx">contest</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/grouping/default.aspx">grouping</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/puzzle/default.aspx">puzzle</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Who's On First? Solving the Top per Group Problem (Part 1: Technique)</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2008/02/08/who-s-on-first-solving-the-top-per-group-problem-part-1-technique.aspx</link><pubDate>Fri, 08 Feb 2008 23:09:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:4992</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>18</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/4992.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=4992</wfw:commentRss><description>&lt;p&gt;Relative comparison is a simple matter of human nature. From early childhood we compare and contrast what we see in the world around us, building a means by which to rate what we experience. And as it turns out, this desire to discover top and bottom, rightmost and leftmost, or best and worst happens to extend quite naturally into business scenarios. Which product is the top seller? How about the one that's simply not moving off the shelves? Which of our customers has placed the most expensive order? What are the most recent orders placed at each of our outlets?&lt;/p&gt;&lt;p&gt;In the world of common business questions, the edge cases are generally of most interest. What's in the middle is unimportant; it's often too difficult for the mind to compare and comprehend when there are hundreds, thousands, or even millions of items, transactions, or facts that are all within a similar range. Instead, we focus on those that stick out in some extraordinary way.&lt;/p&gt;&lt;p&gt;Those of us who work with SQL products on a regular basis are faced with solving this same problem time and again as we work through various business requirements. Over time, I have noticed four basic query patterns that can be used to solve the problem; each are logically equivalent (within certain restrictions -- more on that later), but can have surprisingly different performance characteristics depending on the data being queried. In this first post, I will outline the available patterns/methods. In the following posts, I will show the results of testing each pattern against a variety of scenarios in an attempt to discover where and when each should be used.&lt;/p&gt;&lt;p&gt;The four basic patterns are outlined below. Each of the methods is illustrated using a query to show all customers' names, plus their most recent order date, and the amount of that order. I've included notes that indicate where logic differences can arise among the various methods.&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Method 1: Join to full group and use correlated subquery with a MIN/MAX aggregate to filter&lt;/b&gt;&lt;/p&gt;&lt;p&gt;In this method we use an inner join to get all required columns, then filter the resultant set using a correlated subquery in the WHERE clause. &lt;br&gt;&lt;/p&gt;&lt;blockquote&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.FirstName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.LastName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderDate,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderAmount&lt;br&gt;FROM Customers c&lt;br&gt;JOIN Orders o ON o.CustomerId = c.CustomerId&lt;br&gt;WHERE &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderDate&amp;nbsp; =&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT MAX(o1.OrderDate)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Orders o1&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o1.CustomerId = o.CustomerId&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; )&lt;br&gt;&lt;/blockquote&gt;&lt;p&gt;Logic notes: With this method ties are automatically included in the output, unless a tiebreaker is specified (which can be tricky given that you only have one column to work with). This method does not allow you to pull back an arbitrary number of rows, such as top 10 per customer; you are limited to the edge and any ties that might exist. &lt;b&gt;&lt;/b&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Method 1a: Join to full group and use correlated subquery with TOP(n) and ORDER BY to filter&lt;/b&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;This method is almost identical to Method 1 (which is why it is classified here as 1a), but the TOP and ORDER BY allow for a bit more flexibility than the aggregates.&lt;br&gt;&lt;/p&gt;&lt;blockquote&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.FirstName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.LastName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderDate,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderAmount&lt;br&gt;FROM Customers c&lt;br&gt;JOIN Orders o ON o.CustomerId = c.CustomerId&lt;br&gt;WHERE &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderDate&amp;nbsp; =&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT TOP(1)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o1.OrderDate&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Orders o1&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o1.CustomerId = o.CustomerId&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o1.OrderDate DESC&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; )&lt;/blockquote&gt;&lt;p&gt;Logic notes: With this method you can more easily integrate a tiebreaker than with Method 1; the comparison column can be anything, including a primary key, and you can still order on whatever column makes most sense. In addition, you can take more rows than with Method 1 by using IN instead of = in the WHERE clause, and increasing the argument value to TOP.&lt;br&gt;&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Method 2: CROSS APPLY to ordered TOP(n)&lt;/b&gt;&lt;/p&gt;&lt;p&gt;In this method, SQL Server 2005's CROSS APPLY operator is used. This operator allows us to essentially create a table-valued correlated subquery -- something that impossible in previous versions of SQL Server. By using TOP in conjunction with ORDER BY we can get as many rows per group as needed.&lt;/p&gt;&lt;blockquote&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.FirstName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.LastName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x.OrderDate,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x.OrderAmount&lt;br&gt;FROM Customers c&lt;br&gt;CROSS APPLY&lt;br&gt;(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT TOP(1)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderDate,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderAmount&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Orders o&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.CustomerId = c.CustomerId&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderDate DESC&lt;br&gt;) x&lt;br&gt;&lt;/blockquote&gt;&lt;p&gt;Logic notes: This method is almost identical, from a logic point of view, with Method 1a modified to use IN on a primary key column. With both methods WITH TIES can be added to the TOP in order to get ties.&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Method 3: Join to derived table that uses a partitioned, ordered windowing function, and filter in the outer query based on the row number&lt;/b&gt;&lt;/p&gt;&lt;p&gt;In this method a derived table or CTE is used, in conjunction with a windowing function partitioned based on the required grain of the final query. So for the "most recent order per customer" query, the row number is partitioned based on the customer. This gives us a count starting at 1 for each customer, which can be filtered in the outer query.&lt;br&gt;&lt;/p&gt;&lt;blockquote&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.FirstName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.LastName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x.OrderDate,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x.OrderAmount&lt;br&gt;FROM Customers c&lt;br&gt;INNER JOIN&lt;br&gt;(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderDate,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.OrderAmount,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.CustomerId,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PARTITION BY o.CustomerId&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY o.OrderDate DESC&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ) AS r&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Orders o&lt;br&gt;) x ON&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x.CustomerId = c.CustomerId&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND x.r = 1&lt;br&gt;&lt;/blockquote&gt;&lt;p&gt;Logic notes: If ties are important, use DENSE_RANK instead of ROW_NUMBER. ROW_NUMBER is good for arbitrary TOP(n), similar to Method 2. Unlike the previously described methods, in conjunction with DENSE_RANK this method can return an arbitrary TOP(n) rows, all of which can include ties. So if you would like to see the three most recent order dates and each happens to have multiple orders, this method will be able to return them all by simply filtering on x.r = 3. This would not be directly possible with any of the other methods described here.&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Method 4: "Carry-along sort"&lt;/b&gt;&lt;/p&gt;&lt;p&gt;This is the only "tricky" method, and not one that I recommend using, except as a last resort. I'm including it here only for completeness and comparison because it happens to be a very high performance method in some cases. This method involves converting each of the required inner columns into a string, concatenating them, then applying an aggregate to the string as a whole. By putting the "sort" column first, the other data is "carried along" -- thus the name for the method. The concatenated data is then "unpacked" in the outer query. This can be surprisingly efficient from an I/O standpoint, but the resultant code is a maintenance nightmare and it is quite easy to introduce errors. In addition, this method can only return the top 1 per group -- no ties or multiple return items are supported.&lt;br&gt;&lt;/p&gt;&lt;blockquote&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.FirstName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; c.LastName,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(DATETIME, SUBSTRING(x.OrderInfo, 1, 8)) AS OrderDate,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(MONEY, SUBSTRING(x.OrderInfo, 9, 15)) AS OrderAmount&lt;br&gt;FROM Customers c&lt;br&gt;INNER JOIN&lt;br&gt;(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.CustomerId,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; MAX&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(CHAR(8), OrderDate, 112) +&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(CHAR(15), SubTotal)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ) OrderInfo&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Orders o&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; GROUP BY&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; o.CustomerId&lt;br&gt;) x ON&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; x.CustomerId = c.CustomerId&lt;br&gt;&lt;/blockquote&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;This post is just the beginning; watch this space in the coming days for a series of performance tests and analysis of these methods, and some results that I personally found to be quite surprising.&lt;br&gt;&amp;nbsp;&lt;/p&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=4992" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/logic/default.aspx">logic</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Performance/default.aspx">Performance</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Query+Tuning/default.aspx">Query Tuning</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Medians, ROW_NUMBERs, and performance</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/12/18/medians-row-numbers-and-performance.aspx</link><pubDate>Mon, 18 Dec 2006 19:52:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:437</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>8</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/437.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=437</wfw:commentRss><description>A couple of days ago, Aaron Bertrand posted about &lt;a href="http://sqlblog.com/blogs/aaron_bertrand/archive/2006/12/15/428.aspx"&gt;a method for calculating medians in SQL Server 2005&lt;/a&gt; using the ROW_NUMBER function in conjunction with the COUNT aggregate. This method (credited to Itzik Ben-Gan) is interesting, but I discovered an even better way to attack the problem in &lt;a href="http://www.amazon.com/Celkos-Analytics-Kaufmann-Management-Systems/dp/0123695120/sr=8-1/qid=1166482464/ref=sr_1_1/105-6595410-7450029?ie=UTF8&amp;amp;s=books"&gt;Joe Celko's Analytics and OLAP in SQL&lt;/a&gt;.&lt;br&gt;&lt;br&gt;Rather than using a COUNT aggregate in conjunction with the ROW_NUMBER function, Celko's method uses ROW_NUMBER twice: Once with an ascending sort, and again with a descending sort. The output rows can then be matched based on the ascending row number being within +/- 1 of the descending row number.&amp;nbsp; This becomes clearer with a couple of small examples:&lt;br&gt;&lt;br&gt;

&lt;table class="MsoTableGrid" style="border:medium none;border-collapse:collapse;" cellpadding="0" cellspacing="0"&gt;
 &lt;tr&gt;
  &lt;td style="border:1pt solid windowtext;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;A&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:solid solid solid none;border-color:windowtext windowtext windowtext -moz-use-text-color;border-width:1pt 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;1&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:solid solid solid none;border-color:windowtext windowtext windowtext -moz-use-text-color;border-width:1pt 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;4&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
  &lt;td style="border-style:none solid solid;border-color:-moz-use-text-color windowtext windowtext;border-width:medium 1pt 1pt;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;B&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;2&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;3&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
  &lt;td style="border-style:none solid solid;border-color:-moz-use-text-color windowtext windowtext;border-width:medium 1pt 1pt;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;C&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;3&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;2&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
  &lt;td style="border-style:none solid solid;border-color:-moz-use-text-color windowtext windowtext;border-width:medium 1pt 1pt;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;D&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;4&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;1&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
&lt;/table&gt;

&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/p&gt;

&lt;table class="MsoTableGrid" style="border:medium none;border-collapse:collapse;" cellpadding="0" cellspacing="0"&gt;
 &lt;tr&gt;
  &lt;td style="border:1pt solid windowtext;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;A&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:solid solid solid none;border-color:windowtext windowtext windowtext -moz-use-text-color;border-width:1pt 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;1&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:solid solid solid none;border-color:windowtext windowtext windowtext -moz-use-text-color;border-width:1pt 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;5&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
  &lt;td style="border-style:none solid solid;border-color:-moz-use-text-color windowtext windowtext;border-width:medium 1pt 1pt;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;B&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;2&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;4&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
  &lt;td style="border-style:none solid solid;border-color:-moz-use-text-color windowtext windowtext;border-width:medium 1pt 1pt;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;C&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;3&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;3&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
  &lt;td style="border-style:none solid solid;border-color:-moz-use-text-color windowtext windowtext;border-width:medium 1pt 1pt;padding:0in 5.4pt;width:23.4pt;"&gt;
  &lt;p class="MsoNormal"&gt;D&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;4&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;"&gt;
  &lt;p class="MsoNormal"&gt;2&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr style="height:14.35pt;"&gt;
  &lt;td style="border-style:none solid solid;border-color:-moz-use-text-color windowtext windowtext;border-width:medium 1pt 1pt;padding:0in 5.4pt;width:23.4pt;height:14.35pt;"&gt;
  &lt;p class="MsoNormal"&gt;E&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;height:14.35pt;"&gt;
  &lt;p class="MsoNormal"&gt;5&lt;/p&gt;
  &lt;/td&gt;
  &lt;td style="border-style:none solid solid none;border-color:-moz-use-text-color windowtext windowtext -moz-use-text-color;border-width:medium 1pt 1pt medium;padding:0in 5.4pt;width:27pt;height:14.35pt;"&gt;
  &lt;p class="MsoNormal"&gt;1&lt;/p&gt;
  &lt;/td&gt;
 &lt;/tr&gt;
&lt;/table&gt;

&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;br&gt;&lt;/o:p&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;In the first table (even number of rows), the median rows are B and C. These can be matched based on [Ascending Column] IN ([Descending Column] + 1, [Descending Column] - 1). In the second table (odd number of rows), the median row is C, which is matched where [Ascending Column] = [Descending Column]. Note that in the second table, the match criteria &lt;/o:p&gt;for the first table does not apply -- so the generic expression to match either case is the combination of the two:&amp;nbsp; [Ascending Column] IN ([Descending Column], [Descending Column] + 1, [Descending Column] - 1).&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;We can apply this logic within the AdventureWorks database to find the median of the "TotalDue" amount in the Sales.SalesOrderHeader table, for each customer:&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;div class="code"&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp; CustomerId,&lt;br&gt;&amp;nbsp;&amp;nbsp; AVG(TotalDue)&lt;br&gt;FROM&lt;br&gt;(&lt;br&gt;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CustomerId,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; TotalDue,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PARTITION BY CustomerId &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY TotalDue ASC, SalesOrderId ASC) AS RowAsc,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PARTITION BY CustomerId &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY TotalDue DESC, SalesOrderId DESC) AS RowDesc&lt;br&gt;&amp;nbsp;&amp;nbsp; FROM Sales.SalesOrderHeader SOH&lt;br&gt;) x&lt;br&gt;WHERE &lt;br&gt;&amp;nbsp;&amp;nbsp; RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1)&lt;br&gt;GROUP BY CustomerId&lt;br&gt;ORDER BY CustomerId;&lt;/div&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;The equivalent logic using Itzik Ben-Gan's method follows:&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br&gt;&lt;/p&gt;&lt;div class="code"&gt;&lt;br&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp; CustomerId,&lt;br&gt;&amp;nbsp;&amp;nbsp; AVG(TotalDue)&lt;br&gt;FROM&lt;br&gt;(&lt;br&gt;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CustomerId,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; TotalDue,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PARTITION BY CustomerId&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY TotalDue) AS RowNum,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; COUNT(*) OVER (&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PARTITION BY CustomerId) AS RowCnt&lt;br&gt;&amp;nbsp;&amp;nbsp; FROM Sales.SalesOrderHeader&lt;br&gt;) x&lt;br&gt;WHERE&lt;br&gt;&amp;nbsp;&amp;nbsp; RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)&lt;br&gt;GROUP BY CustomerId&lt;br&gt;ORDER BY CustomerId;&lt;/div&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;Taking a look at the estimated execution plans for these two queries, we might believe that Ben-Gan's method is superior: Celko's algorithm requires an expensive intermediate sort operation and has an estimated cost of 4.96, compared to 3.96 for Ben-Gan's. &lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;Remember that these are merely estimates. And as it turns out, this is one of those times that the Query Optimizer's cost estimates are are totally out of line with the reality of what
happens when you actually run the queries. Although the performance
difference is not especially noticeable on a set of data as small as
that in Sales.SalesOrderHeader, check out the STATISTICS IO output. Celko's version does 703 logical reads; Ben-Gan's does an astonishing 140110!&lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;There is a good lesson to be learned from this: &lt;i&gt;Cost-based optimization is far from perfect!&lt;/i&gt; Never completely trust what estimates tell you; they've come a long way, but clearly there is still some work to do in this area. The only way to actually determine that one query is better than another is to run it against a realistic set of data and look at how much IO and CPU time is actually used.&lt;br&gt;&lt;/p&gt;&lt;br&gt;In this case, Ben-Gan's query probably should perform better than Celko's. It seems odd that the Query Processor can't collect the row counts at the same time it processes the row numbers. Regardless, as of today this is the best way to solve this problem... Not that I've ever needed a median in any production application I've worked on. But I suppose that's beside the point!&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=437" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Performance/default.aspx">Performance</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Query+Tuning/default.aspx">Query Tuning</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Scalar functions, inlining, and performance: An entertaining title for a boring post</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/08/04/scalar-functions-inlining-and-performance-an-entertaining-title-for-a-boring-post.aspx</link><pubDate>Fri, 04 Aug 2006 03:07:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:146</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>30</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/146.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=146</wfw:commentRss><description>Scalar.&amp;nbsp; Function.&lt;br&gt;&lt;br&gt;Wow.&lt;br&gt;&lt;br&gt;Could any other combination of words evoke the same feeling of encapsulation, information hiding, and simplification of client code?&amp;nbsp; After years spent developing software in the procedural and OO worlds, it can be difficult--perhaps, even impossible--to migrate over to working with SQL Server and not consider how to architect your data access logic using some of the same techniques you'd use in the application tier.&lt;br&gt;&lt;br&gt;In short: Why would you &lt;i&gt;ever&lt;/i&gt; write the same piece of logic more than once?&amp;nbsp; Answer: &lt;i&gt;You wouldn't (damn it!)&lt;/i&gt;.&amp;nbsp; And so Microsoft bestowed upon the SQL Server community, in SQL Server 2000, the ability to write scalar user-defined functions.&amp;nbsp; And they could have been such beautiful things...&lt;br&gt;&lt;br&gt;But alas, reality can be painful, and as developers tried these new tools they were struck with a strange feeling of sadness as their applications buckled under the weight of what otherwise would have been a wonderful idea. As it turned out, putting all but the simplest of logic into these scalar functions was a recipe for disaster. Why?&amp;nbsp; Because they're essentially cursors waiting to happen (but they don't &lt;i&gt;look &lt;/i&gt;like cursors, so you may not know... until it's too late.)&lt;br&gt;&lt;br&gt;The central problem is that when you wrap logic in a multistatement UDF, the query optimizer just can't unwrap it too easily. And so there's really only one way to evaluate a scalar UDF: call it once per row. And that is really nothing more than a cursor.&lt;br&gt;&lt;br&gt;Seeing this behavior in action is easy enough; consider the following scalar function that some poor sap DBA working for AdventureWorks might be compelled to create:&lt;br&gt;&lt;br&gt;
&lt;pre class="code"&gt;CREATE FUNCTION GetMaxProductQty_Scalar&lt;br&gt;(&lt;br&gt;    @ProductId INT&lt;br&gt;)&lt;br&gt;RETURNS INT&lt;br&gt;AS&lt;br&gt;BEGIN&lt;br&gt;    DECLARE @maxQty INT&lt;br&gt;&lt;br&gt;    SELECT @maxQty = MAX(sod.OrderQty)&lt;br&gt;    FROM Sales.SalesOrderDetail sod&lt;br&gt;    WHERE sod.ProductId = @ProductId&lt;br&gt;&lt;br&gt;    RETURN (@maxQty)&lt;br&gt;END&lt;br&gt;&lt;/pre&gt;
&lt;br&gt;Simple enough, right?&amp;nbsp; Let's pretend that AdventureWorks has a bunch of reports, each of which requires maximum quantity sold per product.&amp;nbsp; So the DBA, thinking he can save himself some time and keep everything centralized (and that is a good idea), puts all of the logic into a scalar UDF.&amp;nbsp; Now, when he needs this logic, he can just call the UDF.&amp;nbsp; And if the logic has a bug, or needs to be changed, he can change it in exactly &lt;i&gt;one&lt;/i&gt; place.&amp;nbsp; And so life is great... Or is it?&lt;br&gt;&lt;br&gt;Let's take a look at a sample query:&lt;br&gt;&lt;br&gt;
&lt;pre class="code"&gt;SELECT&lt;br&gt;    ProductId,&lt;br&gt;    dbo.GetMaxProductQty_Scalar(ProductId)&lt;br&gt;FROM Production.Product&lt;br&gt;ORDER BY ProductId&lt;br&gt;&lt;/pre&gt;
&lt;br&gt;This query does nothing more than get the max quantity sold for each product in the Productin.Product table. And a look at the execution plan or the STATISTICS IO output might indicate that there's nothing too interesting going on here: The execution plan shows an index scan (to be expected, with no WHERE clause), followed by a compute scalar operation (the call to the UDF). And STATISTICS IO shows a mere 16 reads.&lt;br&gt;&lt;br&gt;So why is this query so problematic? Because the real issue is hiding just beneath the surface.&amp;nbsp;&lt;i&gt; The execution plan and STATISTICS IO didn't consider any of the code evaluated within the UDF!&lt;/i&gt; To see what's &lt;i&gt;really&lt;/i&gt; going on, fire up SQL Server Profiler, turn on the SQL:BatchCompleted event, and make sure you're showing the Reads column. Now run the query again and you'll see that this seemingly-innocent block of T-SQL is, in fact, using 365,247 logical reads. Quite a difference!&lt;br&gt;&lt;br&gt;Each of those "compute scalar" operations is really a call to the UDF, and each of the calls to the UDF is really a new query.&amp;nbsp; And all of those queries (all 504 of them -- the number of products in the Product table) add up to massive I/O.&amp;nbsp; Clearly not a good idea in a production environment.&lt;br&gt;&lt;br&gt;But luckily, we're not done here yet (or this would be a very boring post). Because while the performance penalty is a major turnoff, I really do love the encapsulation afforded by scalar UDFs.&amp;nbsp; I want them (or a similar tool) in my toolbox... And so I got to thinking.&lt;br&gt;&lt;br&gt;The answer to my dilemma, as it turns out, is to not use scalar UDFs at all, but rather to use &lt;i&gt;inline table-valued&lt;/i&gt; UDFs and treat them like scalars. This means that queries get slightly more complex than with scalar UDFs, but because the funtions are inlined (treated like macros) they're optimized along with the rest of the query. Which means, no more under-the-cover cursors.&lt;br&gt;&lt;br&gt;Following is a modified version of the scalar UDF posted above:&lt;br&gt;&lt;br&gt;
&lt;pre class="code"&gt;CREATE FUNCTION GetMaxProductQty_Inline&lt;br&gt;(&lt;br&gt;    @ProductId INT&lt;br&gt;)&lt;br&gt;RETURNS TABLE&lt;br&gt;AS&lt;br&gt;    RETURN&lt;br&gt;    (&lt;br&gt;        SELECT MAX(sod.OrderQty) AS maxqty&lt;br&gt;        FROM Sales.SalesOrderDetail sod&lt;br&gt;        WHERE sod.ProductId = @ProductId&lt;br&gt;    )&lt;br&gt;&lt;/pre&gt;&lt;br&gt;This function is no longer actually scalar--in fact, it now returns a table. It just so happens that the table has exactly one column and exactly one row, and uses the same logic as the scalar UDF shown above. So it's still scalar enough for my purposes.&lt;br&gt;&lt;br&gt;The query shown above, used to retrieve the maximum quantity sold for each product, will not quite work with this UDF as-is. Trying to substitute in the new UDF will result in nothing more than a variant on an "object not found" error.&amp;nbsp; Instead, you need actually treat this function like&amp;nbsp; it returns a table (due to the fact that it does).&amp;nbsp; And that means, in this case, a &lt;i&gt;subquery&lt;/i&gt;:&lt;br&gt;&lt;br&gt;
&lt;pre class="code"&gt;SELECT&lt;br&gt;    ProductId,&lt;br&gt;    (&lt;br&gt;        SELECT MaxQty&lt;br&gt;        FROM dbo.GetMaxProductQty_Inline(ProductId)&lt;br&gt;    ) MaxQty&lt;br&gt;FROM Production.Product&lt;br&gt;ORDER BY ProductId&lt;br&gt;&lt;/pre&gt;&lt;br&gt;So there it is. We're now treating the table-valued UDF more or less just like a scalar UDF.&amp;nbsp; And the difference in I/O results is really quite astounding: 1267 logical reads in this case. Meaning that the scalar UDF solution is around 288 times more I/O intensive!&lt;br&gt;&lt;br&gt;The question being, of course, was it worth it? The whole thing could have been written as one query, without the need for any UDFs at all. And the final query in this case is quite a bit more complex than the previous version, in addition to the fact that the encapsulation breaks down to some degree by forcing the caller to have some knowledge of how the UDF actually works. But I do feel that this sacrifice is warranted in some cases. Although the "greatest quantity sold" example shown here is simplistic, imagine other situations in which the same code fragments or logic are used over and over, due to lack of a good way of standardizing and centralizing them.&amp;nbsp; I know I've seen that a lot in my work, and some examples I can think of have included complex logic that might very well have been easier to maintain in a UDF.&lt;br&gt;&lt;br&gt;This technique may not be perfect for every case, and it certainly has its tradeoffs. But it may be a useful trick to keep in the back of your mind for a rainy day in the data center when someone's scalar UDF solution starts breaking down and you need a fix that doesn't require a massive code rewrite.&lt;br&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=146" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Performance/default.aspx">Performance</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Query+Tuning/default.aspx">Query Tuning</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Stored procedures are not parameterized views</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/stored-procedures-are-not-parameterized-views.aspx</link><pubDate>Thu, 13 Jul 2006 01:54:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:109</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>6</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/109.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=109</wfw:commentRss><description>Peter van Ooijen over at CodeBetter.com &lt;a href="http://codebetter.com/blogs/peter.van.ooijen/archive/2006/05/29/145697.aspx"&gt;posted in his blog about some observations he had when working with stored procedures in a recent project&lt;/a&gt;.
What I found to be interesting about his post was his comment that a
stored procedure can be, "a view with parameters."&amp;nbsp; I've run into this
assertion before, and it's something I think needs some clarification
for a lot of developers. I do not feel that there is any real
similarity between stored procedures and views -- they are entirely
different types of objects in an SQL database, and should not be
considered forms of one another in any way. &lt;br&gt;&lt;br&gt;Following is an edited version of the response I left in Peter's blog; I thought it warranted its own post:&lt;br&gt;&lt;br&gt;Stored procedures are not -- and never can be -- "parameterized views". &amp;nbsp;A view 
in an SQL database can be treated the same as a table in virtually every 
context.&amp;nbsp; Consider:&lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;font-family:Courier New;"&gt;SELECT * &lt;br&gt;FROM Tbl &lt;br&gt;&lt;/div&gt;&lt;br&gt;vs. &lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;font-family:Courier New;"&gt;SELECT * &lt;br&gt;FROM 
View &lt;br&gt;&lt;/div&gt;&lt;br&gt;One
of the great things about working with views and tables is that the
person querying the database does not need to know whether the base
object being queried is a view or a table. For the sake of writing SQL
queries, they are one and the same. Both a view and a table have
well-defined columns, with well-defined datatypes.&lt;br&gt;&lt;br&gt;These
assertions cannot be made for a stored procedure as compared with a
view.&amp;nbsp; A stored procedure is related to a view only in as much as both
are defined using SQL syntax. But beyond there, the two diverge into
completely different types of entities. First of all, consider:&lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;"&gt;&lt;span style="font-family:Courier New;"&gt;SELECT * 
&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;FROM StoredProcedure &lt;/span&gt;&lt;br&gt;&lt;/div&gt;&lt;br&gt;This
will not work, and will only result in an "invalid object name"
exception. The reason? Stored procedures expose no explicit output
contract.&amp;nbsp; Thanks to conditional branching, dynamic SQL, and SELECT *,
a stored procedure can output vastly different results beween
invocations, or based on different input parameters. It is quite
possible to code a stored procedure that will output no result sets for
one set of input parameters, two result sets for another, and four for
another.&amp;nbsp; Or, it's possible to change the returned result sets, e.g. by
outputting different column names or datatypes. Please note, this is an
&lt;span style="font-style:italic;"&gt;extremely&lt;/span&gt; poor (and very
dangerous) coding habit to get into -- but the point is, it is
impossible to verify the output of a stored procedure for a given set
of input parameters without running it.&lt;br&gt;&lt;br&gt;Furthermore, a stored
procedure "late binds" to the base objects being queried.&amp;nbsp; This adds to
the difficulty in verifying the output of a stored procedure, and is
why you can create the following stored procedure without getting an
exception (until you try to run it, of course):&lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;font-family:Courier New;"&gt;CREATE PROC XYZ &lt;br&gt;
AS &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT 
* &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM ThisTableDoesNotExist &lt;br&gt;
GO &lt;br&gt;&lt;/div&gt;
&lt;br&gt;These stored procedure behaviors are in stark 
contrast to the way views work.&amp;nbsp; Views provide a couple of means of verification: &lt;br&gt;&lt;ul&gt;&lt;li&gt;The output columns/data types can be verified, and bound to, before actually 
querying the view &lt;/li&gt;&lt;li&gt;A view can be "schema bound", meaning that the 
underlying base tables (or other views) which the view is based on cannot be 
changed, schema-wise, unless the view is dropped. &amp;nbsp; &lt;/li&gt;&lt;/ul&gt;For
the first point, simply query the INFORMATION_SCHEMA.COLUMNS or
sys.Columns views, and column information can be determined for a view
without having to query it.&lt;br&gt;&lt;br&gt;The second point adds to the first
in a few ways. Schema binding brings to views a certain sense of "early
binding," which as I mentioned is missing in stored procedures.
Although no view can be created if one of its base objects does not
exist, schema binding takes it one step further and guarantees that the
base objects used to create the view &lt;span style="font-style:italic;"&gt;must exist, and must not be changed&lt;/span&gt;,
for as long as the view is present in the database. This means that if
a schemabound view is created that outputs a certain set of columns
with certain datatypes, it is guaranteed to do so for as long as it
exists in the database -- in other words, its contract is bound to the
schema, and changes to other objects cannot affect it.&amp;nbsp; This is a
powerful guarantee, which stored procedures fail to make.&lt;br&gt;&lt;br&gt;So now
the question is, if a stored procedures isn't a parameterized view then
what is? The answer, as of SQL Server 2000 (and continuing in 2005), is
the table-valued UDF. &amp;nbsp;A table-valued UDF is parameterized, has an
explicit and verifyable output contract*, and can be schema bound. &amp;nbsp;If
you are looking to implement a solution that makes use of a form of
parameterized views, stored procedures are probably not the right
choice.&amp;nbsp; I think that table-valued UDFs are quite underused and deserve
a second (or first!) look from many T-SQL developers who may have
glossed over them in the past.&lt;br&gt;&lt;br&gt;*
Note: Unlike for a view, the column list for a table-valued UDF cannot
be queried from the INFORMATION_SCHEMA.COLUMNS table. The column list
is, however, available from sys.Columns.&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=109" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Architecture/default.aspx">Architecture</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>T-SQL Variables: Multiple Value Assignment</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/t-sql-variables-multiple-value-assignment.aspx</link><pubDate>Thu, 13 Jul 2006 01:53:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:108</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>6</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/108.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=108</wfw:commentRss><description>Tony Rogerson &lt;a href="http://sqlserverfaq.com/blogs/blogs/tonyrogerson/archive/2006/05/18/449.aspx"&gt;brings us an interesting blog post about T-SQL variable assignment and SET vs. SELECT&lt;/a&gt;.&amp;nbsp;
The issue?&amp;nbsp; With SELECT you can assign values to multiple variables
simultaneously.&amp;nbsp; But with SET, you can set up your assignment such that
you get an exception if more than one row is assigned to the variable.&amp;nbsp;
Both are desirable qualities... But unfortunately, as Tony shows us,
it's difficult to achieve both multiple assignment and getting the
exception thrown, at the same time.&amp;nbsp; Tony shows us a solution involving
checking for the number of rows affected after the assignment. Creative
and effective, but it still has an issue: Unlike with SET when it
throws an exception, with Tony's solution the variables will still have
been affected by the assignment.&lt;br&gt;&lt;br&gt;As I was reading Tony's post, I
couldn't help but think that there must be another way.&amp;nbsp; And low and
behold, there is -- at least, in SQL Server 2005.&amp;nbsp; Thanks to the power
of windowed aggregates we can have our multiple pieces of cake and eat
them, all at the same time. Wonderful stuff.&lt;br&gt;&lt;br&gt;So, here's what you
do: Set up a CTE that selects the columns you'd like to assign to your
variables, and also get COUNT(*), partitioned by 1 (or some other
arbitrary literal). By partitioning by a literal, we will end up with
the row count for the entire set. In the outer query, express the
assignments from the columns returned by the CTE, but add an additional
WHERE clause that compares the value of the COUNT(*) column with a
subquery against a table of numbers. In the following example which
I've adapted from Tony's blog, I'm using master..spt_values for the
numbers, but you are encouraged to use a properly-indexed table of
numbers, should you decide to use this technique:&lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;"&gt;&lt;span style="font-family:Courier New;"&gt;DECLARE &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; @reserved INT,&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; @rowcnt INT,&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; @used INT&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;SET @reserved = -1&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;SET @rowcnt = -1&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;SET @used = -1&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;;WITH x AS&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;(&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; reserved,&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; rowcnt,&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; used, &lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; COUNT(*) OVER(PARTITION BY 1) AS theCount&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM sysobjects so&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; INNER JOIN sysindexes si ON si.id = so.id&lt;br&gt;&lt;/span&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; so.name = 'sysrowsets'&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;)&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;SELECT&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; @reserved = reserved,&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; @rowcnt = rowcnt,&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; @used = used&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;FROM x&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;WHERE theCount = &lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; number&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM master..spt_values&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; TYPE = 'p'&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; AND number BETWEEN 1 AND theCount&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; )&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;SELECT @reserved, @rowcnt, @used&lt;/span&gt;&lt;br&gt;&lt;/div&gt;&lt;p&gt;As you'll see if you run this on your end, an exception is thrown and the values of the variables are not affected.&amp;nbsp; This works because the subquery used in the WHERE clause will return more than one value if theCount is greater than 1, thereby violating the rule that subqueries must only return one value.&lt;br&gt;&lt;br&gt;The price you'll pay for this convenience?&amp;nbsp; Extremely complex code for a simple variable assignment, in addition to a slight performance penalty.&amp;nbsp; Is it worth it?&amp;nbsp; Probably not, at least for me.&amp;nbsp; To be honest, I seriously doubt I will ever use this -- I've never been especially concerned with the chance of multiple rows screwing up my variable assignment, and those times that it has happened, I've remedied the situation other ways (e.g., defining a better primary key). That said, I think this was an interesting T-SQL challenge, and if anyone comes up with a more elegant solution than Tony's or mine, I'd love to see it!&lt;br&gt;&amp;nbsp;&lt;/p&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=108" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Running sums, redux</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/running-sums-redux.aspx</link><pubDate>Thu, 13 Jul 2006 01:51:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:106</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>9</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/106.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=106</wfw:commentRss><description>Siddhartha Gautama, the Buddha, taught us to understand that the key to
enlightenment is following the Middle Path.&amp;nbsp; And today I learned a
valuable lesson in extremes.&amp;nbsp; You can file this one in the "Doh!&amp;nbsp; Wrong again!" category...&lt;br&gt;
&lt;br&gt;
A fairly common question on SQL Server forums is, "how can I get the
running sum of the data in this column?"&amp;nbsp; Being the fan of set-based
queries that I am, I always answer the exact same way.&amp;nbsp; I show the
person asking the question how to do a self-join on the grouped column,
getting all of the "previous" values to create a running sum.&amp;nbsp; The
following example shows how you might do this against the
AdventureWorks Production.TransactionHistory table:&lt;br&gt;
&lt;br&gt;
&lt;blockquote&gt;&lt;font face="Courier New"&gt;SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; TH1.TransactionID,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; TH1.ActualCost,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SUM(TH2.ActualCost) AS RunningTotal&lt;br&gt;
FROM Production.TransactionHistory TH1&lt;br&gt;
JOIN Production.TransactionHistory TH2 ON TH2.TransactionID &amp;lt;= TH1.TransactionID&lt;br&gt;
GROUP BY TH1.TransactionID, TH1.ActualCost&lt;br&gt;
ORDER BY TH1.TransactionID&lt;br&gt;
  &lt;/font&gt;&lt;br&gt;
&lt;/blockquote&gt;
Pretty simple query.&amp;nbsp; For each row of the "TH1" alias, every row with a
lesser-or-equal TransactionID will be summed.&amp;nbsp; Thereby creating a
running total for every row of the table.&amp;nbsp; Note, I've used the IDENTITY
column instead of the date column.&amp;nbsp; I'd generally suggest not doing so
because, e.g., you might need to insert some post-dated rows at some
point and relying on the IDENTITY for a time sequence will thereby not
work.&amp;nbsp; But in this case it's a lazy solution because the
TransactionDate column isn't indexed, and it's also not unique.&amp;nbsp; I want
to test a lot of rows (TransactionHistory has around 113,000), but I
don't want to skew the test by forcing a table scan on every iteration!&lt;br&gt;
&lt;br&gt;
But I digress.&amp;nbsp; The point is, I've given this answer more than a few
times and, well, I'd like to apologize.&amp;nbsp; Just now I went ahead and ran
this query on my powerful test server--err, my laptop.&amp;nbsp; &lt;br&gt;
&lt;br&gt;
As you might guess, since I'm performance-minded I also happen to be
extremely impatient--so I went ahead and killed the query at the
five-minute mark.&amp;nbsp; SSMS's result grid showed the first 26,568 rows, so
obviously there was a long way to go to hit that 113,000 mark.&amp;nbsp; And
with an estimated cost of 38,086 for the query, I can't say I'm
surprised.&lt;br&gt;
&lt;br&gt;
A few moments of head scratching and the following re-write was issued:&lt;br&gt;
&lt;br&gt;
&lt;blockquote&gt;&lt;font face="Courier New"&gt;SELECT &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; TH1.TransactionID,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; TH1.ActualCost,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT SUM(TH2.ActualCost) &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Production.TransactionHistory TH2 &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE TH2.TransactionID &amp;lt;= TH1.TransactionID&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ) AS RunningTotal&lt;br&gt;
FROM Production.TransactionHistory TH1&lt;br&gt;
ORDER BY TH1.TransactionID&lt;br&gt;
  &lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;
With an estimated cost of only 6,630, I had high hopes for this one.&amp;nbsp;
Alas, once again I was forced to cancel the query at the five-minute
mark.&amp;nbsp; 27,683 rows.&amp;nbsp; Not much better, I'm afraid.&amp;nbsp; And, as an aside,
I'm starting to wonder about these estimated costs.&amp;nbsp; But that's another
post for another day.&lt;br&gt;
&lt;br&gt;
So where am I going with all of this?&amp;nbsp; Well, there's a reason I haven't
given any indication up until this point in the post.&amp;nbsp; You see, it's &lt;b&gt;utterly painful&lt;/b&gt; to write this, but...&lt;br&gt;
&lt;br&gt;
&lt;i&gt;In this case, a cursor is faster.&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
Yes, I said it.&amp;nbsp; That evil construct which we as database developers despise, the cursor.&amp;nbsp; Thanks to &lt;a href="http://www.sqlserverbible.com/"&gt;Paul Nielsen&lt;/a&gt;,
who revealed this ugly fact to me in a conversation today, I was forced
to test this for myself (hoping to prove him wrong, of course).&amp;nbsp; Which
is why I started playing around with the solution that I've given so
many times on forums.&amp;nbsp; Unfortunately, he is correct.&lt;br&gt;
&lt;br&gt;
My next test query, using the first cursor I've written in several years:&lt;br&gt;
&lt;br&gt;
&lt;blockquote&gt;&lt;font face="Courier New"&gt;DECLARE RunningTotalCursor&lt;br&gt;
CURSOR LOCAL FAST_FORWARD FOR&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT TransactionID, ActualCost&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Production.TransactionHistory&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY TransactionID&lt;br&gt;
  &lt;br&gt;
OPEN RunningTotalCursor&lt;br&gt;
  &lt;br&gt;
DECLARE @TransactionID INT&lt;br&gt;
DECLARE @ActualCost MONEY&lt;br&gt;
  &lt;br&gt;
DECLARE @RunningTotal MONEY&lt;br&gt;
SET @RunningTotal = 0&lt;br&gt;
  &lt;br&gt;
DECLARE @Results TABLE&lt;br&gt;
(&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; TransactionID INT NOT NULL PRIMARY KEY,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ActualCost MONEY,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; RunningTotal MONEY&lt;br&gt;
)&lt;br&gt;
  &lt;br&gt;
FETCH NEXT FROM RunningTotalCursor&lt;br&gt;
INTO @TransactionID, @ActualCost&lt;br&gt;
  &lt;br&gt;
WHILE @@FETCH_STATUS = 0&lt;br&gt;
BEGIN&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SET @RunningTotal = @RunningTotal + @ActualCost&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; INSERT @Results&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; VALUES (@TransactionID, @ActualCost, @RunningTotal)&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FETCH NEXT FROM RunningTotalCursor&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; INTO @TransactionID, @ActualCost&lt;br&gt;
END&lt;br&gt;
  &lt;br&gt;
CLOSE RunningTotalCursor&lt;br&gt;
  &lt;br&gt;
DEALLOCATE RunningTotalCursor&lt;br&gt;
  &lt;br&gt;
SELECT *&lt;br&gt;
FROM @Results&lt;br&gt;
ORDER BY TransactionID&lt;br&gt;
  &lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;
What's really unfortunate about the cursor approach is that you need to
use a temporary table if you want to return a single result set to the
client. I figured the additional I/O due to the temp table would
balance any improvement gains from the cursor approach, thereby
rendering my forum responses correct, and Paul wrong.&amp;nbsp; Well, 14 seconds
and 113,443 rows later, SSMS and my laptop declared Paul the undisputed
Champion of the Cursor.&lt;br&gt;
&lt;br&gt;
This cursor makes a lot of sense in this case.&amp;nbsp; The set-based query
works by looping over each row of the table, taking a sum of every
previous row.&amp;nbsp; So for the 10th row, 10 previous rows also need to be
visited.&amp;nbsp; For the 1000th row, 1000 previous rows need to be visited.&amp;nbsp;
And so on.&amp;nbsp; The larger the set gets, the worse performance will be--and
that's not going to be a merely linear decrease in performance.&amp;nbsp; Think
about this:&amp;nbsp; Using the set-based method to find the running sum over a
set of 100 rows, 5050 total rows need to be visited.&amp;nbsp; For a set of 200
rows, the query processor needs to visit 20100 total rows -- a
four-fold increase in the amount of work that must be done to satisfy
the query (O((N^2)/2), for those who are a bit more algorithmically
minded.)&lt;br&gt;
&lt;br&gt;
The cursor, on the other hand, needs to visit each row exactly once
(O(N)). By maintaining the running count in a variable, there is no
need to re-visit previous rows.&amp;nbsp; And as my laptop was so happy to show
me, the I/O cost due to the temp table does not overshadow the
performance improvement of having to visit so many less rows.&lt;br&gt;
&lt;br&gt;
So what have we learned today?&amp;nbsp; In my set-based singlemindedness I
failed to realize that the cursor does, indeed, have utility.&amp;nbsp; &lt;b&gt;Everything in moderation.&lt;br&gt;
&lt;br&gt;
&lt;/b&gt;Next steps?&amp;nbsp; I get the feeling that this can be made even faster by
employing a CLR routine.&amp;nbsp; Pull the data into a DataReader and loop over
that instead, which will completely eliminate the need for a temporary
table.&amp;nbsp; Watch for that experiment coming to this space soon.&lt;br&gt;
&lt;br&gt;
And next time you hear someone mention how horrible cursors are, remind
that person that there is a time and place for everything (&lt;a href="http://www.planearium2.de/scripts-204.htm"&gt;and it's called college&lt;/a&gt;).&lt;br&gt;
&lt;br&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=106" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Performance/default.aspx">Performance</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Swinging From Tree to Tree Using CTEs, Part 2: Adjacency to Nested Intervals</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/swinging-from-tree-to-tree-using-ctes-part-2-adjacency-to-nested-intervals.aspx</link><pubDate>Thu, 13 Jul 2006 01:49:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:105</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>4</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/105.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=105</wfw:commentRss><description>In our &lt;a href="http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/swinging-from-tree-to-tree-using-ctes-part-1-adjacency-to-nested-sets.aspx"&gt;previous installment&lt;/a&gt;, we saw how to convert Adjacency Lists into Nested Sets using a CTE.&lt;br&gt;


&lt;br&gt;


In this episode, we will convert the Adjacency List into a &lt;a href="http://www.dbazine.com/oracle/or-articles/tropashko4"&gt;Nested Intervals&lt;/a&gt;
encoding.&amp;nbsp; Specifically, this encoding will make use of the Nested
Intervals with Continued Fractions technique that Tropashko presented
in a &lt;a href="http://arxiv.org/ftp/cs/papers/0402/0402051.pdf"&gt;later paper&lt;/a&gt;.&lt;br&gt;


&lt;br&gt;


The key to this technique lies in using a slightly different form of
materialized path than was used in the last post.&amp;nbsp; Rather than
materializing the EmployeeIds into a path, the path will be created as
an &lt;i&gt;enumerated&lt;/i&gt;
representation, based on sibling ordering.&amp;nbsp; For example, the first
two levels of the AdventureWorks HumanResources.Employee table's tree
look like:&lt;br&gt;


&lt;br&gt;


&lt;blockquote&gt;
EmployeeId 109 (CEO)&lt;br&gt;
  &lt;blockquote&gt;EmployeeId 6 (Marketing Manager)&lt;br&gt;
EmployeeId 12 (VP Engineering)&lt;br&gt;
EmployeeId 42 (IS Manager)&lt;br&gt;
EmployeeId 140 (CFO)&lt;br&gt;
EmployeeId 148 (VP Production)&lt;br&gt;
EmployeeId 273 (VP Sales)&lt;br&gt;
    &lt;br&gt;
  &lt;/blockquote&gt;
&lt;/blockquote&gt;


Using the materialized path representation from the previous post, the paths to the second-level employees would be:&lt;br&gt;


&lt;br&gt;


&lt;blockquote&gt;6: 109.6&lt;br&gt;
12: 109.12&lt;br&gt;
42: 109.42&lt;br&gt;
140: 109.140&lt;br&gt;
148: 109.148&lt;br&gt;
273: 109.273&lt;br&gt;
  &lt;br&gt;
&lt;/blockquote&gt;


However, for this post, paths will instead be materialized based on
sibling ordering.&amp;nbsp; We don't know anything about how siblings
should be ordered in the AdventureWorks employee hierarchy (that's
probably a business rule question, if it matters at all).&amp;nbsp; So
ordering will be done by EmployeeId:&lt;br&gt;


&lt;br&gt;


&lt;blockquote&gt;6: 1.1&lt;br&gt;
12: 1.2&lt;br&gt;
42: 1.3&lt;br&gt;
140: 1.4&lt;br&gt;
148: 1.5&lt;br&gt;
273: 1.6&lt;br&gt;
&lt;/blockquote&gt;


The significance of this encoding is that for each path a rational
number can be generated, using a Euclidian algorithm (described very
well on &lt;a href="http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/cfINTRO.html"&gt;this web site&lt;/a&gt;).&amp;nbsp;
The algorithm works by iterating over each element on the path,
building a rational number as it goes.&amp;nbsp; The beauty of this, as
shown by Tropashko in his paper, is that by using these paths we can
determine an interval, beween which all children of a given path will
fall.&lt;br&gt;


&lt;br&gt;

In order to accomplish getting the path, the ROW_NUMBER function
will be used, but slightly differently than last time.&amp;nbsp; Instead of
creating a second CTE to reference the first, thereby getting the row
number for each element of the path, the ROW_NUMBER function will be
embedded within the recursive CTE itself, as in the following example:&lt;br&gt;


&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeRows AS&lt;br&gt;
(&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY EmployeeId) AS theRow&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY e.EmployeeId) AS theRow&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeRows x&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e ON e.ManagerId = x.EmployeeId&lt;br&gt;
)&lt;br&gt;
SELECT *&lt;br&gt;
FROM EmployeeRows&lt;br&gt;
ORDER BY &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ManagerId, &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId&lt;/font&gt;&lt;br&gt;
&lt;/blockquote&gt;


Interestingly, this example as-is will return &lt;i&gt;exactly the same results&lt;/i&gt; as the following, non-recursive CTE example:&lt;br&gt;


&lt;blockquote&gt;&lt;font face="Courier New"&gt;SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (PARTITION BY ManagerId ORDER BY EmployeeId) AS theRow&lt;br&gt;
FROM HumanResources.Employee&lt;br&gt;
ORDER BY &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ManagerId, &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId&lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;


So, why do we care?&amp;nbsp; Take a close look at the two examples.&amp;nbsp;
In the CTE example, the ROW_NUMBER function does not use PARTITION
BY.&amp;nbsp; Yet, results are &lt;i&gt;implicitly&lt;/i&gt; partitioned.&amp;nbsp; This
gives an interesting view into the inner-workings of CTEs.&amp;nbsp; As it
turns out, the recursive part of the CTE is called once per row
returned by the anchor or previous recursion.&amp;nbsp; This is not how I
originally expected CTEs to behave (I thought the recursive part would
be called once per rowset returned by the anchor or previous
recursion), but it does help us with this particular task!&lt;br&gt;


&lt;br&gt;


The current row number, at any given point in the recursion, represents
the enumeration for that node.&amp;nbsp; But because we're using a
recursive CTE, we also have access to the parent's enumaration.&amp;nbsp;
For instance, to build an enumerated path, the following T-SQL would be
used:&lt;br&gt;


&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeRows AS&lt;br&gt;
(&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(VARCHAR(MAX), ROW_NUMBER() OVER (ORDER BY EmployeeId)) AS thePath&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.thePath + '.' +
CONVERT(VARCHAR(MAX), ROW_NUMBER() OVER (ORDER BY e.EmployeeId)) AS
thePath&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeRows x&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e ON e.ManagerId = x.EmployeeId&lt;br&gt;
)&lt;br&gt;
SELECT *&lt;br&gt;
FROM EmployeeRows&lt;br&gt;
ORDER BY &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; thePath&lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;


Note that this sample can be made a bit more readable (and more
functional for later) by embeding the ROW_NUMBER in a derived table:&lt;br&gt;


&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeRows AS&lt;br&gt;
(&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(VARCHAR(MAX), y.theRow) AS thePath&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY EmployeeId) AS theRow&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ) y&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.thePath + '.' + CONVERT(VARCHAR(MAX), y.theRow) AS thePath&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.ManagerId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY e.EmployeeId) AS theRow&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeRows x&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e ON e.ManagerId = x.EmployeeId&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ) y&lt;br&gt;
)&lt;br&gt;
SELECT *&lt;br&gt;
FROM EmployeeRows&lt;br&gt;
ORDER BY &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; thePath&lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;


From here, it's a simple step to implement the Euclidian algorithm.&amp;nbsp; The algorithm is quite simple:&lt;br&gt;


&lt;ol&gt;&lt;li&gt;Set parentNumerator &amp;lt;- 1&lt;/li&gt;&lt;li&gt;Set parentDenominator &amp;lt;- 0&lt;/li&gt;&lt;li&gt;Set theElement &amp;lt;- first enumeration in the path&lt;/li&gt;&lt;li&gt;Set currentNumerator &amp;lt;- theElement&lt;/li&gt;&lt;li&gt;Set currentDenominator &amp;lt;- 1&lt;/li&gt;&lt;li&gt;Set theElement &amp;lt;- next enumeration in the path&lt;/li&gt;&lt;li&gt;Set previousParentNumerator &amp;lt;- parentNumerator&lt;/li&gt;&lt;li&gt;Set previousParentDenominator &amp;lt;- parentDenominator&lt;/li&gt;&lt;li&gt;Set parentNumerator &amp;lt;- currentNumerator&lt;/li&gt;&lt;li&gt;Set parentDenominator &amp;lt;- currentDenominator&lt;/li&gt;&lt;li&gt;Set currentNumerator &amp;lt;- (parentNumerator * theElement) + previousParentNumerator&lt;/li&gt;&lt;li&gt;Set currentDenominator &amp;lt;- (parentDenominator * theElement) + previousParentDenominator&lt;/li&gt;&lt;li&gt;If the current element is not the final node in the path, goto 6.&lt;/li&gt;&lt;/ol&gt;

This seems a bit hairy, but I think that looking at the algorithm
and spending a few minutes with our old friends pencil and paper will
make it quite clear.&amp;nbsp; Also, look at the web site linked
above.&amp;nbsp; Here is how I've implemented the algorithm using a CTE:&lt;br&gt;


&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeRows AS&lt;br&gt;

(&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(VARCHAR(MAX), theRow) AS thePath,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, 1) AS prevNumer,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, 0) AS prevDenom,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, theRow) AS currNumer,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, 1) AS currDenom&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY EmployeeId) AS theRow&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; ) y&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.EmployeeId,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.thePath + '.' + CONVERT(VARCHAR(MAX), y.theRow) AS thePath,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; prevNumer = y.currNumer,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; prevDenom = y.currDenom,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; (y.currNumer * y.theRow) + y.prevNumer&amp;nbsp; AS currNumer,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; (y.currDenom * y.theRow) + y.prevDenom&amp;nbsp; AS currDenom&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT &lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeID,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.thePath,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.currNumer,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.currDenom,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.prevNumer,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.prevDenom,&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY e.EmployeeID) AS therow&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeRows x&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e ON e.ManagerId = x.EmployeeId&lt;br&gt;

&amp;nbsp;&amp;nbsp;&amp;nbsp; ) y&lt;br&gt;

)&lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;

Note that in this case I didn't require the use
of the temporary variables; the previous anchor/recursive parts act as
temporary storage enough.&amp;nbsp; &lt;br&gt;


&lt;br&gt;


Readers will also hopefully notice that I haven't yet included a SELECT
to get the data from the CTE!&amp;nbsp; This is because I'd like to explain
briefly what it will do.&amp;nbsp; In his paper, Tropashko explains that for
each node, the intervals for the children of that node will fall into
an interval between the encoding of that node (currNumer / currDenom)
and the numerator of the previous node plus the numerator for the
current node, divided by the denominator of the previous node plus the
denominator for the current node ((currNumer + prevNumer) / (currDenom
+ prevDenom)).&amp;nbsp; Quite wordy here.&amp;nbsp; Refer to the paper for a better
explanation and proof.&lt;br&gt;


&lt;blockquote&gt;
&lt;/blockquote&gt;



Anyway, the completed query follows:&lt;br&gt;


&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeRows AS&lt;br&gt;
(&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(VARCHAR(MAX), theRow) AS thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, 1) AS prevNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, 0) AS prevDenom,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, theRow) AS currNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(BIGINT, 1) AS currDenom&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY EmployeeId) AS theRow&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ) y&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; y.thePath + '.' + CONVERT(VARCHAR(MAX), y.theRow) AS thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; prevNumer = y.currNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; prevDenom = y.currDenom,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; (y.currNumer * y.theRow) + y.prevNumer&amp;nbsp; AS currNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; (y.currDenom * y.theRow) + y.prevDenom&amp;nbsp; AS currDenom&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeID,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.currNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.currDenom,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.prevNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.prevDenom,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (ORDER BY e.EmployeeID) AS therow&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeRows x&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e ON e.ManagerId = x.EmployeeId&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; ) y&lt;br&gt;
)&lt;br&gt;
SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; currNumer AS startNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; currDenom AS startDenom,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; currNumer + prevNumer AS endNumer,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; currDenom + prevDenom AS endDenom&lt;br&gt;
FROM EmployeeRows&lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;


For each node (EmployeeId), you now have an interval (start and end)
within which all children intervals will fall.&amp;nbsp; Note that computation
must be done by the interval, not by the current node's encoding.&amp;nbsp; The
reason becomes apparent when looking at the encodings for 1.1.1 and
1.2.&amp;nbsp; They are the same; however, their intervals do not overlap.&amp;nbsp; As a
matter of fact, encodings will be the same for every next sibling/first
child pair.&amp;nbsp; But the intervals remain nested, and if proper queries are
written there will be no confusion.&lt;br&gt;


&lt;br&gt;


So that's a first step towards using the Nested Intervals Model in SQL

Server 2005.&amp;nbsp; Stay tuned for more... And as always, feel free to post
questions or comments.&amp;nbsp; I know some of this material can be confusing
(at least, it was to me before I wrote this post!)&lt;br&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=105" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Hierarchies/default.aspx">Hierarchies</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Swinging From Tree to Tree Using CTEs, Part 1: Adjacency to Nested Sets</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/swinging-from-tree-to-tree-using-ctes-part-1-adjacency-to-nested-sets.aspx</link><pubDate>Thu, 13 Jul 2006 01:47:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:103</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>19</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/103.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=103</wfw:commentRss><description>I'm not sure how many times over the last several years I've seen the
same tired article titles... "Climbing Trees in SQL," "Climbing Up the
SQL Tree," or maybe, "Naked Coeds Playing in the Trees!" ... Oh
wait, I think that last one might be something else.&lt;br&gt;

&lt;br&gt;

But anyway, the point is, I'm going to adhere to that standard.&amp;nbsp;
But I'm adding a Tarzan-esque theme to this post, because we're not
going to just climb a tree.&amp;nbsp; We're going to swing about between
trees.&amp;nbsp; Different types of tree representations are appropriate
for different scenarios.&amp;nbsp; Which is why, as I pointed out in my &lt;a href="http://searchsqlserver.techtarget.com/tip/1,289483,sid87_gci1107414,00.html?FromTaxonomy=/pr/301329"&gt;recent SearchSQLServer article on recursive Common Table Expressions&lt;/a&gt;,
we have so many different ways of representing them.&amp;nbsp; Adjacency
Lists, Materialized Paths, Nested Sets, and Nested Intervals spring to
mind.&amp;nbsp; And there are probably others.&lt;br&gt;

&lt;br&gt;

My article shows how to use the CTEs, in conjunction with a dynamically
generated materialized path, to manipulate an Adjacency List, getting
many of the benefits associated with using the Nested Sets Model.&amp;nbsp;
And that's great.&amp;nbsp; But the Nested Sets Model itself might be
useful in your endeavors.&amp;nbsp; So in this post, I will show how to
extend the CTE discussed in that article.&lt;br&gt;

&lt;br&gt;

The end-result CTE in the article, which can be run in the
AdventureWorks database, looks something like this (renamed for the
sake of this post):&lt;br&gt;

&lt;br&gt;

&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeLevels AS&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
(&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(VARCHAR(MAX), EmployeeId) AS thePath,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; 1 AS Level&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
  &lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
  &lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeId,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.thePath + '.' + CONVERT(VARCHAR(MAX), e.EmployeeId) AS thePath,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.Level + 1 AS Level&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeLevels x&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e on e.ManagerId = x.EmployeeId&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;
)&lt;/font&gt;&lt;br&gt;
&lt;/blockquote&gt;

thePath is the materialized path, a '.' delimited breadcrumb from the
root to each node.&amp;nbsp; We also end up with a level, which represents
how many nodes away from the root in the hierarchy each employee sits
(very important for those upwardly-mobile junior execs, no doubt!)&lt;br&gt;

&lt;br&gt;

I'm going to assume that readers of this post are familiar with the &lt;a href="http://www.intelligententerprise.com/001020/celko.jhtml?_requestid=235427"&gt;Nested Sets Model&lt;/a&gt;, as popularized by Joe Celko.&amp;nbsp; If not, read the link.&lt;br&gt;

&lt;br&gt;

After staring at the Nested Sets for a while, I arrived at the following mathematical properties:&lt;br&gt;

&lt;br&gt;

&lt;ul&gt;&lt;li&gt;Value of Lft for the root node is 1&lt;/li&gt;&lt;li&gt;Value of Rgt for the root node is 2 * (Number of nodes)&lt;/li&gt;&lt;li&gt;Value of Lft for any node is ((Number of nodes visited) * 2) - (Level of current node)&lt;/li&gt;&lt;li&gt;Value of Rgt for any node is (Lft value) + ((Number of subnodes) * 2) + 1&lt;/li&gt;&lt;/ul&gt;
I think the only factor here that requires further explanation is
"(Number of nodes visited)".&amp;nbsp; By this, I mean the number of
nodes that would have been visited (including the current node) if one
were doing a
preorder traversal of the tree. Luckily, this number is quite easy
to determine.&amp;nbsp; The row number for each row, as determined by
ordering by the materialized path, &lt;i&gt;is&lt;/i&gt;
this number.&amp;nbsp; I
encourage readers to validate this with pencil and paper if it doesn't
quite make sense reading on the screen.&amp;nbsp; Draw a simple tree and
traverse, starting from the lefthand side of the root node.&lt;br&gt;

&lt;br&gt;
But how to translate that into T-SQL?&amp;nbsp; Luckily, SQL Server 2005, in addition to CTEs, also includes the very useful &lt;a href="http://blogs.conchango.com/jamiethomson/archive/2005/02/16/1025.aspx"&gt;ROW_NUMBER() function&lt;/a&gt;.&amp;nbsp;
So to get the row number, we simply need to add another CTE into the
chain (did you know that successive CTEs can use the results of
previous CTEs?):&lt;br&gt;

&lt;br&gt;

&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeLevels AS&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;(&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(VARCHAR(MAX), EmployeeId) AS thePath,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; 1 AS Level&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;/font&gt;&lt;br&gt;
  &lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;/font&gt;&lt;br&gt;
  &lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeId,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.thePath + '.' + CONVERT(VARCHAR(MAX), e.EmployeeId) AS thePath,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.Level + 1 AS Level&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeLevels x&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e on e.ManagerId = x.EmployeeId&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;),&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;EmployeeRows AS&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;(&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;EmployeeLevels.*,&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;ROW_NUMBER() OVER (ORDER BY thePath) AS Row&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeLevels&lt;/font&gt;&lt;br&gt;
  &lt;font face="Courier New"&gt;)&lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;

We now have current level and number of nodes visited, which gives us
the Lft value for each node.&amp;nbsp; But how to determine the number of
subnodes, in order to get the Rgt value?&amp;nbsp; Luckily, the materialized
path also gives us that capability...&lt;br&gt;

&lt;br&gt;

For any given node, number of subnodes can be determined by counting
all nodes whose path value is LIKE the current path value + '.%'.&lt;br&gt;

&lt;br&gt;

The resultant query should be fairly obvious at this point:&lt;br&gt;

&lt;br&gt;

&lt;blockquote&gt;&lt;font face="Courier New"&gt;WITH EmployeeLevels AS&lt;br&gt;
(&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; CONVERT(VARCHAR(MAX), EmployeeId) AS thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; 1 AS Level&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM HumanResources.Employee&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ManagerId IS NULL&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; UNION ALL&lt;br&gt;
  &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; e.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.thePath + '.' + CONVERT(VARCHAR(MAX), e.EmployeeId) AS thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; x.Level + 1 AS Level&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeLevels x&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN HumanResources.Employee e on e.ManagerId = x.EmployeeId&lt;br&gt;
),&lt;br&gt;
EmployeeRows AS&lt;br&gt;
(&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;EmployeeLevels.*,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;ROW_NUMBER() OVER (ORDER BY thePath) AS Row&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeLevels&lt;br&gt;
)&lt;br&gt;
SELECT&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;ER.EmployeeId,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;ER.thePath,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;ER.Level,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;ER.Row,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;(ER.Row * 2) - ER.Level AS Lft,&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;((ER.Row * 2) - ER.Level) + &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; (&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT COUNT(*) * 2&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; FROM EmployeeRows ER2 &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE ER2.thePath LIKE ER.thePath + '.%'&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; ) + 1 AS Rgt&lt;br&gt;
FROM EmployeeRows ER&lt;br&gt;
ORDER BY thePath&lt;br&gt;
  &lt;/font&gt;&lt;/blockquote&gt;

... And that's it!&amp;nbsp; We have now converted the Adjacency List tree into a Nested Sets tree.&lt;br&gt;

&lt;br&gt;

In the next installment, I will show how to determine the Nested
Intervals encoding from an Adjacency List tree, also using recursive
CTEs.&lt;br&gt;

&lt;br&gt;

Questions?&lt;br&gt;

&lt;br&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=103" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Hierarchies/default.aspx">Hierarchies</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Texas Hold 'em Hint #1</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/texas-hold-em-hint-1.aspx</link><pubDate>Thu, 13 Jul 2006 01:46:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:102</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>0</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/102.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=102</wfw:commentRss><description>The other day I annouced the Texas Hold 'em SQL Challenge.
I haven't gotten any feedback on it yet, so I have no idea if anyone is
working on it, but I thought I'd get the ball rolling and come up with
my own solution...
&lt;p&gt;The first step I've decided to take in solving this problem is
to figure out how, given two or more 5-card hands, to figure out a
winner. What's necessary is a way of weighting each hand so that better
hands will have a higher value. Or at least, that's the tactic I took.
&lt;/p&gt;&lt;p&gt;
I first checked out &lt;a href="http://boardgames.about.com/cs/poker/a/poker_hands.htm" target="#"&gt;this list of poker hands from strongest to weakest&lt;/a&gt;.  After staring at it for a while, I realized that some patterns can be derived from the list:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A two-pair is two pairs (wow, getting deep here, aren't we!)
&lt;/li&gt;&lt;li&gt;A full house is a pair plus a three of a kind
&lt;/li&gt;&lt;li&gt;A straight flush is a straight plus a flush
&lt;/li&gt;&lt;li&gt;A royal flush is a type of straight flush with the &lt;i&gt;highest cards in the deck&lt;/i&gt;
&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;Obvious, yes; but the final point is important -- is there a way of
weighting the cards such that we can determine, using only a single
number, the weight of a given hand, independent of whether or not it
falls into one of these categories? That is, how can we know that a
straight flush with a high king is worth less than a straight flush
with a high ace? And how can we know that a hand with a pair of 10s and
a high jack beats a hand with a pair of 10s and a high 9?
&lt;/p&gt;&lt;p&gt;
The answer is to use our old friend, binary:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;table&gt;
&lt;tr&gt;&lt;th&gt;Card&lt;/th&gt;&lt;th&gt;Value&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;32&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;64&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;128&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;256&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;512&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Jack&lt;/td&gt;&lt;td&gt;1024&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Queen&lt;/td&gt;&lt;td&gt;2048&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;King&lt;/td&gt;&lt;td&gt;4096&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ace&lt;/td&gt;&lt;td&gt;8192 (1)&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;
Note that the ace will be weighted as 1, should it happen to be part of a straight with a high 5.
&lt;/p&gt;&lt;p&gt;So what does this binary weighting give us? The property that's
especially useful in this case is that for every power of 2, it's
impossible to reach that number by adding up any unique combination of
powers of 2 less than that power. Restated in terms of this project, if
a hand has only a high card, say a king, then that hand will be
weighted higher than any other high card hand that doesn't have either
a king or an ace.
&lt;/p&gt;&lt;p&gt;
But weighting alone isn't enough.  This can be shown using the following two hands:
&lt;/p&gt;&lt;p&gt;
2 3 Q K A
&lt;br&gt;
2 3 Q K K
&lt;/p&gt;&lt;p&gt;The weighted values of these hands are 14342 and 10246,
respectively (2 + 4 + 2048 + 4096 + 8192 and 2 + 4 + 2048 + 4096 +
4096). However, the second is a better hand. How to solve this issue?
Bonuses.
&lt;/p&gt;&lt;p&gt;After conducting a completely non-scientific trial-and-error
analysis, I arrived at the following bonus structure, which seems to
work:
&lt;/p&gt;&lt;p&gt;To weight a hand, take the base weight (i.e., the sum of the
weights of each card in the hand) and add the following bonuses per
condition:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Pair: 2000000000 + (8192 * pair card weight)
&lt;/li&gt;&lt;li&gt;Three of a kind: 6000000000 + (131072 * three of a kind card weight)
&lt;/li&gt;&lt;li&gt;Straight: 5500000000
&lt;/li&gt;&lt;li&gt;Flush: 5800000000
&lt;/li&gt;&lt;li&gt;Four of a kind:  9500000000 + (131072 * four of a kind card weight)
&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;Note that "pair card weight," for example, refers to the weight of
the card type that makes up the pair. So for a pair of twos, the
calculation would be (8192 * 2).
&lt;/p&gt;&lt;p&gt;This formula can be expanded for each type of hand. Two pair?
Add 2000000000 + (8192 * the weight of the card for the first pair) +
(8192 * the weight of the card for the second pair). Full house?
Combine the pair with the three of a kind. Same with straight flush.
And since the base weights are being added, the hands will naturally be
weighted heavier for higher cards.
&lt;/p&gt;&lt;p&gt;
There may have been a more elegant solution, but this one does the trick for now.
&lt;/p&gt;&lt;p&gt;
But enough math, &lt;b&gt;on to the SQL&lt;/b&gt;! What I wanted was a
table containing every possible hand and its weights. But first I
needed a table with all of the possible cards:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="CODE"&gt;CREATE TABLE dbo.Cards&lt;br&gt;(&lt;br&gt;	CardID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,&lt;br&gt;	Suit CHAR(1) NOT NULL,&lt;br&gt;	Card VARCHAR(2) NOT NULL,&lt;br&gt;	Weight INT NOT NULL&lt;br&gt;)&lt;br&gt;&lt;br&gt;INSERT dbo.Cards (Suit, Card, Weight)&lt;br&gt;SELECT S.Suit, C.Card, C.Weight&lt;br&gt;FROM&lt;br&gt;(&lt;br&gt;	SELECT 'H'&lt;br&gt;	UNION ALL&lt;br&gt;	SELECT 'D'&lt;br&gt;	UNION ALL&lt;br&gt;	SELECT 'S'&lt;br&gt;	UNION ALL&lt;br&gt;	SELECT 'C'&lt;br&gt;) S (Suit)&lt;br&gt;CROSS JOIN&lt;br&gt;(&lt;br&gt;	SELECT CONVERT(VARCHAR(2), number), POWER(2, number-1)&lt;br&gt;	FROM master..spt_values&lt;br&gt;	WHERE type='p'&lt;br&gt;		AND number BETWEEN 2 AND 10&lt;br&gt;	UNION ALL&lt;br&gt;	SELECT 'J', 1024&lt;br&gt;	UNION ALL&lt;br&gt;	SELECT 'Q', 2048&lt;br&gt;	UNION ALL&lt;br&gt;	SELECT 'K', 4096&lt;br&gt;	UNION ALL&lt;br&gt;	SELECT 'A', 8192&lt;br&gt;) C (Card, Weight)&lt;br&gt;ORDER BY &lt;br&gt;	Suit, &lt;br&gt;	Weight&lt;br&gt;GO&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;Yes, I know that (Suit, Card) could have formed a natural primary
key, but the table of hands is going ta have 2598960 rows -- which you
can prove using the formulas from &lt;a href="http://mathforum.org/dr.math/faq/faq.comb.perm.html" target="#"&gt;this website&lt;/a&gt;:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="CODE"&gt;52 cards, 5 cards per hand:&lt;br&gt;&lt;br&gt;            n!&lt;br&gt;n_C_k = ----------&lt;br&gt;        k!(n - k)!&lt;br&gt;&lt;br&gt;      52!&lt;br&gt;= -----------&lt;br&gt;  5!(52 - 5)!&lt;br&gt;&lt;br&gt;      52!&lt;br&gt;= -----------&lt;br&gt;    5!(47)!&lt;br&gt;&lt;br&gt;  52 * 51 * 50 * 49 * 48&lt;br&gt;= ----------------------&lt;br&gt;     5 * 4 * 3 * 2 * 1&lt;br&gt;&lt;br&gt;   311875200&lt;br&gt;= -----------&lt;br&gt;      120&lt;br&gt;&lt;br&gt;= 2598960&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;
...And to believe that my high school algebra teacher had the gall to flunk me!
&lt;/p&gt;&lt;p&gt;
Anyway, with 2.59 million rows, I'd rather just use a single-column
surrogate. It makes like a lot easier. So how do we generate all of
these rows?
&lt;/p&gt;&lt;p&gt;
First, we need every possible hand:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="CODE"&gt;SELECT&lt;br&gt;	C1.CardId AS Card1,&lt;br&gt;	C2.CardId AS Card2,&lt;br&gt;	C3.CardId AS Card3,&lt;br&gt;	C4.CardId AS Card4,&lt;br&gt;	C5.CardId AS Card5&lt;br&gt;FROM dbo.Cards C1&lt;br&gt;JOIN dbo.Cards C2 ON &lt;br&gt;	C2.CardId &amp;gt; C1.CardId&lt;br&gt;JOIN dbo.Cards C3 ON &lt;br&gt;	C3.CardId &amp;gt; C2.CardId&lt;br&gt;JOIN dbo.Cards C4 ON&lt;br&gt;	C4.CardId &amp;gt; C3.CardID&lt;br&gt;JOIN dbo.Cards C5 ON&lt;br&gt;	C5.CardId &amp;gt; C4.CardId&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;
Easy enough.  And notice how nice and simple that surrogate makes life... &lt;a href="http://lazyway.blogs.com/lazy_way/" target="#"&gt;Laziness is a virtue!&lt;/a&gt;
&lt;/p&gt;&lt;p&gt;Next step, apply the weighting logic. This is quite a bit more
complex, and I'm not going to spend a lot of time going into detail
since it's currently quite late here and I need to work in the morning.
But remember the following rules: If you have two or more of the same
index in a hand (pair, three of a kind, four of a kind), that hand
can't be a flush. And if you have only two of a given card, that hand
can't have a three of a kind of that card. And if you have only three
of a card, that hand can't have a four of a kind for that card! &lt;/p&gt;&lt;p&gt;
Given these rules, I realized that you can test for all of these
conditions and add up the results, without doing any kind of
backtracking. This made the job quite a bit simpler. I did the
following: First, calculate the base weight and figure out whether we
have a straight, for any given row. Add up the results. Then "unpivot"
(using a cross-join to five rows of numbers) and use aggregates to
figure out what pairs, threes of a kind, fours of a kind, and flushes
exist in each hand... Re-pivot, and stuff everything into a table. Oh,
and I implemented some rudimentary batching so that 2.5 million rows
wouldn't be inserted in a single transaction.
&lt;/p&gt;&lt;p&gt;
The complete SQL is here:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="CODE"&gt;CREATE TABLE dbo.PokerHands&lt;br&gt;(&lt;br&gt;	Card1 INT NOT NULL,&lt;br&gt;	Card2 INT NOT NULL,&lt;br&gt;	Card3 INT NOT NULL,&lt;br&gt;	Card4 INT NOT NULL,&lt;br&gt;	Card5 INT NOT NULL,&lt;br&gt;	Weight BIGINT NOT NULL,&lt;br&gt;	CONSTRAINT PK_PokerHands PRIMARY KEY CLUSTERED&lt;br&gt;		(Card1, Card2, Card3, Card4, Card5)&lt;br&gt;)&lt;br&gt;GO&lt;br&gt;&lt;br&gt;DECLARE @CardId INT&lt;br&gt;SET @CardId = 1&lt;br&gt;&lt;br&gt;WHILE @CardId &amp;lt;= 52&lt;br&gt;BEGIN&lt;br&gt;	INSERT dbo.PokerHands&lt;br&gt;		(Card1, Card2, Card3, Card4, Card5, Weight)&lt;br&gt;	SELECT &lt;br&gt;		y.Card1,&lt;br&gt;		y.Card2,&lt;br&gt;		y.Card3,&lt;br&gt;		y.Card4,&lt;br&gt;		y.Card5,&lt;br&gt;		-- 2&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 2 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 2)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 2)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 2)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 3&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 4 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 4)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 4)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 4)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 4&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 8 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 8)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 8)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 8)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 5&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 16 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 16)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 16)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 16)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 6&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 32 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 32)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 32)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 32)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 7&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 64 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 64)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 64)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 64)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 8&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 2 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 128)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 128)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 128)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 9&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 256 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 256)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 256)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 256)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- 10&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 512 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 512)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 512)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 512)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- J&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 1024 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 1024)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 1024)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 1024)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- Q&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 2048 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 2048)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 2048)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 2048)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- K&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 4096 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 4096)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 4096)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 4096)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		-- A&lt;br&gt;		CASE &lt;br&gt;			SUM(&lt;br&gt;				CASE y.CardWeight&lt;br&gt;					WHEN 8192 THEN 1&lt;br&gt;					ELSE 0&lt;br&gt;				END&lt;br&gt;			)&lt;br&gt;			WHEN 4 THEN 9500000000 + (131072 * 8192)&lt;br&gt;			WHEN 3 THEN 6000000000 + (131072 * 8192)&lt;br&gt;			WHEN 2 THEN 2000000000 + (8192 * 8192)&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		--Flush&lt;br&gt;		CASE COUNT(DISTINCT y.CardSuit)&lt;br&gt;			WHEN 1 THEN 5800000000&lt;br&gt;			ELSE 0&lt;br&gt;		END +&lt;br&gt;		--Base Weight + Straight&lt;br&gt;		MIN(y.TotalWeight) AS Weight&lt;br&gt;	FROM&lt;br&gt;	(&lt;br&gt;		SELECT TOP 100 PERCENT&lt;br&gt;			C1.CardId AS Card1,&lt;br&gt;			C2.CardId AS Card2,&lt;br&gt;			C3.CardId AS Card3,&lt;br&gt;			C4.CardId AS Card4,&lt;br&gt;			C5.CardId AS Card5,&lt;br&gt;			--Straight&lt;br&gt;			CASE WHEN &lt;br&gt;				(C2.Weight = 2 * C1.Weight&lt;br&gt;				AND C3.Weight = 2 * C2.Weight&lt;br&gt;				AND C4.Weight = 2 * C3.Weight&lt;br&gt;				AND (&lt;br&gt;					(C5.Weight = 2 * C4.Weight)&lt;br&gt;					OR &lt;br&gt;					(C2.Card = '2' AND C5.Card = 'A')))&lt;br&gt;				THEN &lt;br&gt;					CASE &lt;br&gt;						WHEN (C2.Card = '2' AND C5.Card = 'A')&lt;br&gt;							THEN 5500000000 - 8191 --Make A = weight 1&lt;br&gt;						ELSE 5500000000&lt;br&gt;					END&lt;br&gt;				ELSE 0&lt;br&gt;			END +&lt;br&gt;			--Base weight&lt;br&gt;			C1.Weight + C2.Weight + C3.Weight + C4.Weight + C5.Weight&lt;br&gt;				AS TotalWeight,&lt;br&gt;			CASE x.number&lt;br&gt;				WHEN 1 THEN C1.Weight&lt;br&gt;				WHEN 2 THEN C2.Weight&lt;br&gt;				WHEN 3 THEN C3.Weight&lt;br&gt;				WHEN 4 THEN C4.Weight&lt;br&gt;				WHEN 5 THEN C5.Weight&lt;br&gt;			END AS CardWeight,&lt;br&gt;			CASE x.number&lt;br&gt;				WHEN 1 THEN C1.Suit&lt;br&gt;				WHEN 2 THEN C2.Suit&lt;br&gt;				WHEN 3 THEN C3.Suit&lt;br&gt;				WHEN 4 THEN C4.Suit&lt;br&gt;				WHEN 5 THEN C5.Suit&lt;br&gt;			END AS CardSuit&lt;br&gt;		FROM dbo.Cards C1&lt;br&gt;		JOIN dbo.Cards C2 ON &lt;br&gt;			C2.CardId &amp;gt; C1.CardId&lt;br&gt;		JOIN dbo.Cards C3 ON &lt;br&gt;			C3.CardId &amp;gt; C2.CardId&lt;br&gt;		JOIN dbo.Cards C4 ON&lt;br&gt;			C4.CardId &amp;gt; C3.CardID&lt;br&gt;		JOIN dbo.Cards C5 ON&lt;br&gt;			C5.CardId &amp;gt; C4.CardId&lt;br&gt;		CROSS JOIN&lt;br&gt;		(&lt;br&gt;			SELECT number&lt;br&gt;			FROM master..spt_values&lt;br&gt;			WHERE type='p'&lt;br&gt;				AND number BETWEEN 1 and 5&lt;br&gt;		) x&lt;br&gt;		WHERE C1.CardId = @CardId&lt;br&gt;		ORDER BY&lt;br&gt;			C1.CardId,&lt;br&gt;			C2.CardId,&lt;br&gt;			C3.CardId,&lt;br&gt;			C4.CardId,&lt;br&gt;			C5.CardId&lt;br&gt;	) y&lt;br&gt;	GROUP BY&lt;br&gt;		y.Card1,&lt;br&gt;		y.Card2,&lt;br&gt;		y.Card3,&lt;br&gt;		y.Card4,&lt;br&gt;		y.Card5&lt;br&gt;&lt;br&gt;	SET @CardId = @CardId + 1&lt;br&gt;END&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;... And that's about it! Assuming I made no logic mistakes -- which
of course is a very safe assumption because it's me and I never do that
-- this is a relatively good start to the Challenge! So take it from
here... Or at least until I post the next piece of the puzzle.
&lt;/p&gt;&lt;p&gt;In retrospect, I think this would be a good use for a CLR UDT.
If I have some time, I'll post that version in the coming days. Enjoy!&lt;/p&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=102" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Looping over routines using sp_foreachroutine</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/looping-over-routines-using-sp-foreachroutine.aspx</link><pubDate>Thu, 13 Jul 2006 01:45:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:101</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>5</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/101.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=101</wfw:commentRss><description>Of all of the undocumented stored procedures shipped with SQL Server, there are two in particular that I constantly use: &lt;a href="http://www.databasejournal.com/features/mssql/article.php/3441031" target="#"&gt;sp_MSforeachtable and sp_MSforeachdb&lt;/a&gt;.
These procedures internally loop over each non-Microsoft shipped (i.e.
user-defined) table in the current database, or each database on the
current server, respectively. During this loop, the procedures perform
whatever action(s) are specified by the user (in the parameters). For
instance, what if you want to re-index every table in the database?
Sure, you could write your own cursor, but &lt;i&gt;why bother?&lt;/i&gt;  Use the following T-SQL instead:
&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;EXEC sp_MSforeachtable 'DBCC DBREINDEX(''?'')'&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;
Convenient, isn't it? But I won't get into any more detail on these.
Gregory Larsen does a good job of that in the article linked above.
&lt;/p&gt;&lt;p&gt;What I'd like to show instead is a very simple modification
I've made to sp_MSforeachtable. It's great to loop over tables and
databases, but sometimes we want to loop over &lt;i&gt;routines&lt;/i&gt;
(a collective term for procedures, functions, triggers, and views)
instead. Perhaps you want to grant pemissions to a user. Or perhaps you
want to roll out some &lt;a href="http://www.datamanipulation.net/TSQLMacro"&gt;TSQLMacro&lt;/a&gt;
updates to every routine in the database instead of just one, as is
supported by the current version of the framework... And now you know
how it will be done in the next version.
&lt;/p&gt;&lt;p&gt;
Presenting &lt;b&gt;sp_foreachroutine&lt;/b&gt;:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;CREATE PROCEDURE dbo.sp_foreachroutine&lt;br&gt;	@command1 nvarchar(2000), &lt;br&gt;	@replacechar nchar(1) = N'?', &lt;br&gt;	@command2 nvarchar(2000) = null,&lt;br&gt;	@command3 nvarchar(2000) = null, &lt;br&gt;	@whereand nvarchar(2000) = null,&lt;br&gt;	@precommand nvarchar(2000) = null, &lt;br&gt;	@postcommand nvarchar(2000) = null,&lt;br&gt;	@routinetype nvarchar(20) = null&lt;br&gt;AS&lt;br&gt;BEGIN&lt;br&gt;	/* This proc returns one or more rows for each procedure (optionally, matching @where), &lt;br&gt;		with each procedure defaulting to its own result set */&lt;br&gt;	/* @precommand and @postcommand may be used to force a single result set via a temp table. */&lt;br&gt;&lt;br&gt;	/* Preprocessor won't replace within quotes so have to use str(). */&lt;br&gt;	declare @mscat nvarchar(12)&lt;br&gt;	select @mscat = ltrim(str(convert(int, 0x0002)))&lt;br&gt;&lt;br&gt;	if (@precommand is not null)&lt;br&gt;		exec(@precommand)&lt;br&gt;&lt;br&gt;	/* Create the select */&lt;br&gt;&lt;br&gt;	declare @sql nvarchar(4000)&lt;br&gt;	set @sql =&lt;br&gt;		N'declare hCForEach cursor global for ' &lt;br&gt;		 + N' select ''['' + REPLACE(user_name(uid), N'']'', N'']]'') + '']'' + ''.'' + ' &lt;br&gt;			+ N' ''['' + REPLACE(object_name(id), N'']'', N'']]'') + '']'' ' &lt;br&gt;		 + N' from dbo.sysobjects o '&lt;br&gt;	         + N' where OBJECTPROPERTY(o.id, N''IsMSShipped'') = 0 '&lt;br&gt;		 + 	CASE @routinetype&lt;br&gt;				WHEN 'procedure' THEN ' and OBJECTPROPERTY(o.id, N''IsProcedure'') = 1 '&lt;br&gt;				WHEN 'function' THEN ' and (OBJECTPROPERTY(o.id, N''IsScalarFunction'') = 1 '&lt;br&gt;					+ ' or OBJECTPROPERTY(o.id, N''IsTableFunction'') = 1) '&lt;br&gt;				WHEN 'view' THEN ' and OBJECTPROPERTY(o.id, N''IsView'') = 1 '&lt;br&gt;				WHEN 'trigger' THEN ' and OBJECTPROPERTY(o.id, N''IsTrigger'') = 1 '&lt;br&gt;				ELSE ' and ( ' &lt;br&gt;					+ ' OBJECTPROPERTY(o.id, N''IsProcedure'') = 1 '&lt;br&gt;					+ ' or OBJECTPROPERTY(o.id, N''IsScalarFunction'') = 1 '&lt;br&gt;					+ ' or OBJECTPROPERTY(o.id, N''IsTableFunction'') = 1 '&lt;br&gt;					+ ' or OBJECTPROPERTY(o.id, N''IsView'') = 1 '&lt;br&gt;					+ ' or OBJECTPROPERTY(o.id, N''IsTrigger'') = 1 '&lt;br&gt;					+ ' ) '&lt;br&gt;			END&lt;br&gt;	         + COALESCE(@whereand, '')&lt;br&gt;&lt;br&gt;	exec(@sql)&lt;br&gt;	declare @retval int&lt;br&gt;	select @retval = @@error&lt;br&gt;	if (@retval = 0)&lt;br&gt;		exec @retval = sp_MSforeach_worker @command1, @replacechar, @command2, @command3&lt;br&gt;&lt;br&gt;	if (@retval = 0 and @postcommand is not null)&lt;br&gt;		exec(@postcommand)&lt;br&gt;&lt;br&gt;	return @retval&lt;br&gt;END&lt;br&gt;GO&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;Regular readers of this blog will note that the formatting isn't
consistent with my usual standards. But since this was a port from an
MS-written proc, I decided to keep things fairly consistent with what
was already there. I've also added an additional parameter that wasn't
present in sp_MSforeachtable: @routinetype, which lets the user select
a specific type of routine to loop over. So, for instance, if you only
want views, pass in 'view'. Same for functions ('function'), triggers
('trigger') and procedures ('procedure'). Pass in any other value -- or
leave it NULL -- and you'll get all routines in the database.
&lt;/p&gt;&lt;p&gt;This procedure keeps the sp_ prefix on purpose; it's meant to
be created in the master database, and makes use of the MS-shipped
sp_MSforeach_worker stored procedure, which lets it do its work.
&lt;/p&gt;&lt;p&gt;Using it is simple. ? is the default substitution character
(this can be changed using the @replacechar parameter). So to print a
list of all routines in the current database, use:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;EXEC sp_foreachroutine 'print ''?'''&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;
For just functions, use the optional @routinetype parameter:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;EXEC sp_foreachroutine 'print ''?''', @routinetype = 'function'&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;
Enjoy!&lt;/p&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=101" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Scripts/default.aspx">Scripts</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>SQL Server 2005 T-SQL: Aggregates and the OVER clause</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/sql-server-2005-t-sql-aggregates-and-the-over-clause.aspx</link><pubDate>Thu, 13 Jul 2006 01:44:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:100</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>3</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/100.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=100</wfw:commentRss><description>A new feature added to SQL Server 2005 for the sake of the windowing
functions is the OVER clause. Using this clause, you can specify
ordering or partitioning for the windowing functions. For instance, to
enumerate the names of all of the products in the AdventureWorks
database that have a list price, along with their list prices and the
rank of those prices compared to all of the other prices, the following
query can now be used:
&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;SELECT&lt;br&gt;	P.Name,&lt;br&gt;	P.ListPrice,&lt;br&gt;	DENSE_RANK() OVER (ORDER BY P.ListPrice DESC) AS PriceRank&lt;br&gt;FROM Production.Product P&lt;br&gt;WHERE &lt;br&gt;	ListPrice &amp;gt; 0&lt;br&gt;ORDER BY &lt;br&gt;	P.Name ASC&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;Name			List Price	PriceRank&lt;br&gt;-------------------------------------------------&lt;br&gt;All-Purpose Bike Stand	159.0000	44&lt;br&gt;AWC Logo Cap		8.9900		98&lt;br&gt;Bike Wash - Dissolver	7.9500		99&lt;br&gt;Cable Lock		25.0000		88&lt;br&gt;Chain			20.2400		93&lt;br&gt;Classic Vest, L		63.5000		66&lt;br&gt;Classic Vest, M		63.5000		66&lt;br&gt;Classic Vest, S		63.5000		66&lt;br&gt;Fender Set - Mountain	21.9800		91&lt;br&gt;Front Brakes		106.5000	55&lt;br&gt;Front Derailleur	91.4900		59&lt;br&gt;...&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;So what does this tell us? All-Purpose Bike Stand is the 44th most
expensive item sold by AdventureWorks. AWC Logo Cap is the 98th most
expensive item. And the Vests are tied for 66th most expensive. Which
is why DENSE_RANK was used for this example! But really, this example
is only here to demonstrate one use of the OVER clause. And this post
isn't about windowing functions or rankings at all. That's another post
for another day.
&lt;/p&gt;&lt;p&gt;
What this post is about is &lt;i&gt;normal, non-windowing aggregate functions&lt;/i&gt;.  Like SUM().  It turns out that the OVER clause can be used for them, too!
&lt;/p&gt;&lt;p&gt;Pretend that you're an employee of AdventureWorks and your
manager comes to you with a request: Write a query to return all of the
products, their prices, their subcategories, and the average price for
all products in the subcategory that any given product belongs to...
Why? Perhaps the manager wants to re-categorize products based on
whether they fall, percentage-wise, close to the same average price. Or
maybe it just makes a good contrived example for showing this feature!
Regardless...
&lt;/p&gt;&lt;p&gt;
Here's how you can solve this in SQL Server 2000:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;SELECT&lt;br&gt;	P.Name AS ProductName,&lt;br&gt;	P.ListPrice,&lt;br&gt;	PS.Name AS ProductSubCategoryName,&lt;br&gt;	x.AveragePrice&lt;br&gt;FROM Production.Product P&lt;br&gt;JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID&lt;br&gt;JOIN&lt;br&gt;(&lt;br&gt;	SELECT &lt;br&gt;		P2.ProductSubCategoryID,&lt;br&gt;		AVG(P2.ListPrice) AS AveragePrice&lt;br&gt;	FROM Production.Product P2&lt;br&gt;	WHERE &lt;br&gt;		P2.ProductSubCategoryID IS NOT NULL&lt;br&gt;	GROUP BY &lt;br&gt;		P2.ProductSubCategoryID&lt;br&gt;) x ON x.ProductSubCategoryID = P.ProductSubCategoryId&lt;br&gt;ORDER BY &lt;br&gt;	P.Name&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;I don't know about you (since I have no clue who you are), but I
personally have a difficult time reading this. If I came back to this
query in six months, it would take me a few minutes to figue out what
was going on. And doesn't it feel like there should be a more efficient
way of expressing it?
&lt;/p&gt;&lt;p&gt;
...Well, now there is...
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;SELECT&lt;br&gt;	P.Name AS ProductName,&lt;br&gt;	P.ListPrice,&lt;br&gt;	PS.Name AS ProductSubCategoryName,&lt;br&gt;	AVG(P.ListPrice) OVER (PARTITION BY P.ProductSubCategoryID)&lt;br&gt;FROM Production.Product P&lt;br&gt;JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID&lt;br&gt;ORDER BY &lt;br&gt;	P.Name&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;So what's going on here? Under the covers, SQL Server builds a
subquery for the average, based on the partitioning column of the OVER
clause -- which in this case is ProductSubCategoryID. It's a little bit
less efficient in this case than the derived table approach, but a lot
cleaner from a readability standpoint. Personally, I think it's a
really cool feature, although I don't honestly see myself using it too
often.
&lt;/p&gt;&lt;p&gt;
More ways to &lt;i&gt;express yourself&lt;/i&gt; using SQL Server 2005.  Madonna would be proud.&lt;/p&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=100" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category></item></channel></rss>