SQLblog.com - The SQL Server blog spot on the web

Adam Machanic

Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.

  • TechEd North America 2014 (Houston) - Demos - DBI-B489 and DBI-B333

    Just a quick note to thank everyone who attended both of my sessions at this year's TechEd show in Houston!

    Attached are demos for both the query tuning and window functions talks.

    Enjoy, and let me know if you have any questions.


  • SQL Saturday Chicago (#291) - Clash of the Row Goals - Demos

    Thank you to everyone who joined me today at SQL Saturday Chicago for my "Clash of the Row Goals" session. I'm honored that so many people would attend a talk with such an incredibly arcane title -- and always glad to find a curious, interested, and intelligent audience.

    The demos I showed during the session are attached to this post. If you have any followup questions, feel free to leave a comment below.

    Enjoy!

  • SQL Saturday Chicago...and the Gentle Art of Planscaping!

    Three weeks from now I'll be visiting Chicago to participate in SQL Saturday. I'll offer two different learning opportunities:

    On Friday, April 25, you can join me for a full day seminar on parallelism and performance in SQL Server. This class covers in-depth background on the hows and whys behind SQL Server's parallel processing, followed by an investigation of advanced techniques that you can use to improve performance of some queries by an order of magnitude or more. The class is filling up, so make sure to register soon!

    On Saturday, April 26, I'll deliver my newest talk, "Clash of the Row Goals." I unveiled it the first time this past weekend at SQL Saturday Boston, and Joe Chang happened to blog about it a few hours ago. In this talk you'll learn how to shape and modify query plans by leveraging an understanding of query optimizer internals -- without using query hints. I've decided to call this technique planscaping. The idea is to create beautiful query plans that perform well and don't break at unexpected times. That's a good thing, right?

    Hope to see you in Chicago!


  • Demos: SQL Saturday Boston (#262) - Clash of the Row Goals!

    Thanks to everyone who joined me yesterday afternoon in Cambridge for my Clash of the Row Goals talk. It was great to receive such a warm reception for a rather out-there topic.

    Three things I should clarify for those who attended:

    • The Many() function appeared to create an invalid estimate in one case, but that was due to auto-parameterization (which one of the attendees in the front row correctly called out at the time, though I'm not sure everyone heard her)
    • Regarding the demo that failed at the end: As it turns out -- it didn't! I was a bit under-caffeinated and misunderstood my own notes. I've updated the scripts to make things clearer.
    • I blamed the demo failure, jokingly, on Azure. Turns out at least one attendee took me seriously; it showed up on a feedback card. Fact is, I was running the entire show from my laptop. My head might have been in the clouds, but my demos were strictly on terra firma.

    The demo scripts are attached. Enjoy, and let me know if you have any questions or comments. I'd especially love to hear how it goes if you decide to try any of these techniques on one of your problem queries.

  • Query Performance and Parallelism Seminars in Boston and Chicago

    You're a DBA or database developer working on a big (or even big-ish -- doesn't have to be huge) database. Your end-users are doing all sorts of data analysis, and even though you have a pretty beefy server, your queries just don't seem to run as fast as they should.

    The reason (I bet)? Your queries aren't taking full advantage of the CPU power available in today's servers. SQL Server has powerful parallel processing capabilities, but left to its own devices (and defaults) you're never going to see the full potential. Luckily, this can be fixed. You can make your queries much faster. (Often an order of magnitude or more!) And I can show you how.

    Join me in Boston or in Chicago, the Friday before the upcoming SQL Saturday events in those cities, and we'll go deep into parallel processing. You'll learn about Windows Server and how it deals with processes and threads, about SQLOS and its thread scheduling internals, how the SQL Server query processor works, and how you can use all of this knowledge to make your queries run faster and your server behave more consistently, even under load.

    These full-day seminars cost only $150 and include lunch, coffee, and a printed course book -- plus a full day of intense content.

    Full information, including an outline and registration details, is available at the links below:

    Hope to see you there!

  • SQLRally Nordic and SQLRally Amsterdam: Wrap Up and Demos

    First and foremost: Huge thanks, and huge apologies, to everyone who attended my sessions at these events. I promised to post materials last week, and there is no good excuse for tardiness. My dog did not eat my computer. I don't have a dog. And if I did, she would far prefer a nice rib eye to a hard chunk of plastic.

    Now, on to the purpose of this post...

    Last week I was lucky enough to have a first visit to each of two amazing cities, Stockholm and Amsterdam. Both cities, as mentioned previously on my blog, hosted SQLRally events, and I did a precon plus two talks at each event.

    The events in both cities were well done and the audiences seemed quite sharp. Very nice events at which to be a speaker! I hope to return to both areas very, very soon. (Hint, hint, event organizers!)

    Precon attendees: You should already know where to get your slides and demos.

    Breakout session attendees: Find the demos for both of my talks attached here.

    Thanks again to everyone I met and/or taught. It was really a fantastic week. Skål and/or proost -- depending on your disposition!

  • Query Performance Sessions in Stockholm and Amsterdam

    As previously mentioned, I'll be delivering my full-day "No More Guessing" performance troubleshooting seminar at both the Stockholm and Amsterdam SQLRally events. In addition to the seminar, I'll be doing two breakout sessions at each event.

    It's going to be a busy week! But luckily for me, I'm doing the same things in each city:


    Using Your Brain to Beat SQL Server

    This session is a joint effort with one of the world's greatest SQL superheroes, Thomas Kejser. The session is all about query optimizer deficiencies and how to win the battle for excellent performance by applying your own intelligence instead of blindly trusting that SQL Server will do the right thing. I think this is going to be a really fun and useful session.


    Data, Faster: SQL Server Performance Techniques with SQLCLR

    Remember when SQLCLR was first put into SQL Server and everyone was wondering what they should use it for? I found my answer: raw performance. In this session I'll teach you a SQLCLR pattern that can help reduce the run time of your heaviest and most logic-filled queries, often by an order of magnitude or more.


    Looking forward to seeing you in Europe next month!

  • Hey Amsterdam and Stockholm: Just Say No to Guessing About Performance!

    Have SQL Server performance problems? What are you going to look at? Can you quickly isolate the root cause? Or do you ever find yourself stuck, not sure of what to do next?

    That's what the No More Guessing seminar is all about.

    This seminar teaches you to use:

    • Baselining: an analytical methodology based on statistical analysis and evidence rather than graphs and your eyes.
    • Performance counters: the right set of counters to get you where you need to be without being overwhelmed.
    • Wait stats: what waits really mean, plus an extremely in-depth overview of the 15 most common waits you will encounter on real production systems.
    • DMVs: the 15 DMVs you can use to find out why things are slow on your server right now.
    • Query plans: the top things to look for in your biggest, most confusing plans so that you can quickly find the culprit. And, naturally, how to fix the problem.

    All of these techniques will work with any version of SQL Server from 2005 onward, although 2008 and beyond are recommended. (The seminar covers some especially useful features added in both 2008 and 2012. As soon as 2014 is ready, I look forward to adding some really cool stuff from there as well. SQL Server 2005 just isn't as exciting as it once was.)

    This November I am pleased to be teaching this seminar twice in Europe, at the two SQLRally events:

    I'm also delivering two breakout sessions at each show. More information on those soon.

    Hope to see you in Europe!

  • Next-Level Parallel Plan Forcing: An Alternative to 8649

    "Even experts with decades of SQL Server experience and detailed internal knowledge will want to be careful with this trace flag.  I cannot recommend you use it directly in production unless advised by Microsoft, but you might like to use it on a test system as an extreme last resort, perhaps to generate a plan guide or USE PLAN hint for use in production (after careful review)."

    So wrote Paul White in his often referenced article, "Forcing a Parallel Query Execution Plan." His article focuses on the various reasons that you might not get a parallel query plan, chief among them the optimizer simply not doing a great job with its own cost model. (My session from the 2012 PASS conference, available on YouTube, also discusses the issue in some detail, and from a different perspective. You might want to both watch it and read Paul's article, prior to reading any further here.)

    The trace flag Paul mentions, 8649, is incredibly useful. It allows us to tell the query optimizer to disregard its cost-based comparison of potential serial and parallel versions of our plans, thereby skipping right to the good stuff: a parallel plan (when it's possible to generate one). Alas, the flag is undocumented, little known, and its full impact is difficult to understand. And for those cowboys who are willing to take the plunge and use it in a production environment, there are still other issues lying in wait:

    • This flag is not something you want to enable on a system-wide basis. You can use DBCC TRACEON on a case-by-case basis, but as Paul shows in his post it's much nicer to use the QUERYTRACEON hint; both are sketched just after this list. Unfortunately, this hint was, just like the flag, entirely undocumented until quite recently. It is now documented for only a very small number of flags; 8649 isn't one of them.
    • Just like DBCC TRACEON, QUERYTRACEON requires significant system-level privileges: system administrator, to be exact. This means that you're forced to either give that permission to all of your users, or do some module signing. Clearly the latter approach is far superior, but it's still a pain.
    • QUERYTRACEON, being a query-level hint, can't be encapsulated in a view or inline table-valued function. So if we want to force these to go parallel, we're pretty much out of luck.
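    For reference, here's roughly what those two options look like in practice (the table name here is hypothetical, and since the flag is undocumented the usual caveats apply):

    -- Session-scoped: affects every statement on this connection until turned off
    DBCC TRACEON(8649);
    SELECT t.Col1 FROM dbo.SomeTable AS t;
    DBCC TRACEOFF(8649);

    -- Statement-scoped: tidier, but requires sysadmin-level permission
    SELECT t.Col1
    FROM dbo.SomeTable AS t
    OPTION (QUERYTRACEON 8649);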

    Clearly, while invaluable for testing, 8649 just isn't the parallel panacea we need.

    Before we get into the details of the solution I'll be presenting here, it’s probably a good idea to do a quick review of the situation.

    Here’s the problem, in brief:

    [Screenshot: expensive_plan_no_parallel]

    Big plan, no parallelism. Ouch. Or maybe I should say that more slowly. OOOOOOOOuuuuuuuucccccccchhhhhhhh. Because you’re going to be waiting for quite a while for this plan to finish.

    Why did this happen? We can find out by forcing a parallel version of the plan using 8649 and evaluating the metadata.

    [Screenshot: parallel_cost]

    When the query optimizer decides whether or not to select a parallel plan, it first comes up with a serial plan cost, then compares it to the parallel plan cost. If the parallel plan cost is lower, that's the one that gets used. This parallel plan cost is based on the serial cost, but with a few key modifications:

    • CPU costs for each iterator are divided by the degree of parallelism...except when they sit beneath a parallel nested loop -- unless the input to that nested loop is guaranteed to be exactly one row, which is a very special parallelism edge case. Notice the nested loop iterator above? Strike one. (Note: I/O costs aren't divided by anything.)
    • Sometimes, the parallel plan will have slightly different iterators than the serial version, due to the optimizer doing its job. Notice, above, that the iterator feeding the nested loop has transitioned from a Merge Join to a Hash Match? That’s going to change the cost. In this case, it’s more expensive. Strike two.
    • Parallel iterators are added into the plan as necessary. See that Gather Streams? It’s not free. Strike three.

    Adding all of these modifications together, our parallel plan has a higher cost than our serial plan. Cost is the query optimizer's way of weighing the relative merits of one plan against another, and just as you might choose the less expensive option at a store, so has the query optimizer. A plan whose cost is 10 lower than another's must perform better, right? Well...no. Not in the real world, and especially not when we're dealing with plans that have estimated costs in the millions.
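    To put toy numbers on that math (these are made-up figures, not real costing output): suppose a serial plan is costed at 800 CPU + 200 I/O = 1,000. In the ideal case, at DOP 8 the CPU portion becomes 800 / 8 = 100 while the I/O portion stays at 200, for a parallel cost of 300 -- an easy win. But if most of that CPU cost sits beneath a parallel nested loop it isn't divided at all, and once the exchange iterators and a possibly pricier join are added in, the parallel total can easily land above 1,000. The serial plan "wins," and we lose.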

    Unfortunately, as I mention in the video linked above, the costing model is rather broken. And the workaround I suggest in the video--using a TOP with a variable in conjunction with OPTIMIZE FOR--can work, but it has some problems. The biggest issue, as far as I’m concerned? It requires use of a local variable. Which, just like 8649, means that it can’t be used in a view or inline TVF.
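    For those who haven't seen the video, a sketch of that kind of workaround follows (hypothetical table names, and not necessarily identical to what I show on screen). Because local variables aren't sniffed at compile time, OPTIMIZE FOR controls the estimate for the TOP, while the variable's actual value lets every row through at run time:

    DECLARE @i BIGINT = 9223372036854775807;

    SELECT TOP(@i)
        a.Col1,
        b.Col2
    FROM TableA AS a
    INNER JOIN TableB AS b ON
        a.PK = b.PK
    OPTION (OPTIMIZE FOR (@i = 9223372036854775807));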

    So what’s a SQL developer to do?

    Recently it hit me. If I could only create a query fragment that had certain properties, I could apply it as needed, just like 8649. Here’s what I set out to create:

    • High CPU cost, low I/O cost. This is key. The query fragment had to be able to benefit from the query optimizer’s math.
    • No variables. See above.
    • A single, self-contained unit. No tables or other outside objects. As much as possible, avoidance of future maintenance issues seemed like a good approach.
    • No impact on estimates in the core parts of the plan. This query fragment had to impact plan selection purely on the basis of cost, and without causing the optimizer to make strange choices about join types, execution order, and so on.

    After quite a bit of trial and error I arrived at my solution, which I encapsulated into the following table-valued function:

    CREATE FUNCTION dbo.make_parallel()
    RETURNS TABLE AS
    RETURN
    (
        WITH
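        -- [a]: 1,024 constant rows from an inline table value constructor;
        -- self-contained, so there are no outside objects to maintain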
        a(x) AS
        (
            SELECT
                a0.*
            FROM
            (
                VALUES
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
                    (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)
            ) AS a0(x)
        ),
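        -- [b]: four copies of [a] cross-joined -- 1,024^4 rows as far as the
        -- optimizer can tell. At run time the predicate below is never true
        -- (every x is 1, and 1 % 2 = 1), so only 1,024 rows are ever touched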
        b(x) AS
        (
            SELECT TOP(9223372036854775807)
                1
            FROM
                a AS a1,
                a AS a2,
                a AS a3,
                a AS a4
            WHERE
                a1.x % 2 = 0
        )
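        -- Final aggregate: SUM with no GROUP BY guarantees at most one output
        -- row (the one-row nested loop edge case); the HAVING clause keeps the
        -- optimizer from assuming the result without costing the full tree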
        SELECT
            SUM(b1.x) AS x
        FROM
            b AS b1
        HAVING
            SUM(b1.x) IS NULL
    )
    GO

    What this does: The function starts with a set of 1,024 rows, defined in the table value constructor in CTE [a]. Four instances of this set are then cross-joined, producing a Cartesian product of 1,099,511,627,776 rows, all of which are forced to pass through a Top iterator as well as a Stream Aggregate. This is, naturally, a hugely expensive operation that generates a very high estimated cost.

    Except that in reality there are only 1024 rows. Notice the predicate in CTE [b]: (a1.x % 2 = 0). If you've studied a bit of math you know that 1 divided by 2 has a remainder of 1, not 0. But luckily for us, the query optimizer has no way of evaluating that at compile time. It instead asks its sibling, the query processor, to do that work. The plan involves scanning each of the four cross joined sets for a match, but of course no match is ever found. And since cross joining an empty set to any other set results in an empty set, the query processor has no need to scan any but the first set of rows it encounters. So at run time 1024 rows are touched, and that's that.

    Here's the catch (and it's a good one, for us): Since the query optimizer doesn't know that our predicate will never return true, it also doesn't know how many rows will actually be touched. And it's forced to assume the worst, that the full cross product will be processed. Therefore, this function delivers a rather large cost estimate of just under 1,000,000. (Serial cost; parallel will be less, due to the aforementioned math.) This number, as a cost goal, is somewhat arbitrary. I originally came up with a function that delivered a cost in the billions, but it added a lot of complexity to the query plans I was working with. So I scaled it back a bit. 1,000,000 should be fine in just about every case. The example above is fairly typical; I usually see these cost problems crop up with very small relative differentials between the serial and parallel plan cost. The really important thing is that 100% of the estimated cost from this function is CPU time. That means that we can take full advantage of the way the optimizer works.

    Of course, almost none of the cost is real. This UDF will add a couple of dozen milliseconds to your plan. It will also add around 100ms to the compile time. In my opinion, that doesn’t matter. If you’re playing in the big parallel workload arena you’re not doing thousands of batch requests a second. You’re trying to get your two hour query down to a reasonable amount of time. No one is going to care about a dozen milliseconds.

    This function is also engineered so that the output number of rows is guaranteed to be no greater than one. See that SUM, with no GROUP BY? The query optimizer knows that that can’t return more than one row. And that’s a good thing. It gives us that parallelism edge case I mentioned above. (Nested loop with an input guaranteed to be exactly one row.) Another thing? No rows will ever actually be aggregated in that SUM. Its result will always be NULL. But the query optimizer has no way of knowing that, and it comes up with a plan where the entire backing tree needs to be evaluated. That’s why the HAVING clause is there.

    Using this function is quite simple. You take your query, wrap it in a derived table expression, and CROSS APPLY into it. No correlation required. Let’s pretend that the query we want to force parallel looks something like this:

    SELECT
        a.Col1,
        b.Col2
    FROM TableA AS a
    INNER JOIN TableB AS b ON
        a.PK = b.PK

    Using the make_parallel function to force this to go parallel is as simple as:

    SELECT
        x.*
    FROM dbo.make_parallel() AS mp
    CROSS APPLY
    (
        SELECT
            a.Col1,
            b.Col2
        FROM TableA AS a
        INNER JOIN TableB AS b ON
            a.PK = b.PK
    ) AS x

    The reason we CROSS APPLY from the function into the query is to keep the high cost on the outer side of any parallel nested loops. This way, the query optimizer's parallelism math will work its magic the right way, yielding the parallel plan we're after. CROSS APPLY in this case -- uncorrelated -- can only be optimized as a nested loop itself, and that loop only makes sense if there is at least one row feeding it. Therefore, this query is logically forced to process the TVF first, followed by the inside of table expression [x].

    Note that just as with trace flag 8649, there are a number of parallel inhibitors that are still going to keep this from working. And note that unlike when using the trace flag, the base cost of just under 1,000,000 means that even if you have predicates in certain cases that make a parallel plan less than ideal, you’re still going to get a parallel plan. Using this function is effectively applying a big huge hammer to a problematic nail. Use it with caution, make sure it’s appropriate, and don't bash your thumb.

    So what does a query plan look like, once this has been applied? Here’s the same query from the screen shots above, with the function in play:

    [Screenshot: now_parallel]

    Hopefully (without squinting too much) you can see the key attributes: There is a new subtree on top of the original query. That’s thanks to the TVF. This subtree is evaluated, and feeds the top Distribute Streams iterator, which then feeds the Nested Loops iterator. Since the input to that Distribute Streams is guaranteed to be one row, it uses Broadcast partitioning. But there is no correlation here, so there is really nothing to broadcast; the net effect of this action is to prepare a set of threads to do the work of processing your query in parallel.

    Under the Nested Loops? The same exact plan shape that was produced when I used the trace flag. Same estimates, same iterators, and so on and so forth. The UDF necessarily impacts the plan as a whole, but it works as intended and does not impact any of the parts of the plan we actually care about -- the parts we need to make faster in order to get data to our end users.

    I’m really excited about this technique. It means that I can play fewer games, worry less about privileges and the potentially negative impact of using undocumented hints, and concentrate more on the data I need the query to return than the physical manner in which it’s being processed.

    Bear in mind that there are myriad other problems with SQL Server’s parallel processing capabilities. This technique merely solves one of them, and in all honesty it’s neither the cleanest nor best solution that I can imagine. It’s a start. My hope is that these issues will eventually be addressed by the query optimizer and query processor teams. In the meantime, we as end-users have no choice but to continue to push the product in an effort to get the best possible performance, today, for the queries that we need to run today. Tomorrow, perhaps, will hold a better story.

    Disclaimer: This technique works for me. It may not work for you. It may cause some problem that I haven't yet considered. It may cause your server to literally explode, sending fragments of plastic and metal all over your data center. The function and technique are provided as-is, with no guarantees or warranties, and I take no responsibility for what you do with them. That said, if you DO try this out I'd love to hear about your experience, in the comments section below!

    Enjoy!

    Special thanks to Gokhan Varol for helping me test this technique.

  • This November, Join Me in Stockholm and Amsterdam

    Late last year, I was invited by Raoul Illyés, a SQL Server MVP from Denmark, to present a precon at the 2013 edition of SQLRally Nordic. I agreed and decided to skip the US PASS Summit this year and instead visit an area of Europe I've never seen before. A bonus came a while later when I learned that there is another SQLRally in Europe that same week: SQLRally Amsterdam. Things worked out in just the right way and today I'm happy to announce that I'll be speaking at both events, back-to-back. Should be quite a week!

    SQLRally Nordic will take place in Stockholm, Sweden, November 4-6.

    SQLRally Amsterdam will be in Amsterdam, naturally. November 6-8.

    Yes, there is a one-day overlap between these two events, and I'll have to leave Sweden a day early to deliver my seminar in Amsterdam. First world problems.

    At both events I'll be delivering my popular No More Guessing! seminar, which teaches a solid and proven performance troubleshooting methodology. I'll be adding some new content and refining some things; watch this space in the coming weeks and months for an updated outline and some information on the newer material.

    Both events are now open for early registration, and both events have an open call for speakers. If you're in Europe, or would like to visit, I highly recommend taking advantage of one or both of these options!

    Look forward to meeting many of you at these events. See you in November!

  • TechEd 2013: Demos for "Data, Faster: Microsoft SQL Server Performance Techniques with SQLCLR"

    Today at TechEd in New Orleans I delivered a breakout session entitled "Data, Faster: Microsoft SQL Server Performance Techniques with SQLCLR."

    This session covered a number of techniques for using SQLCLR as a query tuning tool, especially for big, ugly, and heavily analytical queries.

    Thank you so much to all who attended and took the time to evaluate the session. For those of you who weren't there, the video will (apparently) be posted soon at the link above.

    The full set of demo material is attached to this post. Leave a comment if you have any questions for me.

    Enjoy, and may your queries be ultra-speedy!

  • SQL Saturday #220 (Atlanta): Demos

    Today at SQL Saturday #220 in Atlanta I presented a brand new session, "SQL Server Query Plan Analysis: The 5 Culprits That Cause 95% of Your Performance Headaches."

    This session is designed to help people quickly analyze query plans and find likely culprits without being query tuning experts; I find that in a huge number of cases the root cause of problems is one of just a few potential situations.

    Thanks to everyone who joined me today for both deliveries! In addition to being a new session, this was also the first time I've ever been asked to present the same session twice in a single day. So it was quite an experience.

    The demo script is attached. As always, let me know if you have any questions or comments.

    Enjoy!

  • INSERT SELECT is Broken. Ask Microsoft to Fix It.

    Imagine that you're moving thousands or millions of rows between two tables. Maybe it's between a staging table and a data warehouse in an ETL process. Maybe you're manipulating some data via a temp table as you're preparing lookup data for your OLTP system. Maybe you're preparing data for an end user.

    It doesn't really matter what your use case is, because there are so many of them. We, as database developers, spend all day moving data back and forth.

    Unfortunately, sometimes our processes break. It's just a fact of life. Sometimes a bad row will sneak in and violate a constraint, or won't be implicitly convertible to the target data type, or whatever. It happens all the time, and when it does, debugging the problem is excruciatingly painful and overly time-consuming.

    The error messages aren't great, it's often difficult to identify the problem row, and so we wind up manually editing out parts of code and slogging through row after row hoping to find the issue.

    When this happens, the entire INSERT, or UPDATE, or whatever, breaks -- even if we would have really liked every row except the single bad one to get through.
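    A minimal sketch of the all-or-nothing behavior (hypothetical temp tables; the exact message text depends on the types involved):

    CREATE TABLE #staging (x INT NOT NULL);
    CREATE TABLE #target (x TINYINT NOT NULL);

    INSERT #staging (x)
    VALUES (1), (2), (300);  -- 300 won't fit in a TINYINT

    INSERT #target (x)
    SELECT s.x
    FROM #staging AS s;
    -- Arithmetic overflow: the entire statement fails, zero rows are
    -- inserted, and nothing in the error identifies the offending row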

    Wouldn't it be nice if there were a better way to tackle this problem?

    Someone named Dan Holmes thought up the perfect solution, and posted a Connect item.

    Microsoft, in its infinite wisdom, has responded that this scenario is "not common enough," and has stated that it will be closing the issue. Let's not let that happen.

    If you agree that this is a major concern and that this feature would help you, please click on the link above and vote. Leave a comment, too.

    Thank you.


    Note: This doesn't just apply to INSERT SELECT, but I wanted to get your attention. And if you're reading this, it worked!

  • SQLCLR Performance Session at TechEd US

    I am super-excited to visit New Orleans next month for Microsoft TechEd; it will be my sixth time speaking at the show.

    My session takes an in-depth look at some of the techniques I've developed for using SQLCLR modules -- and some of the great performance gains I've been able to achieve.

    Hope to see you in NOLA! If you're not attending the show, the video will be available on demand a few days after I give the talk.

  • More Fun in Atlanta: Parallelism at SQL Saturday 220

    On May 18, SQL Saturday returns yet again to the Atlanta area. At this point I've become a bit of a regular at Atlanta's events; this will be my third one in a row. The team that puts them together is amazing, and produces top quality, super fun and educational days every time. Plus: Taco Mac.

    On Friday, May 17, the event is running a few pre-conference seminars, and I'll be delivering one focused on parallelism in SQL Server. This is an updated version of the seminar I delivered at the 2010 PASS conference; you can read Kendra Little's review of that day on her personal blog, littlekendra.com.

    To get full information on the Atlanta seminar, visit the EventBrite page for the event: http://surfmulticore.eventbrite.com/

    Any questions? Let me know in the comment section.

    Hope to see you there!
