<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://www2.sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>T-SQL User-Defined Functions: the good, the bad, and the ugly (part 2)</title><link>http://www2.sqlblog.com/blogs/hugo_kornelis/archive/2012/05/26/t-sql-user-defined-functions-the-good-the-bad-and-the-ugly-part-2.aspx</link><description>In a previous blog post , I demonstrated just how much you can hurt your performance by encapsulating expressions and computations in a user-defined function (UDF). I focused on scalar functions that didn’t include any data access. In this post, I will</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>re: T-SQL User-Defined Functions: the good, the bad, and the ugly (part 2)</title><link>http://www2.sqlblog.com/blogs/hugo_kornelis/archive/2012/05/26/t-sql-user-defined-functions-the-good-the-bad-and-the-ugly-part-2.aspx#43600</link><pubDate>Sun, 27 May 2012 05:32:10 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:43600</guid><dc:creator>Paul White</dc:creator><description>&lt;p&gt;Hi Hugo,&lt;/p&gt;
&lt;p&gt;I realise the main point of this post is to show the poor performance characteristics of scalar UDFs that access data (and associated costing limitations) but you expanded the discussion to optimizer stuff, so I have some comments... :)&lt;/p&gt;
&lt;p&gt;The small sizes of your example tables do not prompt a full effort from the optimizer. &amp;nbsp;Faced with relatively small costs for obvious plans, it terminates its search early with Good Enough Plan Found, which is a good thing in general.&lt;/p&gt;
&lt;p&gt;The point about trying different syntax (while being careful about semantics!) is a good one. &amp;nbsp;The optimizer will never include all possible logical transformations, just a good set that produce benefits for a wide range of common queries (trading ultimate plan quality for compilation time).&lt;/p&gt;
&lt;p&gt;On that note, a useful and natural way to express the query requirement is:&lt;/p&gt;
&lt;p&gt;SELECT&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;MAX(t.Triple) &lt;/p&gt;
&lt;p&gt;FROM dbo.Triples AS t&lt;/p&gt;
&lt;p&gt;WHERE EXISTS&lt;/p&gt;
&lt;p&gt;(&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;SELECT 1&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;FROM dbo.LargeTable AS lt &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;WHERE &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;lt.DataVal = t.Value&lt;/p&gt;
&lt;p&gt;);&lt;/p&gt;
&lt;p&gt;This produces a plan with the pre-aggregation you are looking for, but we can do even better by adding the obvious index on LargeTable (DataVal):&lt;/p&gt;
&lt;p&gt;CREATE NONCLUSTERED INDEX nc1 ON dbo.LargeTable (DataVal);&lt;/p&gt;
&lt;p&gt;That gives us a plan with no aggregation at all, just two seeks, a semi-join, and a stream aggregate. &amp;nbsp;The estimated cost is 0.0305208, compared with 0.296198 for the manual rewrite using DISTINCT.&lt;/p&gt;
&lt;p&gt;Though not immediately important in the specific case you show, we could also help the optimizer by creating a UNIQUE constraint on Triples (Triple), and enforce a FOREIGN KEY relationship from LargeTable (DataVal) to Triples (Value).&lt;/p&gt;
&lt;p&gt;I generally prefer APPLY to subqueries (a sub-SELECT in an outer SELECT clause). &amp;nbsp;APPLY tends to optimize better in my experience, especially when useful FOREIGN KEY and NOT NULL constraints are present.&lt;/p&gt;
&lt;p&gt;Paul&lt;/p&gt;
</description></item></channel></rss>