<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://www2.sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Adam Machanic : T-SQL, Architecture</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/Architecture/default.aspx</link><description>Tags: T-SQL, Architecture</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>Stored procedures are not parameterized views</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/stored-procedures-are-not-parameterized-views.aspx</link><pubDate>Thu, 13 Jul 2006 01:54:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:109</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>6</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/109.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=109</wfw:commentRss><description>Peter van Ooijen over at CodeBetter.com &lt;a href="http://codebetter.com/blogs/peter.van.ooijen/archive/2006/05/29/145697.aspx"&gt;posted in his blog about some observations he had when working with stored procedures in a recent project&lt;/a&gt;.
What I found to be interesting about his post was his comment that a
stored procedure can be, "a view with parameters."&amp;nbsp; I've run into this
assertion before, and it's something I think needs some clarification
for a lot of developers. I do not feel that there is any real
similarity between stored procedures and views -- they are entirely
different types of objects in an SQL database, and should not be
considered forms of one another in any way. &lt;br&gt;&lt;br&gt;Following is an edited version of the response I left in Peter's blog; I thought it warranted its own post:&lt;br&gt;&lt;br&gt;Stored procedures are not -- and never can be -- "parameterized views". &amp;nbsp;A view 
in an SQL database can be treated the same as a table in virtually every 
context.&amp;nbsp; Consider:&lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;font-family:Courier New;"&gt;SELECT * &lt;br&gt;FROM Tbl &lt;br&gt;&lt;/div&gt;&lt;br&gt;vs. &lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;font-family:Courier New;"&gt;SELECT * &lt;br&gt;FROM 
View &lt;br&gt;&lt;/div&gt;&lt;br&gt;One
of the great things about working with views and tables is that the
person querying the database does not need to know whether the base
object being queried is a view or a table. For the sake of writing SQL
queries, they are one and the same. Both a view and a table have
well-defined columns, with well-defined datatypes.&lt;br&gt;&lt;br&gt;These
assertions cannot be made for a stored procedure as compared with a
view.&amp;nbsp; A stored procedure is related to a view only in as much as both
are defined using SQL syntax. But beyond there, the two diverge into
completely different types of entities. First of all, consider:&lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;"&gt;&lt;span style="font-family:Courier New;"&gt;SELECT * 
&lt;/span&gt;&lt;br style="font-family:Courier New;"&gt;&lt;span style="font-family:Courier New;"&gt;FROM StoredProcedure &lt;/span&gt;&lt;br&gt;&lt;/div&gt;&lt;br&gt;This
will not work, and will only result in an "invalid object name"
exception. The reason? Stored procedures expose no explicit output
contract.&amp;nbsp; Thanks to conditional branching, dynamic SQL, and SELECT *,
a stored procedure can output vastly different results beween
invocations, or based on different input parameters. It is quite
possible to code a stored procedure that will output no result sets for
one set of input parameters, two result sets for another, and four for
another.&amp;nbsp; Or, it's possible to change the returned result sets, e.g. by
outputting different column names or datatypes. Please note, this is an
&lt;span style="font-style:italic;"&gt;extremely&lt;/span&gt; poor (and very
dangerous) coding habit to get into -- but the point is, it is
impossible to verify the output of a stored procedure for a given set
of input parameters without running it.&lt;br&gt;&lt;br&gt;Furthermore, a stored
procedure "late binds" to the base objects being queried.&amp;nbsp; This adds to
the difficulty in verifying the output of a stored procedure, and is
why you can create the following stored procedure without getting an
exception (until you try to run it, of course):&lt;br&gt;&lt;br&gt;&lt;div style="margin-left:40px;font-family:Courier New;"&gt;CREATE PROC XYZ &lt;br&gt;
AS &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT 
* &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM ThisTableDoesNotExist &lt;br&gt;
GO &lt;br&gt;&lt;/div&gt;
&lt;br&gt;These stored procedure behaviors are in stark 
contrast to the way views work.&amp;nbsp; Views provide a couple of means of verification: &lt;br&gt;&lt;ul&gt;&lt;li&gt;The output columns/data types can be verified, and bound to, before actually 
querying the view &lt;/li&gt;&lt;li&gt;A view can be "schema bound", meaning that the 
underlying base tables (or other views) which the view is based on cannot be 
changed, schema-wise, unless the view is dropped. &amp;nbsp; &lt;/li&gt;&lt;/ul&gt;For
the first point, simply query the INFORMATION_SCHEMA.COLUMNS or
sys.Columns views, and column information can be determined for a view
without having to query it.&lt;br&gt;&lt;br&gt;The second point adds to the first
in a few ways. Schema binding brings to views a certain sense of "early
binding," which as I mentioned is missing in stored procedures.
Although no view can be created if one of its base objects does not
exist, schema binding takes it one step further and guarantees that the
base objects used to create the view &lt;span style="font-style:italic;"&gt;must exist, and must not be changed&lt;/span&gt;,
for as long as the view is present in the database. This means that if
a schemabound view is created that outputs a certain set of columns
with certain datatypes, it is guaranteed to do so for as long as it
exists in the database -- in other words, its contract is bound to the
schema, and changes to other objects cannot affect it.&amp;nbsp; This is a
powerful guarantee, which stored procedures fail to make.&lt;br&gt;&lt;br&gt;So now
the question is, if a stored procedures isn't a parameterized view then
what is? The answer, as of SQL Server 2000 (and continuing in 2005), is
the table-valued UDF. &amp;nbsp;A table-valued UDF is parameterized, has an
explicit and verifyable output contract*, and can be schema bound. &amp;nbsp;If
you are looking to implement a solution that makes use of a form of
parameterized views, stored procedures are probably not the right
choice.&amp;nbsp; I think that table-valued UDFs are quite underused and deserve
a second (or first!) look from many T-SQL developers who may have
glossed over them in the past.&lt;br&gt;&lt;br&gt;*
Note: Unlike for a view, the column list for a table-valued UDF cannot
be queried from the INFORMATION_SCHEMA.COLUMNS table. The column list
is, however, available from sys.Columns.&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=109" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Architecture/default.aspx">Architecture</category></item><item><title>Rowset string concatenation: Which method is best?</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/rowset-string-concatenation-which-method-is-best.aspx</link><pubDate>Thu, 13 Jul 2006 01:15:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:77</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>25</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/77.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=77</wfw:commentRss><description>&lt;p&gt;Yeah, yeah, yeah, let's get this out of the way right from the start:
Don't concatenate rows into delimited strings in SQL Server. Do it
client side.
&lt;/p&gt;&lt;p&gt;
Except if you really have to create delimited strings in SQL Server.  In which case you should read on.
&lt;/p&gt;&lt;p&gt;
There was a little &lt;a href="http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=42415" target="#"&gt;discussion&lt;/a&gt; on &lt;a href="http://www.sqlteam.com/" target="#"&gt;SQLTeam&lt;/a&gt;
about the best way to concatenate. I recommended a scalar UDF solution,
whereas Rob Volk recommended a solution involving a temp table.
&lt;/p&gt;&lt;p&gt;I mentioned my dislike for the temp table solution for a couple
of reasons. First of all, it relies on a clustered index for ordering.
That will probably work in this example, but is not guaranteed to
always work and relying on indexes rather than ORDER BY for ordering is
definitely not a habit I want anyone to get into. The clustered index
as it was described in Rob's example also has another problem that I
didn't even notice until I was writing this entry. But I'll get to that
in a moment. The second reason I dislike the temp table is that I felt
it would be less efficient than the scalar UDF.
&lt;/p&gt;&lt;p&gt;
Rob didn't agree about the efficiency.  And so I set out to prove him wrong...
&lt;/p&gt;&lt;p&gt;
We'll use the Authors table in Pubs.  I want a comma-delimited list, per state, of the last name of each author who lives there.
&lt;/p&gt;&lt;p&gt;
First, the scalar UDF:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;USE pubs&lt;br&gt;GO&lt;br&gt;&lt;br&gt;CREATE FUNCTION dbo.ConcatAuthors(@State CHAR(2))&lt;br&gt;RETURNS VARCHAR(8000)&lt;br&gt;AS&lt;br&gt;BEGIN&lt;br&gt;	DECLARE @Output VARCHAR(8000)&lt;br&gt;	SET @Output = ''&lt;br&gt;&lt;br&gt;	SELECT @Output =	CASE @Output &lt;br&gt;				WHEN '' THEN au_lname &lt;br&gt;				ELSE @Output + ', ' + au_lname &lt;br&gt;				END&lt;br&gt;	FROM Authors&lt;br&gt;	WHERE State = @State&lt;br&gt;	ORDER BY au_lname&lt;br&gt;&lt;br&gt;	RETURN @Output&lt;br&gt;END&lt;br&gt;GO&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;
To find the list I want:
&lt;/p&gt;&lt;pre class="code"&gt;SELECT DISTINCT State, dbo.ConcatAuthors(State)&lt;br&gt;FROM Authors&lt;br&gt;ORDER BY State&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;... And the adaptation of Rob's temp table method... I did change
two things due to problems I discovered during testing. One, I've
altered the au_lname column to VARCHAR(8000); the column in the Authors
table is VARCHAR(40), not large enough for all of the California
authors. What if we were dealing with a much larger dataset? Second, I
added an IDENTITY column, and I'm clustering on that instead of the
actual data to get the ordering. I'm doing so because of the
VARCHAR(8000). Index rows can be a maximum of 900 bytes, so if we had
enough data to exceed that length, this method would fail.
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;CREATE TABLE #AuthorConcat&lt;br&gt;(&lt;br&gt;	State CHAR(2) NOT NULL,&lt;br&gt;	au_lname VARCHAR(8000) NOT NULL,&lt;br&gt;	Ident INT IDENTITY(1,1) NOT NULL PRIMARY KEY&lt;br&gt;)&lt;br&gt;&lt;br&gt;INSERT #AuthorConcat &lt;br&gt;(&lt;br&gt;	State,&lt;br&gt;	au_lname&lt;br&gt;)&lt;br&gt;SELECT&lt;br&gt;	State, &lt;br&gt;	au_lname&lt;br&gt;FROM Authors&lt;br&gt;ORDER BY &lt;br&gt;	State, &lt;br&gt;	au_lname&lt;br&gt;&lt;br&gt;DECLARE @Authors VARCHAR(8000)&lt;br&gt;SET @Authors = ''&lt;br&gt;DECLARE @State CHAR(2)&lt;br&gt;SET @State = ''&lt;br&gt;&lt;br&gt;UPDATE #AuthorConcat&lt;br&gt;SET @Authors = au_lname =	CASE &lt;br&gt;				WHEN @State = State THEN @Authors + ', ' + au_lname &lt;br&gt;				ELSE au_lname END,&lt;br&gt;	@State = State&lt;br&gt;&lt;br&gt;SELECT State, MAX(au_lname) &lt;br&gt;FROM #AuthorConcat&lt;br&gt;GROUP BY State&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;Clever, but more complex and harder to read than the scalar UDF
version. Output is identical, but that's not why we're here. Which one
is more efficient?
&lt;/p&gt;&lt;p&gt;
Drumroll, please...
&lt;/p&gt;&lt;p&gt;Results were tabulated using STATISTICS IO, STATISTICS TIME, and
Query Analyzer's Show Execution Plan. DBCC DROPCLEANBUFFERS and DBCC
FREEPROCCACHE were run before each test.
&lt;/p&gt;&lt;p&gt;
&lt;b&gt;Scalar UDF Method&lt;/b&gt;
&lt;br&gt;
Total cost: 0.0492
&lt;br&gt;
Total Scan count: 1
&lt;br&gt;
Total Logical reads: 2
&lt;br&gt;
Total Physical reads: 2
&lt;br&gt;
Total time: 25 ms
&lt;/p&gt;&lt;p&gt;
&lt;b&gt;Temp Table Method&lt;/b&gt;
&lt;br&gt;
Total cost: 0.2131
&lt;br&gt;
Total Scan count: 4
&lt;br&gt;
Total Logical reads: 9
&lt;br&gt;
Total Physical reads: 2
&lt;br&gt;
Total time: 88 ms
&lt;/p&gt;&lt;p&gt;So in conclusion, neither method is incredibly taxing with the
tiny Pubs dataset, but I think I have proven that the UDF is far more
efficient.
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;hr&gt;Update, February 28, 2005: Modified the adapation of Rob Volk's
method to use a CREATE TABLE instead of SELECT INTO, as the latter is
not necessarily guaranteed to insert rows in the right order for the
sake of this example. Thanks to "PW" on SQLServerCentral for pointing
this problem out. Note that this changed the total costs very slightly
-- for the better -- but the UDF still performs better by quite a large
margin.&lt;br&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=77" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Performance/default.aspx">Performance</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Architecture/default.aspx">Architecture</category></item><item><title>Caveats of the TEXT datatype</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2006/07/12/caveats-of-the-text-datatype.aspx</link><pubDate>Thu, 13 Jul 2006 01:02:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:72</guid><dc:creator>Adam Machanic</dc:creator><slash:comments>0</slash:comments><comments>http://www2.sqlblog.com/blogs/adam_machanic/comments/72.aspx</comments><wfw:commentRss>http://www2.sqlblog.com/blogs/adam_machanic/commentrss.aspx?PostID=72</wfw:commentRss><description>&lt;p&gt;Someone named "Krygim" posted the following question today in the microsoft.public.sqlserver.programming newsgroup:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;div class="quote"&gt;
"Will defaulting a TEXT field to an empty string take up more space
than defaulting it to Null when no data is entered into the field.
[sic]"
&lt;/div&gt;
&lt;p&gt;
Before reading any further, I ask that you think for a moment and consider what you think the answer should be...
&lt;/p&gt;&lt;p&gt;If you answered "yes", you're correct. An empty TEXT column will
take up more space than a NULL one -- quite a bit more space, as it
turns out. Here's the test code I posted to prove it:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;USE tempdb&lt;br&gt;&lt;br&gt;CREATE TABLE atable(acol TEXT NULL)&lt;br&gt;&lt;br&gt;INSERT atable(acol)&lt;br&gt;SELECT NULL&lt;br&gt;FROM 	pubs..authors a,&lt;br&gt;	pubs..authors b,&lt;br&gt;	pubs..authors c,&lt;br&gt;	pubs..authors d&lt;br&gt;&lt;br&gt;EXEC sp_spaceused 'atable'&lt;br&gt;&lt;br&gt;CREATE TABLE btable(acol TEXT NOT NULL)&lt;br&gt;&lt;br&gt;INSERT btable(acol)&lt;br&gt;SELECT ''&lt;br&gt;FROM	pubs..authors a,&lt;br&gt;	pubs..authors b,&lt;br&gt;	pubs..authors c,&lt;br&gt;	pubs..authors d&lt;br&gt;&lt;br&gt;EXEC sp_spaceused 'btable'&lt;br&gt;&lt;br&gt;DROP TABLE atable&lt;br&gt;DROP TABLE btable&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;Running this on my end tells me that the NULLs took up only 3 MB,
whereas the empty strings took up 33 MB -- 11 times more space used!
&lt;/p&gt;&lt;p&gt;
I did a quick scan of &lt;a href="http://www.amazon.com/exec/obidos/tg/detail/-/0735609985/qid=1098462012/sr=8-1/ref=sr_8_xs_ap_i1_xgl14/102-1036834-3046505?v=glance&amp;amp;s=books&amp;amp;n=507846" target="#"&gt;Inside Microsoft SQL Server 2000&lt;/a&gt; and discovered that for TEXT and IMAGE datatypes:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;div class="quote"&gt;
"If the amount of data is less than 32 KB, the text pointer in the data row points to an 84-byte text root structure."
&lt;/div&gt;
&lt;p&gt;This, in addition to the 16-byte pointer to the off-row location of
the data, explains the huge difference in size between these two
tables.
&lt;/p&gt;&lt;p&gt;At this point, you're probably either wondering how to fix this
or you've drifted away and are half-reading, half-thinking about what's
for dinner. If you're in the latter category and not also in the former
category, you either know the answer already or are not enough of a
certified geek to be reading my blog. And if you're in the former
category and not in the latter category, your priorities are certainly
skewed, as we all know that food is more important than any stupid DBMS
space wasted due to empty strings problem.
&lt;/p&gt;&lt;p&gt;But for those still following me, the answer is one of my
favorite about-to-be-deprecated features of SQL Server, the mighty
text-in-row option... Re-run the second part of the previous example,
with a new line inserted:
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;pre class="code"&gt;USE tempdb&lt;br&gt;&lt;br&gt;CREATE TABLE btable(acol TEXT NOT NULL)&lt;br&gt;&lt;br&gt;EXEC sp_tableoption 'btable', 'text in row', '24'&lt;br&gt;-- HINT: This is the new line&lt;br&gt;&lt;br&gt;INSERT btable(acol)&lt;br&gt;SELECT ''&lt;br&gt;FROM	pubs..authors a,&lt;br&gt;	pubs..authors b,&lt;br&gt;	pubs..authors c,&lt;br&gt;	pubs..authors d&lt;br&gt;&lt;br&gt;EXEC sp_spaceused 'btable'&lt;br&gt;&lt;br&gt;DROP TABLE btable&lt;br&gt;&lt;/pre&gt;
&lt;p&gt;There it is. Back down to 3 MB, even with the non-empty strings in
the TEXT column. By storing the small data in-row, we've eliminated the
16-byte off-row pointer and the 84-byte root structure, and other
assorted bytes that are used by the TEXT datatype that I don't know
about (the math didn't quite add up when I tried to calculate where all
of the space went).
&lt;/p&gt;&lt;p&gt;So what does this tell us? I'm thinking that as a best
practice, perhaps the text-in-row option should be used for every table
with LOB columns, and that it should be set to around 100 bytes. This
will still keep row sizes down when larger data is inserted (as it will
go off-row), but it will also keep overall IO way down if the amount of
rows with LOB data larger than 100 bytes is fairly sparse.
&lt;/p&gt;&lt;p&gt;
Comments?&lt;/p&gt;&lt;img src="http://www2.sqlblog.com/aggbug.aspx?PostID=72" width="1" height="1"&gt;</description><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/T-SQL/default.aspx">T-SQL</category><category domain="http://www2.sqlblog.com/blogs/adam_machanic/archive/tags/Architecture/default.aspx">Architecture</category></item></channel></rss>