
Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

  • Capturing query and IO statistics using Extended Events

    The commands

    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    return information about query executions and are very useful when doing performance tuning work because they report how long a query took to execute and how much IO activity occurred as a result of that query.

    These are very effective features; however, to my mind they have a drawback in that the information they provide is not accessible in the actual query window from which the query was executed. This means the results cannot be collected, stored in a table, and then queried – such information would have to be manually copied and pasted from the messages pane into (say) a spreadsheet for further analysis.

    This is dumb. I’m a SQL Server developer, I want my data available so that I can bung it into a table in SQL Server and issue queries against it. That is why, a couple of weeks ago, I submitted a request to Microsoft Connect entitled Access to STATS TIME & STATS IO from my query in which I said:

    I recently was doing some performance testing work where I was evaluating the effect of changing various settings on a particular query. I would have liked to simply run my query inside a couple of nested loops in order to test all permutations but I could not do that because every time I executed the query I had to pause so I could retrieve the stats returned from STATISTICS IO & STATISTICS TIME and manually copy and paste (yes, copy and paste) the information into a spreadsheet.

    This feels pretty dumb in this day and age. Why can we not simply have access to that same information within my query? After all, we have @@ROWCOUNT, ERROR_MESSAGE(), ERROR_NUMBER() etc... that provide very useful information about the previously executed statement, how about @@STATISTICS for returning all the IO & timing info? We can parse the text returned by that function to get all the info we need.
    Better still, provide individual functions e.g.:
    @@QUERYPARSETIME
    @@QUERYCOMPILETIME
    @@QUERYEXECUTIONTIME
    @@SCANCOUNT
    @@LOGICALREADS
    @@PHYSICALREADS
    @@READAHEADREADS

    Ralph Kemperdick noticed my submission and correctly suggested that the same information could be accessed using Extended Events. Based on this I’ve written a script (below) that issues a series of queries against the AdventureWorks2012 sample database, captures stats similar to those captured by SET STATISTICS, then presents them back at the end of the query. Here are those results:

    image

    The information is not as comprehensive as what you would get from SET STATISTICS (no Read-Ahead Reads for example, and no breakdown of IO per table) but should be sufficient for most purposes.

    You can adapt the script for whatever information you want to capture; the important parts of the script are the creation of the XEvents session for capturing the queries, then the reading and shredding of the XML results thereafter.

    Hope this is useful!

    @Jamiet

    UPDATE: Turns out you don't need all of this. I've just been informed that Richie Rump has written a parser at http://statisticsioparser.com/ that does all of this for you. Simply paste in your STATISTICS IO output and press the button - it will do all the hard work for you and give you the results back in a nice readable graph. You can paste in multiple results at once too.

    --Create the event session
    CREATE EVENT SESSION [queryperf] ON SERVER
    ADD EVENT sqlserver.sql_statement_completed
    ADD TARGET package0.event_file(SET filename=N'C:\temp\queryperf.xel',max_file_size=(2),max_rollover_files=(100))
    WITH (MAX_MEMORY=4096 KB, EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,
          MAX_DISPATCH_LATENCY=120 SECONDS, MAX_EVENT_SIZE=0 KB,
          MEMORY_PARTITION_MODE=NONE, TRACK_CAUSALITY=OFF, STARTUP_STATE=ON);

    --Set up some demo queries against AdventureWorks2012 in order to evaluate query time & IO
    USE AdventureWorks2012;
    DECLARE @SalesPersonID INT;
    DECLARE @salesTally INT;
    DECLARE mycursor CURSOR FOR
        SELECT soh.SalesPersonID
        FROM   Sales.SalesOrderHeader soh
        GROUP  BY soh.SalesPersonID;
    OPEN mycursor;
    FETCH NEXT FROM mycursor INTO @SalesPersonID;
    ALTER EVENT SESSION [queryperf] ON SERVER STATE = START;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        DBCC FREEPROCCACHE;
        DBCC DROPCLEANBUFFERS;
        CHECKPOINT;
        SELECT @salesTally = COUNT(*)
        FROM   Sales.SalesOrderHeader soh
        INNER JOIN Sales.[SalesOrderDetail] sod ON soh.[SalesOrderID] = sod.[SalesOrderID]
        WHERE  SalesPersonID = @SalesPersonID;
        FETCH NEXT FROM mycursor INTO @SalesPersonID;
    END
    CLOSE mycursor;
    DEALLOCATE mycursor;
    DROP EVENT SESSION [queryperf] ON SERVER;

    --Extract query information from the XEvents target
    SELECT q.duration,q.cpu_time,q.physical_reads,q.logical_reads,q.writes--,event_data_XML,statement,timestamp
    FROM   (
            SELECT  duration=e.event_data_XML.value('(//data[@name="duration"]/value)[1]','int')
            ,       cpu_time=e.event_data_XML.value('(//data[@name="cpu_time"]/value)[1]','int')
            ,       physical_reads=e.event_data_XML.value('(//data[@name="physical_reads"]/value)[1]','int')
            ,       logical_reads=e.event_data_XML.value('(//data[@name="logical_reads"]/value)[1]','int')
            ,       writes=e.event_data_XML.value('(//data[@name="writes"]/value)[1]','int')
            ,       statement=e.event_data_XML.value('(//data[@name="statement"]/value)[1]','nvarchar(max)')
            ,       [timestamp]=e.event_data_XML.value('(//@timestamp)[1]','datetime2(7)')
            ,       *
            FROM    (
                    SELECT CAST(event_data AS XML) AS event_data_XML
                    FROM   sys.fn_xe_file_target_read_file('C:\temp\queryperf*.xel', NULL, NULL, NULL)
                    ) e
            ) q
    WHERE  q.[statement] LIKE 'select @salesTally = count(*)%' --Filters out all the detritus that we're not interested in!
    ORDER  BY q.[timestamp] ASC;
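
    Because the stats now come back as a resultset rather than text in the messages pane, they can be persisted and queried like any other data. As a minimal sketch (the table name and columns below are purely illustrative, not part of the original script), the shredded results could be inserted into a permanent table for later analysis:

    --Minimal sketch (illustrative names): persist the captured stats for later analysis
    CREATE TABLE dbo.QueryPerfStats
    (
        capture_time    DATETIME2(7)  NULL
    ,   [statement]     NVARCHAR(MAX) NULL
    ,   duration        INT           NULL
    ,   cpu_time        INT           NULL
    ,   physical_reads  INT           NULL
    ,   logical_reads   INT           NULL
    ,   writes          INT           NULL
    );
    INSERT INTO dbo.QueryPerfStats (capture_time,[statement],duration,cpu_time,physical_reads,logical_reads,writes)
    SELECT  e.event_data_XML.value('(//@timestamp)[1]','datetime2(7)')
    ,       e.event_data_XML.value('(//data[@name="statement"]/value)[1]','nvarchar(max)')
    ,       e.event_data_XML.value('(//data[@name="duration"]/value)[1]','int')
    ,       e.event_data_XML.value('(//data[@name="cpu_time"]/value)[1]','int')
    ,       e.event_data_XML.value('(//data[@name="physical_reads"]/value)[1]','int')
    ,       e.event_data_XML.value('(//data[@name="logical_reads"]/value)[1]','int')
    ,       e.event_data_XML.value('(//data[@name="writes"]/value)[1]','int')
    FROM    (
            SELECT CAST(event_data AS XML) AS event_data_XML
            FROM   sys.fn_xe_file_target_read_file('C:\temp\queryperf*.xel', NULL, NULL, NULL)
            ) e;

    From there, comparing runs or aggregating over many executions is just an ordinary query against that table (adding the same LIKE filter as above if you only want the statements of interest).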

  • Why don’t app stores offer subscriptions?

    Accepted wisdom when one purchases an app from an app store is that one gets free updates for life. This is, quite obviously, an unsustainable business model and I suspect is the main reason why so many apps use advertising to generate income.

    There is though, in the enterprise world at least, a move to a subscription-based business model (i.e. renting software) the most obvious examples of which are Office 365 and Adobe Creative Cloud and I’m left wondering why app stores don’t offer a similar option.

    Today I installed an app called Tweetium that offers a (paid-for) premium option; here is why the premium option exists:

    image

    Again this strikes me as unsustainable. The customer pays once yet Tweetium has to pay TweetMarker every month. Forever. It doesn’t take an expert mathematician to realise that eventually Tweetium’s monthly outlay could exceed the income they have saved up from purchases.

    It seems to me there is a simple solution to all this. App stores could offer an option for customers to rent apps rather than buy them. It’s more sustainable for the app vendor and the app store provider gets a more predictable income stream (which CFOs seem to like). Why don’t app stores do this? Seems like a no-brainer to me.

    Just a random thought for a Sunday morning.

    @Jamiet

    UPDATE: Apparently iOS & Android app stores *do* offer subscription models, I just wasn't aware of it.
  • My SQL Server gripe at SQL Supper

    On 17th February 2014 (3 days ago) I visited an event called SQL Supper held at Microsoft’s central London office, Cardinal Place. The event was basically a Q&A session with Mark Souza, Conor Cunningham, Nigel Ellis, Hatay Tuna & Ewan Fairweather, and one part of the evening was loosely termed the gripe session where the attendees were invited to stick their hand in the air and, when asked, have a good old whinge about something in SQL Server that, well, frankly pissed them off. Given the members of the panel this was inevitably focused on the database platform in SQL Server rather than the BI stuff and this is what I was only too happy to gripe about:

    Microsoft seem to have dropped the ball on database developer productivity, both in the language and the tooling. A decade ago this is something that SQL Server was renowned for; I put it to you that this is no longer the case. SSDT came out with SQL Server 2012 and it’s a great tool, I love it, but in the two years since there have been various maintenance releases and hardly any new features. SSMS has hardly changed for years and extensibility is still not truly supported. Intellisense does not work properly 100% of the time. As far as I can recall T-SQL has had only two major features (TRY/CATCH & windowing functions) in the last ten years.

    Please fix this. Show database developers some love again.

    I could write pages and pages of gripes just under the banner of developer productivity but I’ll leave you with that concise summary. It is of course a matter of opinion, feel free to agree or disagree.

  • Dacpac braindump - What is a dacpac?

    In this week’s earlier blog post First release of my own personal T-SQL code library on Github I talked of how one could use a dacpac to distribute a bunch of code to different servers. Upon reading the blog post Jonathan Allen (of SQL Saturday Exeter fame), with whom I’ve been discussing dacpacs on-and-off recently, sent me this email:

    Hi Jamie,

    The DacPac thing I emailed about in December hasnt taken off yet but I have just downloaded your code library to take a look and I like the way the dacpac works. Should I be able to open that in VS or is the dacpac compiled/built in VS? The video you linked to didnt cover dapac at all so I am in the dark on how to create one/them.

    If I can build a database and create a dacpac simply then this could be really useful.

    Jonathan’s email made me realise that there is perhaps a lot of confusion about what dacpacs are, what they can be used for and how they can be used, so I figured a braindump of what I know about them might be useful; that’s what you’re getting in this blog post.

     

    What is a dacpac?

    A dacpac is a file with a .dacpac extension.

    image

    In that single file are a collection of definitions of objects that one could find in a SQL Server database, such as tables, stored procedures and views, plus some instance-level objects such as logins (the complete list of supported objects for SQL Server 2012 can be found at DAC Support For SQL Server Objects and Versions). The fact that a dacpac is a file means you can do anything you could do with any other file: store it, email it, share it on a file server etc… and this means that dacpacs are a great way of distributing the definition of many objects (perhaps even an entire database) in a single file. Or, as Microsoft puts it, a self-contained unit of SQL Server database deployment that enables data-tier developers and database administrators to package SQL Server objects into a portable artifact called a DAC package, also known as a DACPAC. That in itself is, I think, very powerful.

    Ostensibly a dacpac is a binary file so you can’t just open it up in your favourite text editor and look at the contents of it. However, what many people do not know is that the format of a dacpac is simply the common ZIP compression format and hence we can add .zip to the end of a dacpac filename:

    image

    and open it up like you would any other .zip file to have a look inside. If you do so you will see this:

    image

    The contents of that zip file conform to something called the Open Packaging Convention (OPC). OPC is a standard defined by Microsoft for, well, for zipping up files basically. You have likely used files conforming to OPC before without knowing it; .docx, .xlsx and .pptx are the most common ones that you might recognise if you use Microsoft Office, and there are some more obscure ones such as .ispac (SSIS 2012 developers should recognise that). (For a more complete list of OPC-compliant file types see the Wikipedia page.)

    Notice in the screenshot above showing the innards of TSQLCodeLibrary.dacpac the biggest file is model.xml. This is the file that contains the definition of all our SQL Server objects. I won’t screenshot that here but I encourage you to get hold of a .dacpac file (here’s one) and have a poke around to see what’s in that model.xml file.

    What are dacpacs for?

    Dacpacs are used for deploying SQL Server objects to an instance of SQL Server. That’s it. If your job does not ever involve doing that then you probably don’t need to read any further.

    Dacpac pre-requisites

    A .docx file (i.e. A Microsoft Word document) isn’t much use to someone if they don’t have the software (i.e. Microsoft Word) to make use of it and so the analogy holds for dacpacs; in order to use them you need to have some software installed and that software is called the Data-tier Application Framework (or DAC Framework for short, or DacFx for even shorter).

    Incidentally, you may be wondering what DAC stands for at this point. I think it’s “Data-tier Application”, in which case you may be thinking that the acronym DAC is a stupid one, especially as DAC also stands for something else in SQL Server (the Dedicated Administrator Connection); I would agree!

    DacFx is available to download for free, however you’ll probably never need to do that as installation of DacFx occurs whenever you install SQL Server, SQL Server client tools or SQL Server Data Tools (SSDT). If DacFx is installed you should be able to see it in Programs and Features:

    image

    How does one deploy a dacpac?

    In dacpac nomenclature the correct term for deploying a dacpac is publishing; however, the two terms generally get used interchangeably. There are two methods of publishing a dacpac, which I’ll cover below.

    Publish via SSMS

    In SSMS’s Object Explorer right-click on the databases node and select “Deploy Data-tier Application…” (told you they used those terms interchangeably):

    image

    This launches a wizard that prompts you to choose a dacpac, fill in some particulars (e.g. database name) and then deploys it for you by calling out to DacFx. Unfortunately this wizard is not very good because it doesn’t (currently) support all features of dacpacs; namely, if your dacpac contains any sqlcmd variables (I won’t cover those here but they are commonly used within dacpacs) a value needs to be supplied for each of them, the wizard doesn’t prompt you for a value, and hence the deployment fails.
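
    For context, a sqlcmd variable in a dacpac represents a value that can only be resolved at publish time; a hypothetical example of the kind of project code that uses one (the variable name is purely illustrative):

    --Hypothetical example: $(StagingDatabase) is a sqlcmd variable whose value must be supplied at publish time
    SELECT c.CustomerId, c.Name
    FROM   [$(StagingDatabase)].dbo.Customer AS c;

    The wizard simply has no way of asking you for those values.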

    This. Is. Stupid. Microsoft should be suitably lambasted for not providing this basic functionality. Anyway, due to this limitation you’re most likely to be using the other method which is…

    Publish via command-line

    One component distributed in DacFx is a command-line tool called sqlpackage.exe which will quickly become your best friend if you use dacpacs a lot. sqlpackage.exe can do a lot of things and those “things” are referred to as actions; one of those actions is publishing a dacpac. Here’s the syntax for publishing a dacpac using sqlpackage.exe:

    "%ProgramFiles(x86)%\Microsoft SQL Server\110 \DAC\bin\SqlPackage.exe"
          /action:Publish
          /SourceFile:<path to your dacpac>
          /TargetServerName:<SQL instance you are deploying to>
          /TargetDatabaseName:<Name of either (a)the database to create or (b) the existing database to deploy into>

    Publishing is idempotent

    Notice from my comment above for TargetDatabaseName that you can deploy to an existing database. You might ask why you would want to publish into an existing database; after all, the objects you are publishing might already exist. This segues nicely into what I see as the biggest benefit of dacpacs and DacFx: the software interrogates the target database to determine whether or not the objects already exist and, if they do not, it will create them. If they do already exist it will determine whether the definition has changed and, if it has, it will make those changes. DacFx will always protect your data, so if it determines that an operation would cause data destruction (e.g. removing a column from a table) then it will (optionally) throw an error and fail. You never again need to write an ALTER statement or first check that an object exists in order to change an object definition; DacFx will do it for you. To put it another way, publishing using dacpacs and DacFx is idempotent.
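
    To make that concrete, this is the sort of existence-check-plus-ALTER boilerplate that DacFx spares you from hand-writing (a minimal sketch; the table and column names are purely illustrative):

    --Purely illustrative: the kind of hand-written upgrade script that DacFx makes unnecessary,
    --because it works out the difference between the dacpac and the target database for you
    IF NOT EXISTS (SELECT * FROM sys.columns
                   WHERE  object_id = OBJECT_ID(N'dbo.Customer')
                   AND    name = N'EmailAddress')
    BEGIN
        ALTER TABLE dbo.Customer ADD EmailAddress NVARCHAR(256) NULL;
    END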

    How does one create a dacpac?

    Of course, in order to publish a dacpac you’re first going to have to create one, and one of Jonathan’s questions above pertained to exactly this. There are two distinct ways to create a dacpac.

    Use an SSDT Project

    SQL Server Data Tools (SSDT) projects are basically a project type within Visual Studio that provides a way of building DDL for SQL Server databases. I’m not going to cover SSDT projects in any detail here except to say that when such a project is built the output is a dacpac. Note that SSDT can also publish the dacpac for you, however I didn’t mention that above as the publish operation is essentially another wrapper around the same DacFx functionality used by sqlpackage.exe.

    Create from an existing database

    One can right-click on a database in SSMS and click on “Extract Data-tier Application…” to create a dacpac containing the definition of all objects in that database:

    image

    Wrap-up

    Should you be using dacpacs? I can’t answer that question for you but hopefully what I’ve done is given you enough information so that you can answer it for yourself. Some people might like the way dacpacs encapsulate many objects into a single file and their idempotent deployment; others may prefer good old simple, handcrafted T-SQL scripts which don’t have any pre-requisites other than SQL Server itself. The choice is yours.

    Further reading

    UPDATE

    David Atkinson from Redgate has been in touch to tell me about another dacpac feature that I didn’t know about. It is possible to right-click on a dacpac in Windows Explorer and choose to unpack it:

    image

    That essentially unzips it but what you also get is a file called Model.sql that will create all of the objects in the dacpac:

    image

    Very useful indeed! David tells me that Redgate use this functionality to enable comparison of a dacpac using their SQL Compare tool as you can read about at Using a DACPAC as a data source.

  • First release of my own personal T-SQL Code Library on github

    Like many (most???) T-SQL developers I keep a stash of useful code that I’ve garnered down the years because I know it’s all going to come in useful at some point in the future. It includes code I’ve written myself and also code that others have shared on their own blogs. For example my code library includes the following:

    I’ve never seen the point of keeping one’s code library to oneself; might as well share it in case anyone else might find it useful, so up to now I’ve kept my collection of scripts publicly available on SkyDrive (go see it if you like).

    That’s all fine and dandy but I figured this could be improved. SkyDrive is a file sharing site and whilst it includes a nice code viewer/editor it is not an ideal solution for storing code; code should be stored in a version control system (e.g. Git, TFS, Subversion, etc.). I opted to make my code library available on Github at https://github.com/jamiekt/TSQLCodeLibrary/ because it provides:

    • file version history
    • releases
    • ability for anyone else to fork my code library and build upon it to maintain their own code library
    • lots of tools necessary for modern code development

    and moreover all the cool kids seem to be using Github so I figured I’d give it a bash as well.

    The code library exists as a collection of views, functions and stored procedures in an SSDT project. I’m a massive fan of SSDT so there were many reasons for my choosing to do this, but the overriding reason was that SSDT provides a single binary (i.e. a dacpac file) containing the entire code library that can be distributed as easily as emailing the file to someone. Deploying a dacpac is pretty simple and so this is a great method for sharing T-SQL code.

    What’s in my T-SQL code library?

    In this first release, not much. There are only nine objects though I hasten to add that this is only a first release and I have a backlog of stuff that I need to add in there. One of the many advantages of using SSDT is that it makes it easy to add extended properties to describe the objects and the code library includes a view that surfaces all of that extended property information:

    SELECT schema_name,object_name,object_type,CodeLibraryDescription
    FROM jt.[vwCodeLibraryDescriptions]

    20140112TSQLCodeLibraryv0.1_objects
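
    For anyone wondering how those descriptions get captured in the first place, here is a minimal sketch (the object names below are purely illustrative, not taken from the code library) of attaching an extended property to an object and reading it back from the catalog views that such a view would typically query:

    --Minimal sketch (illustrative names): describe an object with an extended property...
    EXEC sys.sp_addextendedproperty
         @name = N'CodeLibraryDescription'
    ,    @value = N'Splits a delimited string into rows'
    ,    @level0type = N'SCHEMA',   @level0name = N'jt'
    ,    @level1type = N'FUNCTION', @level1name = N'ufnSplitString';

    --...and read it back via the catalog views
    SELECT s.name AS schema_name
    ,      o.name AS object_name
    ,      o.type_desc AS object_type
    ,      CAST(ep.value AS NVARCHAR(4000)) AS CodeLibraryDescription
    FROM   sys.extended_properties ep
    INNER JOIN sys.objects o ON ep.major_id = o.object_id
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
    WHERE  ep.class = 1 AND ep.minor_id = 0 AND ep.name = N'CodeLibraryDescription';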

    How do you install the code library?

    Download the two binaries:

    • master.dacpac
    • TSQLCodeLibrary.dacpac

    and store them together in a folder. Open a command prompt at that folder and type:

    "%ProgramFiles(x86)%\Microsoft SQL Server\110\DAC\bin\SqlPackage.exe"
     /action:Publish
     /SourceFile:TSQLCodeLibrary.dacpac
     /TargetServerName:"<your_sql_instance>"
     /TargetDatabaseName: <prefered_database_name>

    (replacing <your_sql_instance> with the name of the SQL Server instance where you want to create the code library and <prefered_database_name> with whatever you want the database to be called. Get rid of the line feeds as well, they are just used here for clarity)

    This will create a SQL Server database containing my code library:


    20140112DeployedTSQLCodeLibraryv0.1

    If any of the code in my code library proves useful to you then that’s great; however, my wish here is that some of you other folk out there feel motivated to share your own code in a similar manner. If you do so please post a comment below and let me know.

    @Jamiet

  • Schema Compare or Publish [SSDT]

    Yesterday on Twitter Ryan Desmond asked “Is there a good read for #SSDT regarding deploying changes via schema compare vs solution deployment?”

    image

    I don’t know of any article that covers this so in this blog post I offer my opinion on the subject.

    First some background. When building databases offline using the database project type (.sqlproj) in SSDT you have two options for deploying the DDL code in your project into a physical database:

    • Schema Compare
    • Publishing

    Under the covers both do the same basic operation: build a dacpac from your project, compare it to the target database, build a script that will make the requisite changes to the target database, and execute that script.

    Ryan was asking which of these one should use. I suggested that publishing was a better option and here are two reasons why:

    1. Publish will include your pre and post deployment scripts as well whereas Schema Compare will not. (And if your retort is that you cannot run those scripts more than once then you’re doing it wrong; rewrite them so they are rerunnable – see the sketch after this list.)
    2. If the debug target for your project is configured correctly then a publish operation can be triggered simply by pressing F5. That’s massively more productive than the point-and-click nature of Schema Compare. It’s even better if you have multiple SSDT projects in your solution as you can publish all of them with a single key stroke.
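
    For reference, here is a minimal sketch of the kind of rerunnable post-deployment script alluded to in point 1 (the table and values are purely illustrative): it seeds reference data and can safely be executed on every publish.

    --Minimal sketch (illustrative): an idempotent post-deployment script that can run on every publish
    MERGE INTO dbo.OrderStatus AS tgt
    USING (VALUES (1, N'Open'), (2, N'Shipped'), (3, N'Cancelled')) AS src (StatusId, StatusName)
        ON tgt.StatusId = src.StatusId
    WHEN MATCHED AND tgt.StatusName <> src.StatusName THEN
        UPDATE SET StatusName = src.StatusName
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (StatusId, StatusName) VALUES (src.StatusId, src.StatusName);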

    Does anyone out there have a different opinion? Let me know in the comments.

    @Jamiet

  • Beware the ß

    I stumbled upon an interesting little nuance of SQL Server’s behaviour over the past few days that I was not aware of and I figure it’s worth talking about here so others are aware. It concerns the handling of the German character “ß” which I understand to be German shorthand for “ss” (I’m no expert on languages or linguistics so don’t beat me up if I’m wrong about that).

    In short, two NVARCHAR values that differ only in that one uses “ß” where the other uses “ss” will be treated as the same. This is easily demonstrated using the following code snippet:

    SELECT 'Oktoberstrasse'
    UNION
    SELECT 'Oktoberstraße';

    SELECT N'Oktoberstrasse'
    UNION
    SELECT N'Oktoberstraße';

    which returns:

    image

    (Collation on my database is set to ‘SQL_Latin1_General_CP1_CI_AS’)

    Notice that casting the values as NVARCHAR (which is what the N'' notation does) causes SQL Server to treat them as the same. Obviously this is going to cause a problem if you need to treat those as distinct values (such as inserting into a column with a unique key upon it – which is the problem I encountered that caused me to stumble across this).
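
    If you do need the two strings to be treated as distinct, one possible workaround (a minimal sketch, and only appropriate if binary comparison semantics are acceptable for the comparison in question) is to apply a binary collation:

    --Under the default comparison these two literals are treated as equal...
    SELECT CASE WHEN N'Oktoberstrasse' = N'Oktoberstraße'
                THEN 'equal' ELSE 'not equal' END AS default_comparison;
    --...but under a binary collation they are distinct
    SELECT CASE WHEN N'Oktoberstrasse' = N'Oktoberstraße' COLLATE Latin1_General_BIN2
                THEN 'equal' ELSE 'not equal' END AS binary_comparison;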

    There is a bug submission to Connect regarding this issue at 'ß' and 'ss' are NOT equal in which a member of the SQL Server team says:

    Our current behavior follows the SQL/ISO standard and unless those standards are updated with the latest changes we don't intend to change the behavior in SQL Server. Changing existing SQL Server behavior has lot of implications and today we rely on Windows for all of our windows collation sorting capabilities. If and when in the future Windows adopts these new rules / Unicode standard we will incorporate it in SQL Server.

    In other words, SQL Server is merely following the behaviour defined by the SQL/ISO standards, so it’s not a bug in SQL Server as such, just a nuance that one needs to be aware of. And now you are.

    @Jamiet

  • Really useful Excel keyboard shortcuts

    I love me a good keyboard shortcut and there are some I would like to use in Excel all the time if only I could darn well remember them.

    CTRL + Spacebar   Select the entire column containing the currently selected cell(s)
    Shift + Spacebar   Select the entire row containing the currently selected cell(s)
    CTRL + ‘+’   Insert cells
    CTRL + ‘-’   Delete cells

    If you combine these you’ll find they become really powerful. For example, CTRL+Spacebar followed by CTRL + ‘+’ inserts a new column into a worksheet (which is what I wanted to do this morning when I began googling this).

    I figured the only way I will ever ingrain these into my muscle memory is if I write them down somewhere, nowhere better than on my blog.

    If you’re a fellow keyboard shortcut fetishist and want to share any obscure ones that you know of then I’m all ears, please put them in the comments below.

    @Jamiet

  • Linqpad – bring the power of LINQ to SQL Server development

    One of my biggest laments about being a SQL Server developer is that the tools provided by Microsoft to do our thang compare woefully to the feature rich, (largely) bug-free, “it-just-works” impression that one gets from seeing the tools available to “other” developers in the Microsoft ecosystem (I’m thinking of our appdev brethren here, the ones that use C#, Javascript, and all that jazz). I could write a long blog post complaining about this (and perhaps one day I will) but in this blog post I want to shine a light on a tool called Linqpad.

    Linqpad enables you to issue queries written in LINQ against a database and in that respect is an alternative to SQL Server Management Studio (SSMS). What is LINQ? It stands for Language Integrated Query and is a technology that Microsoft brought out a few years ago for writing queries inside .Net code. The great thing about Linqpad is that it enables one to write LINQ queries without having to know anything about .Net.

    In the screenshots below I show a simple query against a database but written two ways, one using T-SQL that anyone reading this blog post will likely know, and one using LINQ:

    SNAGHTML618728e4

    Some things to notice here. The two queries look very similar in that they contain the same keywords {SELECT, FROM}. The second thing to notice is that the FROM clause comes before the SELECT clause and if you know anything about the logical order of execution of a SELECT query you’ll realise that this intuitively makes sense. Lastly, the table is called [dbo].[BulletinLine] but in the LINQ query it’s called [BulletinLines]; it’s been pluralised (a convention that is common to .Net developers) and there’s no [dbo] prefix. Other than those things it’s intuitively clear that these two queries are doing exactly the same thing and it’s worth pointing out that under the covers the LINQ query is converted into a T-SQL query.

    So OK, if you accept that LINQ can do pretty much anything that a T-SQL SELECT query can do, the next obvious question is “Why should I bother when T-SQL already does what I need?” The answer, in a word, is productivity. Or, to put it another way, intellisense works properly. Let’s say for example I want to select a subset of all the columns; intellisense comes to our aid:

    image

    One might well retort “well that works in SSMS as well” but in my experience intellisense in SSMS is, at best, flaky. In some circumstances it simply doesn’t work and the worst part of this is that it’s often inexplicable as to why. (In case you can’t tell, intellisense in SSMS drives me up the wall and I’m sure I’m not the only one.)

    Some other nice things about LINQ. Here’s the equivalent of a WHERE clause to filter on [BulletinId]=6:

    image

    If you don’t know LINQ then (in my opinion) it’s not intuitively obvious what’s going on here. What the above query is doing can effectively be described as:

    Take the collection of BulletinLines, filter in WHERE BulletinId equals 6

    Where this gets really powerful is the ability to stack these things up like so:

    image

    Take the collection of BulletinLines, filter in WHERE BulletinId equals 6. From the resultant collection filter in WHERE Colour=”White”

    If we only want the top 3:

    image

    Take the collection of BulletinLines, filter in WHERE BulletinId equals 6. From the resultant collection filter in WHERE Colour=”White”. From the resultant collection take the first 3
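
    For reference, a stacked query like that is translated by LINQ into a single T-SQL statement, roughly along these lines (a sketch only; the exact SQL that gets generated will differ in aliasing and parameterisation):

    --Roughly what the stacked LINQ query above translates to under the covers
    SELECT TOP (3) bl.*
    FROM   [dbo].[BulletinLine] bl
    WHERE  bl.[BulletinId] = 6
    AND    bl.[Colour] = N'White';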

    I love how expressive this is and when you get fully conversant with LINQ it’s wonderfully intuitive too. If I haven’t convinced you then, well, that’s OK. If you’re not convinced but do want to be convinced then check out Why LINQ beats SQL. The overriding argument there is “LINQ is in most cases a significantly more productive querying language than SQL”. Make sure you check out the section entitled “Associations” too, which demonstrates how you can filter based on what is known about an entity in other tables; when you grok what’s going on there you’ll realise the power that LINQ brings.

    If you want to give this a go then go and download Linqpad now from http://www.linqpad.net/. It’s free, however some of the more advanced features (such as intellisense) only light up when you pay for the pro or premium edition which are priced at $39 & $58 respectively for a perpetual license – a bit more than the app generation are used to paying but hardly bank-breaking either.

    Are any other SQL developers out there using Linqpad? Please share your experiences in the comments below, I’d really like to read them.

    @Jamiet

  • Don’t learn SSDT, learn about your databases instead

    Last Thursday I presented my session “Introduction to SSDT” at the SQL Supper event held at the offices of 7 Digital (loved the samosas, guys). I did my usual spiel, tour of the IDE, connected development, declarative database development yadda yadda yadda… and at the end asked if there were any questions. One gentleman in attendance (sorry, can’t remember your name) raised his hand and stated that by attempting to evangelise all of the features I’d missed the single biggest benefit of SSDT: that it can tell you stuff about your database that you didn’t already know.

    I realised that he was dead right. SSDT allows you to import your whole database schema into a new project and it will instantly give you a list of errors and/or warnings pertaining to the objects in your database. Invalid references (e.g. a long-forgotten stored procedure that refers to a non-existent column), unnecessary 3-part naming, incorrect case usage, syntax errors… it’ll tell you about all of ‘em! Turn on static code analysis (this article shows you how) and you’ll learn even more, such as any stored procedures that begin with “sp_”, WHERE clauses that will kill performance, use of @@IDENTITY instead of SCOPE_IDENTITY(), use of deprecated syntax, implicit casts etc… the list goes on and on.
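
    To give a flavour, here is a deliberately bad, purely illustrative snippet containing several of the things just mentioned that importing into SSDT (with static code analysis switched on) would flag:

    --Purely illustrative: code containing the kind of issues SSDT will point out
    CREATE PROCEDURE dbo.sp_SaveCustomer                --warning: sp_ prefix
        @Name NVARCHAR(100)
    AS
    BEGIN
        INSERT INTO dbo.Customer (Name) VALUES (@Name);
        SELECT @@IDENTITY AS NewCustomerId;             --warning: prefer SCOPE_IDENTITY()
        SELECT c.Name
        FROM   dbo.Customer c
        WHERE  UPPER(c.Name) = UPPER(@Name);            --warning: column wrapped in an expression, index cannot be used
    END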

    I urge you to download and install SSDT (takes a few minutes, it’s free and you don’t need SQL Server or Visual Studio pre-installed), start a new project:

    image

    right-click on your new project and import from your database:

    image

    and see what happens:

    image

    You may be surprised what you discover. Let me know in the comments below what results you get, total number of objects, number of errors/warnings, I’d be interested to know!

    @Jamiet

  • Clone an Azure VM using Powershell

    In a few months’ time I will, in association with Technitrain, be running a training course entitled Introduction to SQL Server Data Tools. I am currently working on putting together some hands-on lab material for the course delegates and have decided that, in order to save time in asking people to install software during the course, I am simply going to prepare a virtual machine (VM) containing all the software and lab material for each delegate to use. Given that I am an MSDN subscriber it makes sense to use Windows Azure to host those VMs as it will be close to, if not completely, free to do so.

    What I don’t want to do however is separately build a VM for each delegate; I would much rather build one VM and clone it for each delegate. I’ve spent a bit of time figuring out how to do this using Powershell and in this blog post I am sharing a script that will:

    1. Prompt for some information (Azure credentials, Azure subscription name, VM name, username & password, etc…)
    2. Create a VM on Azure using that information
    3. Prompt you to sysprep the VM and image it (this part can’t be done with Powershell so has to be done manually, a link to instructions is provided in the script output)
    4. Create three new VMs based on the image
    5. Remove those three VMs

    SNAGHTML16add98b

    The script has one pre-requisite that you will need to install, Windows Azure Powershell. You also need to be a Windows Azure subscriber which, if you're reading this blog post, I'm assuming you already are.

    Simply download the script and execute it within Powershell; assuming you have an Azure account it should take about 20 minutes to execute (spinning up VMs and shutting them down isn’t instantaneous). If you experience any issues please do let me know.

    There are additional notes below.

    Hope this is useful!

    @Jamiet 

    Notes:

    • Obviously there isn’t a lot of point in creating some new VMs and then instantly deleting them. However, this demo script does provide everything you need should you want to do any of these operations in isolation.
    • The names of the three VMs that get created will be suffixed with 001, 002, 003 but you can edit the script to call them whatever you like.
    • The script doesn’t totally clean up after itself. If you specify a service name & storage account name that don’t already exist then it will create them, however it won’t remove them when everything is complete. The created image file will also not be deleted. Removing these items can be done by visiting http://manage.windowsazure.com.
    • When creating the image, ensure you use the correct name (the script output tells you what name to use):

    image

    • Here are some screenshots taken from running the script:

    image

    image

    • When the third and final VM gets removed you are asked to confirm via this dialog:

    image

    Select ‘Yes’

  • SQL Relay 2013R2 – Bigger, faster, SQL-ier

    Fresh from a successful UK tour in June the SQL Relay team are at it again. This time there are ten venues all around the UK between 11th and 29th November.

    image

    Head to http://www.sqlrelay.co.uk/ to sign-up. I’ll be at the London one, hope to see you there!

    @Jamiet

  • SSDT gotcha – Moving a file erases code analysis suppressions

    I discovered a little wrinkle in SSDT today that is worth knowing about if you are managing your database schemas using SSDT. In short, if a file is moved to a different folder in the project then any code analysis suppressions that reference that file will disappear from the suppression file. This makes sense if you think about it because the paths stored in the suppression file are no longer valid, but you probably won’t be aware of it until it happens to you. If you don’t know what code analysis is or you don’t know what the suppression file is then you can probably stop reading now, otherwise read on for a simple short demo.

    Let’s create a new project and add a stored procedure to it called sp_dummy.
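
    (A minimal definition like the following is all that is needed for the demo; the body is just a placeholder:)

    --A deliberately badly-named procedure, created purely to trigger the sp_ prefix warning
    CREATE PROCEDURE dbo.sp_dummy
    AS
    BEGIN
        SELECT 1 AS dummy;
    END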

    image

    Naming stored procedures with an sp_ prefix is generally frowned upon and hence SSDT static code analysis will look for occurrences of this and flag them. So, the next thing we need to do is turn on static code analysis in the project properties:

    image

    A subsequent build causes a code analysis warning as we might expect:

    image

    Let’s suppose we actually don’t mind stored procedures with sp_ prefixes, we can just right-click on the message to suppress and get rid of it:

    image

    That causes a suppression file to get created in our project:

    image

    Notice that the suppression file contains a relative path to the file that has had the suppression placed upon it. Now, if we simply move the file within our project to a new folder, the suppression that we just created gets removed from the suppression file:

    image

    As I alluded to above this behaviour makes sense, because the path originally stored in the suppression file is no longer relevant, but you’re probably not going to be aware of it until it happens to you and messages that you thought you had suppressed start appearing again. Definitely one to be aware of.

    @Jamiet 

     

  • Enforce SSIS naming conventions using BI-xPress

    A long long long time ago (in 2006 in fact) I published a blog post entitled Suggested Best Practises and naming conventions in which I suggested a bunch of acronyms that folks could use to prefix object names in their SSIS packages, thus allowing easier identification of those objects in log records. Here is a sample of some of those suggestions:

    SNAGHTMLf8e56b8

    If you have adopted these naming conventions (and I am led to believe that a bunch of people have) then you might like to know that you can now check for adherence to these conventions using a tool called BI-xPress from Pragmatic Works. BI-xPress includes a feature called the Best Practices Analyzer that scans your packages and assesses them according to some rules that you specify. In addition, Pragmatic Works have made available a collection of rules that adhere to the naming conventions I specified in 2006:

    image

    You can download this collection, however I recommend you first read the accompanying article that demonstrates the capabilities of the Best Practices Analyzer. Pretty cool stuff.

    @Jamiet

  • SSIS ReportingPack v1.1.1.0 – Execution Stats visualised in a Gantt chart via usp_ssiscatalog

    There are many SSRS reports in SSIS Reporting Pack and the one I’ve perhaps spent most time on is the executable_duration report which provides this visualisation:

    image

    I happen to think this is really useful because it shows the start time and duration of each executable relative to all the other executables in an execution. It does it by leveraging the execution_path property of an executable (which I am on record as saying is, in my opinion, the best feature in SSIS 2012).

    I have wanted to provide a similar visualisation in usp_ssiscatalog for some time and today I have finally gotten around to implementing it. With this new release one can call up the executable stats using this query:

    EXEC usp_ssiscatalog
        @a                      =   'exec'
    ,   @exec_execution         =   0
    ,   @exec_events            =   0
    ,   @exec_errors            =   0
    ,   @exec_warnings          =   0
    ,   @exec_executable_stats  =   1;

    And the returned dataset will include a new column called Gannt. This screenshot demonstrates the value that this provides:

    20130905gannt

    On the left we have the execution_path of each executable and on the right is a bar indicating the start time and duration relative to all the other executables. Collectively these bars provide a nifty Gantt chart-like view of your execution, thus allowing you to easily identify which executables are taking up the most time in your executions.

    In addition there is a similar view for the list of executions, which is obtained by executing:

    EXEC usp_ssiscatalog @a='execs'

    That returns a new column, relative_duration:

    SNAGHTML16daf47b

    which simply provides a bar for each execution indicating how long it took relative to all the other executions.

    I hope these new visualisations in usp_ssiscatalog prove useful to you. If they do (or even if they don’t) please let me know; I love getting feedback and I don’t get nearly as much as I would like for SSIS Reporting Pack. Get the goods from SSIS Reporting Pack v1.1.1.0.

    @Jamiet
