THE SQL Server Blog Spot on the Web


Michael Zilberstein

LAG function – practical use and comparison to old syntax

Recently I had to analyze a huge trace: 46GB of trc files. Looping over the files, I loaded them into a trace table using the fn_trace_gettable function, applying whatever filters I could in order to discard irrelevant data. I ended up with a 6.5 million row table, 7.4GB in size. It contained a RowNum column defined as identity, primary key, clustered. One of the first things I noticed was that although the time difference between the first and last events in the trace was 10 hours, the total duration of all SQL events was only 4 hours. So the next assumption to check was whether the application got stuck, i.e. whether there was a significant gap between two subsequent events. In the following queries I'm looking for events that started at least 30 seconds after the previous event's completion.
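The loading step looked roughly like the sketch below. The file path, the column list and the filter are my illustrative assumptions, not the original code; with DEFAULT as the second argument, fn_trace_gettable follows rollover files automatically:

-- Hedged sketch: path, target columns and the EndTime filter are assumptions.
-- fn_trace_gettable with DEFAULT reads MyTrace.trc and its rollover files
-- (MyTrace_1.trc, MyTrace_2.trc, ...) in sequence.
INSERT INTO dbo.TraceData (EventClass, TextData, StartTime, EndTime, Duration)
SELECT EventClass, TextData, StartTime, EndTime, Duration
FROM sys.fn_trace_gettable(N'C:\Traces\MyTrace.trc', DEFAULT)
WHERE EndTime IS NOT NULL;  -- example filter: skip rows with no completion time

The RowNum identity column is omitted from the insert list so SQL Server assigns it on load.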

SQL 2005/2008/2008R2 syntax:

SELECT
    curEvent.RowNum,
    prevEvent.EndTime AS PrevEventEndTime,
    curEvent.StartTime AS CurEventStartTime,
    DATEDIFF(SECOND, prevEvent.EndTime, curEvent.StartTime) AS ApplicationStuckDuration
FROM dbo.TraceData curEvent
    INNER JOIN dbo.TraceData prevEvent ON curEvent.RowNum - 1 = prevEvent.RowNum
WHERE DATEDIFF(SECOND, prevEvent.EndTime, curEvent.StartTime) >= 30


SQL 2012 syntax:

;WITH GapsCTE AS
(
    SELECT
        RowNum,
        StartTime AS CurEventStartTime,
        LAG(EndTime, 1, NULL) OVER(ORDER BY RowNum ASC) AS PrevEventEndTime
    FROM dbo.TraceData
)
SELECT
    RowNum,
    PrevEventEndTime,
    CurEventStartTime,
    DATEDIFF(SECOND, PrevEventEndTime, CurEventStartTime) AS ApplicationStuckDuration
FROM GapsCTE
WHERE DATEDIFF(SECOND, PrevEventEndTime, CurEventStartTime) >= 30
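As a side note, 1 and NULL are LAG's defaults for the offset and default-value arguments, so the window function call above can be written more tersely; this is purely a stylistic variant of the same expression:

-- Equivalent to LAG(EndTime, 1, NULL): offset defaults to 1, default value to NULL.
LAG(EndTime) OVER(ORDER BY RowNum) AS PrevEventEndTime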


Here are the results (pretty surprising at first glance, I have to say):

              SQL 2012       SQL 2005/2008/2008R2
  Duration    103 seconds    77 seconds
  Reads       919,752        1,839,504
  CPU         20 seconds     13 seconds

In the old SQL Server versions we see twice the reads. That's obvious from the execution plan, since we scan the entire table twice and merge join on the RowNum column. It seems that in this case the Merge Join is so efficient that the CPU overhead of segmentation, aggregation and spooling costs more than the additional single ordered scan of a table that is already in memory anyway.

I started writing all this in order to demonstrate practical usage of the new function and how we can benefit from it compared to previous SQL Server versions. The results came as a complete surprise to me.

Conclusion: always check and recheck yourself!

Published Wednesday, March 14, 2012 10:33 AM by Michael Zilberstein
