THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Adam Machanic

Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.

The Almighty Transaction (A Month of Activity Monitoring, Part 12 of 30)

This post is part 12 of a 30-part series about the Who is Active stored procedure. A new post will run each day during the month of April, 2011. After April all of these posts will be edited and combined into a single document to become the basis of the Who is Active documentation.


Virtually everything that SQL Server does is tied to a transaction. Therefore, it is of utmost importance when monitoring to be able to find out what’s going on with the various transactions that are active on your SQL Server instances.

In its default mode Who is Active shows any session with an active transaction. This state is reflected in the [open_tran_count] column, the value of which is the depth of nested transactions that have been started—implicitly or explicitly—on the request or session.

--SPID 53
BEGIN TRAN
BEGIN TRAN
BEGIN TRAN
BEGIN TRAN
GO

F12_01_open_tran_count

Here the [open_tran_count] is 4, because there are four nested transactions. These transactions could have been created at various levels of scope—e.g. inside of nested stored procedures—and it is not uncommon to see cases where not enough commits are issued to match the numbered of transactions that have been started.

There are a number of transaction-related DMVs, all of which are named using the sys.dm_tran prefix. Who is Active currently uses three of these to expose additional information beyond the open transaction count. To see more information, enable the @get_transaction_info option. This option adds two columns to the Who is Active output:

  • The [tran_log_writes] column includes information about any database that has been written to on behalf of the transaction. The format for this data is a comma-delimited list, in the following form per database:
    • {database name} {number of log records written} ({size of log records} kB)
  • The [tran_start_time] column reflects the time that the first database was written to on behalf of the transaction. This is perhaps a bit counter-intuitive, but the idea is simple: for the most part it’s not interesting to see a lot of information about read-only transactions. Millions of them start and finish every day on the average SQL Server instance. Transactions that are actually doing some work—writing something—are the ones that tend to cause the issues. Therefore, Who is Active focuses its energy there.

To see these two columns in action, leave session 53 active, and create a couple of test databases to play with:

USE master
GO

CREATE DATABASE test_db1
GO

CREATE DATABASE test_db2
GO

CREATE TABLE test_db1.dbo.t1
(
    i INT
)
GO

CREATE TABLE test_db2.dbo.t2
(
    i INT
)
GO

Once these are created, we can insert some rows into one of the tables inside of a multi-statement transaction...

BEGIN TRAN
GO

INSERT test_db1.dbo.t1
(
    i
)
SELECT
    number
FROM master..spt_values
WHERE
    type = 'p'
GO

The INSERT results in 2048 rows inserted into test_db1.dbo.t1. And since we haven’t committed the transaction yet, it’s still alive for Who is Active to catch.

EXEC sp_WhoIsActive
    @get_transaction_info = 1

F12_02_start_time

Session 52—the one on which I’ve just inserted the rows, has information in both the [tran_log_writes] and [tran_start_time] columns. Session 53, on the other hand, has NULLs for both columns—even though it’s still holding an open transaction. The reason is that session 52 has actually done a write in test_db1, whereas session 53 has not done any work at all.

So far only one database has been touched in the transaction. Adding another database to the mix adds that database to the [tran_log_writes] output:

INSERT test_db2.dbo.t2
(
    i
)
SELECT
    number
FROM master..spt_values
WHERE
    type = 'p'
GO

F12_02_two_dbs

The same amount of work has been done in both databases: 2048 rows inserted into a table. And we can see that this has produced the same number of transaction log records, and the same amount of transaction log space is being consumed in the log files of each database. That makes a lot of sense. It also makes sense that we’re seeing 2049 log records, because that corresponds to one log header record and one log record for each of the 2048 affected rows.

Taking this a bit further makes things much very interesting for those of you doing ETL: consider that in any situation in which minimal logging is not taking place, the number of log records that will be produced due to an INSERT, UPDATE, or DELETE, will loosely correspond to the number of rows being affected, in each index being touched by the update plan. Understanding how this works becomes very useful when monitoring large ETL processes. It’s often the case that it’s possible to make a fairly precise estimation about how many rows will be touched by each phase of a data load. By taking that number into account and taking a look at the number of indexes in the query plan, estimating how much longer the ETL will take becomes a simple matter of querying a few times to understand the rate at which rows are being written, and then doing a bit of multiplication.

As a final note, it's important to realize that there may be log writes in tempdb that cannot be reported by Who is Active due to the way the DMVs treat nested internal transactions. These writes are due to query processor operations such as sorts, hashes, and spools, and will be reflected in Who is Active's output in the [tempdb_allocations] column.

 

Homework

Yesterday I mentioned that the XML data type can cause plan collection to fail. SQL Server’s XML data type has a nesting limitation—nodes within nodes—of 128 levels. Showplan XML is quite verbose and nested on its own, so it happens to be quite easy to come up with a plan that will exceed the limitation. The following query returns a query whose plan will exceed 128 levels of depth, as long as you set the value of the @queries variable at or above 16:

DECLARE @queries INT = 16

SELECT
    REPLICATE(CONVERT(VARCHAR(MAX), '(SELECT TOP(1) '), @queries) +
        'low ' +
            REPLICATE(CONVERT(VARCHAR(MAX), 'FROM master..spt_values) '), @queries)

To see this in action, cut and paste the resultant query into the same batch with the following:

--Paste the query here
SELECT
    q.*
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) AS q
WHERE
    r.session_id = @@SPID

For values of @queries at or below 15, the [query_plan] column will return a valid XML query plan. For values at or above 16, the [query_plan] column will return NULL. In most cases Who is Active uses the sys.dm_exec_text_query plan function rather than the sys.dm_exec_query_plan function. The text version does actually return a plan in this case, but in text format rather than XML format. This will not be returned by Who is Active, because the procedure always sends back the XML version of the plan, especially given that the XML can be rendered directly to a graphical format in SQL Server 2008. To see the text plan returned by sys.dm_exec_text_query_plan, it must be saved to a .SQLPLAN file, just like when viewing XML query plans in SQL Server 2005.

Today’s task: Prove that my final note above is true. Create a read-only query using the AdventureWorks database that causes Who is Active’s [tempdb_allocations] column to report the largest amount of activity that you can.

Published Tuesday, April 12, 2011 11:00 AM by Adam Machanic

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Adam Machanic said:

This post is part 13 of a 30-part series about the Who is Active stored procedure. A new post will run

April 13, 2011 10:44 AM
 

Adam Machanic said:

This post is part 30 of a 30-part series about the Who is Active stored procedure. A new post will run

April 30, 2011 11:45 AM

Leave a Comment

(required) 
(required) 
Submit

About Adam Machanic

Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development, and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at conferences and training events on a variety of SQL Server topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, a Microsoft Certified IT Professional (MCITP), and an alumnus of the INETA North American Speakers Bureau.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement