THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Kevin Kline

Understanding SQLIOSIM Output

One of the things that’s great about my job is the fact that I get to meet people who are a lot smarter than me.  One such person I count among my friends is Jimmy May, a Microsoftee from the Indianapolis area.  Jimmy has been using SQLIOSIM for a while, but has discovered that its reporting leaves a bit to be desired.  Jimmy and I were discussing the reporting characteristics of SQLIOSIM and he thought it’d be useful to share his findings with the public.  Thanks for sharing, Jimmy!

 

~~~~~

 

One thing which threw me off at first is the all-important “Target I/O” parameter.  As I understand it—be clear that I’m paraphrasing here—SQLIOSim tries to hammer the system trying to get to that threshold.  In fact, a well-tuned system with lots of capacity will resist high values.  Note my comment below loosely comparing it to disk latency.

 

 

SQLIOSim Sources

KB: http://support.microsoft.com/kb/231619/en-us

SQL Server Storage Engine blog: http://blogs.msdn.com/sqlserverstorageengine/archive/2006/10/06/SQLIOSim-available-for-download.aspx

PSS SQL Server Engineers:  http://blogs.msdn.com/psssql/archive/2007/04/10/sqliosim-outstanding-i-o-warnings.aspx

 

SQLIOSim Amalgamation of Undocumented Info

Characterization of Output:

Amalgamated from Bob Dorr & George Reynya

 

Target IO Duration (ms)

A threshold--SQLIOSim tries to hammer the system yet tries to stay under that threshold

If the average I/O duration exceeds the target I/O duration, the SQLIOSim utility throttles the number of outstanding I/O operations to decrease the load and to improve I/O completion time.

Pasted from <http://support.microsoft.com/kb/231619>

 

Running Average IO Duration (ms)

Low values are good

[Bob Dorr] Characterized as "Response time for the stress test"

[Bob Dorr] Target IO is 100ms; since ideal I/O is 4-8ms, anything higher is not a preferred outcome

[JWM] Though strictly speaking this isn't "disk latency" (Avg. Disk sec/[Read|Write|Transfer], acceptable  values are similar:

Data files:  0-5ms or so are excellent

Log files:  0-2ms are acceptable

Warnings for I/Os exceeding 15s (à la SP4 diags)

Algorithm is designed to mitigate big swings, esp. because of one very large I/O (similar to SQL Server Checkpoint logic)

Used to control the number of outstanding I/O requests

 

IO request blocks

High numbers are good

The more requests a system can handle, the more I/Os are issued by the tool

Characterized as "Concurrent IOs"

Capped by MaxIO; not uncommon to have 100 or more, esp. for Enterprise Edition for which read-aheads & such are attempted

 

Accumulators:  Reads, Scatter Reads, Writes, Gather Writes

High numbers are good

Simple Accumulators

Each time an operation is posted, the category for the specific request type is incremented

 

Total IO Time (ms)

Desirable numbers are relative

Useful for comparing drives, especially on the same system. 

If the values for the Accumulators are similar for two drives, but one drive has significantly less Total IO Time,

then the latter is more performant because it was able to service the same I/O in less time.

Characterized as "Accumulated total duration time for the I/O in ms"

The same duration is used to track the "Running Average IO Duration (ms)"

Sum of accumulators * running average = ~Running Average IO Duration

 

Number of times IO throttled

Low numbers are better

Number of times a request was removed because the duration was exceeded

 

Additional Documentation

Additional information will be based on demand & will be added to KB article & Wiki

 

Architecture

SQLIOSim is built with lightweight UMS/SQLOS-like layer to better simulate SQL Server, incl. logical CPUs

Output

Output Example #1:

Data

********** Final Summary for file S:\sqliosim.mdx **********

Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 17, Number of times IO throttled = 9006, IO request blocks = 33

Reads = 75414, Scatter Reads = 88286, Writes = 3295, Gather Writes = 54845, Total IO Time (ms) = 44936446

********** Final Summary for file T:\sqliosim.mdx **********

Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 0, Number of times IO throttled = 4135, IO request blocks = 106

Reads = 67091, Scatter Reads = 81249, Writes = 2501, Gather Writes = 53652, Total IO Time (ms) = 13033268

Interpretation

[George Reynya] The running average of 17ms for S: does not look all that bad as a target as it means that over the life of the test you maintained 17ms response time for the stress test.  Your test shows that the number of concurrent IOs has dropped to 33 which is not so good.  Especially when you compare the same basic work load against the T: drive--0ms w/ 106 I/O requests.   I would look at the S drive IO path a bit to determine the differences.

[JWM] Note:  The trial lasted 300s.  The interpretation is consistent with physical layout:  Disks were SAS DAS 73GB 15K RAID 10; S: was comprised of 6 disks, T: of 8 disks.

[JWM] Note:  George's comment regarding 17ms vs. 33ms was relative; in fact, a neither value would not be acceptable to me—I’m shooting for <5ms for my data drives.

Output Example #2:

Data

Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 121, Number of times IO throttled = 1167, IO request blocks = 13

Interpretation

[George Reynya]  Customer disk system exceeded Target IO Duration by 21ms for this file and had to throttle 1167 times in an attempt to keep the IO within the target.    With the number of IOs we do for a file this is a bit concerning.  We tried to throttle several times and they could not keep the average below the 100ms target.

Output Example #3:

Data

04/27/07 14:24:55 ********** Final Summary for file M:\sqliosim.mdx ********** 

04/27/07 14:24:55 Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 15, Number of times IO throttled = 382, IO request blocks = 108

04/27/07 14:24:55 Reads = 187508, Scatter Reads = 259268, Writes = 15286, Gather Writes = 130498, Total IO Time (ms) = 5460993

04/27/07 14:24:55 DRIVE LEVEL: Sector size = 512, Cylinders = 163177, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

04/27/07 14:24:55 DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No

04/27/07 14:24:55 DRIVE LEVEL: Read count = 886653, Read time = 3333820, Write count = 616951, Write time = 6606954, Idle time = 170616, Bytes read = 74679347200, Bytes written = 74403336192, Split IO Count = 19, Storage number = 4, Storage manager name = FTDISK 

04/27/07 14:24:55 ********** Final Summary for file M:\sqliosim2.mdx **********

04/27/07 14:24:55 File Attributes: Compression = No, Encryption = No, Sparse = No

04/27/07 14:24:55 Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 16, Number of times IO throttled = 762, IO request blocks = 44

04/27/07 14:24:55 Reads = 183061, Scatter Reads = 256624, Writes = 15038, Gather Writes = 132071, Total IO Time (ms) = 5890665

04/27/07 14:24:55 DRIVE LEVEL: Sector size = 512, Cylinders = 163177, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

04/27/07 14:24:55 DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No

04/27/07 14:24:55 DRIVE LEVEL: Read count = 886717, Read time = 3333859, Write count = 616967, Write time = 6606962, Idle time = 170630, Bytes read = 74679609344, Bytes written = 74403405824, Split IO Count = 19, Storage number = 4, Storage manager name = FTDISK 

04/27/07 14:24:55 ********** Final Summary for file M:\sqliosim.ldx **********

04/27/07 14:24:55 File Attributes: Compression = No, Encryption = No, Sparse = No

04/27/07 14:24:55 Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 0, Number of times IO throttled = 0, IO request blocks = 8

04/27/07 14:24:55 Reads = 0, Scatter Reads = 0, Writes = 316141, Gather Writes = 0, Total IO Time (ms) = 1346919

04/27/07 14:24:55 DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No

04/27/07 14:24:55 DRIVE LEVEL: Read count = 886653, Read time = 3333820, Write count = 616962, Write time = 6606959, Idle time = 170623, Bytes read = 74679347200, Bytes written = 74403381248, Split IO Count = 19, Storage number = 4, Storage manager name = FTDISK 

Stress Test Attempt Complete

Interpretation

[JWM]  This test was done against what was thought by the customer to be a well-architected disk I/O subsystem (EMC Clariion CX700, lots of 146GB 15K disks, RAID 5 (RAID5--when will  they learn?)).  The system is intended to support a system from which very high performance is required.

The results of this test belie this assumption.  I would want Average IO Duration to be less than 5ms, no more than 10ms; however both data files have values exceeding this:  15ms & 16ms, respectively.

The IO Request Blocks aren't nearly as high as I'd expect, especially compared to the 8 disk RAID 10 drive T: in Output Example #1.

The far less respectable values for M:\sqliosim2.mdx compared to the first file will prompt further investigation.

Note also that I/O was throttled 382 & 762 times, respectively.

The Average IO Duration for the log file is excellent.  I don't have enough experience interpreting output for log files, so I don't know whether to be concerned about the log number of IO Request Blocks.

 

 

Published Thursday, June 28, 2007 11:30 AM by KKline
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Linchi Shea said:

For stress testing, it is fine to use SQLIOSIM to see whether the disk subsystem may crap out under load. But here we seem to be talking about getting some type of I/O benchmarking data out of the SQLIOSIM output. For I/O benchmarks, wouldn't it be better to use a tool such as IOMeter or sqlio which allow you to control the I/O parameters better? By I/O parameters, I mean reads vs. writes, sequential vs. random, block size, queue depth, and so on.

June 30, 2007 11:16 AM
 

Jimmy May said:

I agree, Linchi.  Kevin & I corresponded several times before he posted this.  Unfortunately, the following statement did not make it into the posted version:

SQLIOSim:  What it's for—& what it's not for

SQLIOSim is a correctness and stress tool, not a performance measurement tool.  Use SQLIOSim to verify your I/O subsystem is functioning correctly under heavy loads.  If you want to measure throughput use the SQLIO utility.

-- SQL Server Storage Engine blog

July 9, 2007 7:32 PM
 

KKline said:

Great feedback, guys!  Thanks,

-Kev

July 13, 2007 1:09 PM
 

Christian Bolton's SQL Server Blog said:

Modifying the “HBA Queue Depth” is a performance tuning tip for servers that are connected to Storage

January 12, 2009 6:21 AM
 

john said:

Is this 15 second latency warning settable?   Can I alter the 15 second number?  I want to find out the worst case latency after a run.

Thanks.

February 19, 2009 9:25 AM
 

KKline said:

Good stuff, Jose.  Thanks for posting.  -Kevin

June 2, 2009 11:36 PM
 

SteveB said:

Hi Kevin.

I am running sqliosim and see something weird.

I have a 12 cpu server.

When I run sqliosim for same data and log file, but set cpucount to either 1 or 12 I see the following.

CPUCOUNT = 12

11/13/09 08:55:22 3488 Display Monitor ********** Final Summary for file C:\GPSSQL\VOLUMES\GPSSQL_DATA_1\sqldata\gpssql_master01.mdf ********** CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 517

11/13/09 08:55:22 3488 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 521

11/13/09 08:55:22 3488 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 138, Number of times IO throttled = 24519, IO request blocks = 34 CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 529

11/13/09 08:55:22 3488 Display Monitor Reads = 48527, Scatter Reads = 66002, Writes = 3593, Gather Writes = 127968, Total IO Time (ms) = 306272001 CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 540

11/13/09 08:55:22 3488 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 35539, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255 CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 562

11/13/09 08:55:22 3488 Display Monitor DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 576

11/13/09 08:55:22 3488 Display Monitor DRIVE LEVEL: Read count = 114529, Read time = 5556694, Write count = 131581, Write time = 300678040, Idle time = 10818, Bytes read = 32390324224, Bytes written = 30083240448, Split IO Count = 0, Storage number = 1, Storage manager name = FTDISK  CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 587

11/13/09 08:55:22 3488 Display Monitor Closing file C:\GPSSQL\VOLUMES\GPSSQL_LOG\sqllog\gpssql_log_01.ldf

CPUCOUNT = 1

11/13/09 08:38:35 2608 Display Monitor ********** Final Summary for file C:\GPSSQL\VOLUMES\GPSSQL_DATA_1\sqldata\gpssql_master01.mdf ********** CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 517

11/13/09 08:38:35 2608 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 521

11/13/09 08:38:35 2608 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 0, Number of times IO throttled = 22210, IO request blocks = 741 CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 529

11/13/09 08:38:35 2608 Display Monitor Reads = 47168, Scatter Reads = 66640, Writes = 3459, Gather Writes = 130075, Total IO Time (ms) = 243048258 CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 540

11/13/09 08:38:35 2608 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 35539, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255 CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 562

11/13/09 08:38:35 2608 Display Monitor DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 576

11/13/09 08:38:35 2608 Display Monitor DRIVE LEVEL: Read count = 113808, Read time = 3995227, Write count = 133557, Write time = 239080572, Idle time = 11942, Bytes read = 32823025664, Bytes written = 30540161536, Split IO Count = 0, Storage number = 1, Storage manager name = FTDISK  CLogicalFile::OutputSummary e:\yukon\sosbranch\sql\ntdbms\storeng\util\sqliosim\fileio.cpp 587

11/13/09 08:38:35 2608 Display Monitor Closing file C:\GPSSQL\VOLUMES\GPSSQL_LOG\sqllog\gpssql_log_01.ldf

November 13, 2009 8:09 AM
 

KKline said:

Not sure what you want me to diagnose, SteveB.  I see that the second test was shorter using 1 CPU that was the first test using 12 CPUs.  Is that what your question is about?

Honestly, not sure what might be causing this without a good understanding of what's going on with your server.  For example, if you'd scheduled these to run in the evening, a backup might've interfered with one or the other.  Also, you might want to check in with the true masters of this tool, the CSS team at Microsoft: http://blogs.msdn.com/psssql/archive/2008/03/05/how-it-works-sqliosim-checksums.aspx.

Hope this helps,

-Kevin

November 18, 2009 7:19 PM
 

SteveB said:

Thanks Kevin

This is a brand new server and nothing is installed at all. I wanted to stress the disks we mapped out to see how simulated SQL IO was like.

We created 6 * 4 disk R10  (15k) using 3 DAS arrays.

CPU is 2 * 6 Core AMD.  LUN set to WB-NORA on Data and WT-NORA on log.

The example above is:-

I am creating a data file on one of those 4 disk R10 LUN's and then testing using 1 or 12 CPUCount.

My concern is the high values returning when 12 CPU's specified.(given we have 12 CPU's, and knowing that we will have 1 thread per OS scheduler by creating multiple files .25 per core ( 1-1 for tempdb), I am concerned by what I am seeing.

Running Average IO Duration (ms) = 138, Number of times IO throttled = 24519, IO request blocks = 34

To me given we are looking for < 10ms for data this seems v bad.

Same data file with CPUCount =1  returns much better results

Running Average IO Duration (ms) = 0, Number of times IO throttled = 22210, IO request blocks = 741

I guess it would be good to know what your reading is of the output in terms of the Average IO duration,  and also what your thoughts are re the CPUCount and how that is causing huge increase in latenty?

November 19, 2009 11:39 AM
 

Jimmy May said:

@SteveB:  FYI SQLIOSim is a correctness tool, not a perf tool.  Consider using a battery of tests using SQLIO to test perf.  You must independently vary a variety of parameters.  To learn more about SQLIOSim, see this collection of references (& a tool you might enjoy):

SQLIOSim Parser by Jens Suessmeyer & Yours Truly

http://blogs.msdn.com/jimmymay/archive/2009/09/27/sqliosim-parser-by-jens-suessmeyer-yours-truly.aspx

November 19, 2009 11:48 AM
 

KKline said:

Thank you, Jimmy, you are the cavalry that came over the hill!

I was just about to direct SteveB to your excellent blog post. Especially since I can't give him a definitive answer on what to try next except that SQLIOSim isn't a true perf test tool.

-Kev

November 19, 2009 1:08 PM
 

SteveB said:

Thanks Guys

I understood this tool to offer both given the above feedback to various examples.

I understand it is a great tool to find out if there are any I/O issues (stale reads etc), BUT, based on the threads above, it also suggests it is a tool for checking performance to see if it is good or not.

As per example 3 interpretations in your entry,  it is reviewing avg duration i/o and saying that the numbers are terrible for what is expected to be an OLTP system?

I am trying to draw similar interpretations with my output.

Based on your feedback that is wrong thing to do??

I am aslo running sqlio tests to get a perf benchmark. However that is not specfically SQL Server IO, which sqliosim offers.

I am confused as to what the interpretations offer users of this tool now.

Thanks

November 21, 2009 8:23 AM
 

Jimmy May said:

@SteveB, I am sympathetic to your confusion.  Bob Dorr has published a variety of posts with guidance, but it's not as comprehensive as we'd like.  (I'm grateful that he's published as much as he has.)  I've not published an amalgamation, but Bob *has* published at least two (three?) posts in which he explicitly says so, including this one:

SQLIOSim is "NOT" an I/O Performance Tuning Tool

http://blogs.msdn.com/psssql/archive/2008/04/05/sqliosim-is-not-an-i-o-performance-tuning-tool.aspx

You're right that it's easy to be mislead into using SQLIOSim as a perf tool. A lot of folks do it, but please don't be one of them.  I *do* use it in a relative way as a high-level sanity check.  But its purpose is to stress test the integrity of your disk I/O subsystem.  To do so, your tests should last hours (or even days), & you should test a variety of parameters.  

SQLIO or IOMeter are the tools you need to use to benchmark perf.  In addition, xperf is coming of age--a tool about which I am *very* excited.

With regard to your tests, my guess is that the 12-CPU tests resulted in requests with which the heads were unable to keep up.  I'll let you know if I'm able to come up with something more compelling.

Jimmy

November 21, 2009 10:23 AM
 

KKline said:

SteveB, you might wish to try out Adam Machanic's tool called SQLQueryTest at http://www.datamanipulation.net/sqlquerystress/.  He hasn't updated it in a while, but it worked pretty well the last time I looked at it 18+ months ago.

If you have a little budget, then you should consider Benchmark Factory (full disclosure: Benchmark Factory is a product of my employer, Quest Software, though not a product from my business unit). This tools allows you to run full TPC (such as TPC-C and H, from http://www.tpc.org) benchmarks and even your own workloads scaling up to thousands of concurrent users.  You can download it at http://www.quest.com/benchmark-factory/.

If you have no budget, you can still have 2 choices with BMF - 1) use BMF for free during it's regular 15-day trial, or 2) send me a private email for an extended 6-month trial license.

Hope this helps,

-Kevin

November 23, 2009 1:51 PM
 

Kevin Kline said:

Before I jump onto the Goals and Themeword meme started by my buddy, Thomas LaRock ( blog | twitter ),

January 5, 2010 9:43 PM
 

John Langston said:

Upon seeing stalled IO warnings in the SQLIOSim output, would anyone here give the "bet your job/life" warranty that so many desire, on a prospective database server?

January 29, 2010 11:21 AM
 

Pavel said:

We are testing 3 SAN system. NetApp iSCSI, EMC CX3 and IBM DS4700.

Down are results for SQLIOSIM woth 50GB on RAID5 (data) and 20GB on RAID1(logs).

As you see, DS4700 has excelent average IO duration results with data (8ms) and logs(oms), NetApp has terrible results data(166ms) logs (9ms) and EMC2 has still bad results for data - data (121ms) and logs (0ms).

We were checking disk alignement and NetApp and EMC2 were badly configured. With correction we have much better average IO time (around half of this) but still 6-8times higher then on IBM DS4700.

NetApp is 4 years old so the results could be slower, but EMC2 is almost new. Yes, IBM has 15k diks while emc2 10k, but this could be 20-30%, not 600%!?

So what could be wrong configured on EMC that results are so bad (we run SQLIOSIM with the same file size on PC with 2SATA disks in RAID1 and we get still a little better average IO then EMC2 with alignement fixed.

Any idea what co check on EMC2?

IBM DS4700

03/19/09 21:08:04 2548 Display Monitor ********** Final Summary for file E:\Baza.mdf **********

03/19/09 21:08:04 2548 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No

03/19/09 21:08:04 2548 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 8, Number of times IO throttled = 490, IO request blocks = 4775

03/19/09 21:08:04 2548 Display Monitor Reads = 110166, Scatter Reads = 221056, Writes = 6663, Gather Writes = 264589, Total IO Time (ms) = 333331805

03/19/09 21:08:04 2548 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 26108, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

03/19/09 21:08:04 2548 Display Monitor DRIVE LEVEL: Read cache enabled = No, Write cache enabled = Yes

03/19/09 21:08:04 2548 Display Monitor DRIVE LEVEL: Read count = 0, Read time = 0, Write count = 0, Write time = 0, Idle time = 0, Bytes read = 0, Bytes written = 0, Split IO Count = 0, Storage number = 0, Storage manager name =

03/19/09 21:08:04 2548 Display Monitor Closing file F:\Log.ldf

03/19/09 21:08:04 2548 Display Monitor ********** Final Summary for file F:\Log.ldf **********

03/19/09 21:08:04 2548 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No

03/19/09 21:08:04 2548 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 0, Number of times IO throttled = 14, IO request blocks = 8

03/19/09 21:08:04 2548 Display Monitor Reads = 0, Scatter Reads = 0, Writes = 106591, Gather Writes = 0, Total IO Time (ms) = 331275

03/19/09 21:08:04 2548 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 10443, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

03/19/09 21:08:04 2548 Display Monitor DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = Yes

03/19/09 21:08:04 2548 Display Monitor DRIVE LEVEL: Read count = 0, Read time = 0, Write count = 0, Write time = 0, Idle time = 0, Bytes read = 0, Bytes written = 0, Split IO Count = 0, Storage number = 0, Storage manager name =

03/19/09 21:08:04 2548 Display Monitor Cleaning up buffer pool

03/19/09 21:08:34 2548 Display Monitor Buffer Pool: validated buffers 856850, pages 7320537, discarded buffers 7256531

03/19/09 21:08:34 860 System Simulator Stress Test Attempt Complete

iSCSI (NetAPP)

03/19/09 10:52:45 4832 Display Monitor ********** Final Summary for file V:\sqliosim.mdx **********

03/19/09 10:52:45 4832 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No

03/19/09 10:52:45 4832 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 166, Number of times IO throttled = 27565, IO request blocks = 2

03/19/09 10:52:45 4832 Display Monitor Reads = 32491, Scatter Reads = 38136, Writes = 1995, Gather Writes = 64065, Total IO Time (ms) = 1156462388

03/19/09 10:52:45 4832 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 13056, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

03/19/09 10:52:45 4832 Display Monitor DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No

03/19/09 10:52:45 4832 Display Monitor DRIVE LEVEL: Read count = 70677, Read time = 6191643, Write count = 66353, Write time = 1150588875, Idle time = 14546, Bytes read = 17325072384, Bytes written = 15595192832, Split IO Count = 10, Storage number = 4, Storage manager name = FTDISK

03/19/09 10:52:45 4832 Display Monitor Closing file W:\sqliosim.ldx

03/19/09 10:52:45 4832 Display Monitor ********** Final Summary for file W:\sqliosim.ldx **********

03/19/09 10:52:45 4832 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No

03/19/09 10:52:45 4832 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 9, Number of times IO throttled = 19, IO request blocks = 8

03/19/09 10:52:45 4832 Display Monitor Reads = 0, Scatter Reads = 0, Writes = 31584, Gather Writes = 0, Total IO Time (ms) = 154689

03/19/09 10:52:45 4832 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 3917, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

03/19/09 10:52:45 4832 Display Monitor DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No

03/19/09 10:52:45 4832 Display Monitor DRIVE LEVEL: Read count = 0, Read time = 0, Write count = 31601, Write time = 242183, Idle time = 620082, Bytes read = 0, Bytes written = 1771209216, Split IO Count = 0, Storage number = 3, Storage manager name = FTDISK

03/19/09 10:52:45 4832 Display Monitor Cleaning up buffer pool

03/19/09 10:52:47 4832 Display Monitor Buffer Pool: validated buffers 1980958, pages 0, discarded buffers 0

03/19/09 10:52:47 5896 System Simulator Stress Test Attempt Complete

EMC2

1012 Display Monitor ********** Final Summary for file X:\sqliosim.mdx ********** CLogicalFile::OutputSummary

21:46:09 1012 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No

21:46:09 1012 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 122, Number of times IO throttled = 47606, IO request blocks = 18

21:46:09 1012 Display Monitor Reads = 80237, Scatter Reads = 187237, Writes = 4161, Gather Writes = 304754, Total IO Time (ms) = 1844102854

21:46:09 1012 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 58743, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

21:46:09 1012 Display Monitor DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No

21:46:09 1012 Display Monitor DRIVE LEVEL: Read count = 267719, Read time = 17762911, Write count = 361585, Write time = 1613870352, Idle time = 772594, Bytes read = 66358169600, Bytes written = 118023178240, Split IO Count = 0, Storage number = 3, Storage manager name = FTDISK

21:46:09 1012 Display Monitor Closing file Y:\sqliosim.ldx

21:46:09 1012 Display Monitor ********** Final Summary for file Y:\sqliosim.ldx **********

21:46:09 1012 Display Monitor File Attributes: Compression = No, Encryption = No, Sparse = No

21:46:09 1012 Display Monitor Target IO Duration (ms) = 100, Running Average IO Duration (ms) = 0, Number of times IO throttled = 0, IO request blocks = 8

21:46:09 1012 Display Monitor Reads = 0, Scatter Reads = 0, Writes = 81249, Gather Writes = 20000, Total IO Time (ms) = 578665

21:46:09 1012 Display Monitor DRIVE LEVEL: Sector size = 512, Cylinders = 13054, Media type = 12, Sectors per track = 63, Tracks per Cylinders = 255

21:46:09 1012 Display Monitor DRIVE LEVEL: Read cache enabled = Yes, Write cache enabled = No

21:46:09 1012 Display Monitor DRIVE LEVEL: Read count = 158, Read time = 130, Write count = 122149, Write time = 783832, Idle time = 2146073, Bytes read = 647168, Bytes written = 25885911040, Split IO Count = 0, Storage number = 4, Storage manager name = FTDISK

21:46:09 1012 Display Monitor Cleaning up buffer pool

21:51:06 1012 Display Monitor Buffer Pool: validated buffers 1066873, pages 6833766, discarded buffers 6775919

21:51:07 2460 System Simulator Stress Test Attempt Complete

March 9, 2010 10:34 AM
 

KKline said:

Hi Pavel,

The first thing I noticed is that the IBM has write caching enabled, while the other two arrays do not.  Write caching has a huge impact on overall IO performance, naturally, specific for any write IO operations.

You'd mentioned that the IBM had properly aligned disk sectors and that you corrected the misaligned disk sectors on the NetApp and EMC arrays.  Were there any other major subsystem differences?  For example, did they have significantly different amounts of cache than the IBM system?  If not, was the cache balanced in the same ratio between reads and writes?

Other things to check include the LUN chunk size (should be 64k on all systems), the number of spindles, and as you have already done the disk RPM speed.

Let us know what you find out.  Thanks!

-Kevin

March 25, 2010 8:29 AM
 

Jimmy May, Aspiring Geek said:

Kevin nailed the importance of write caching for SQL Server performance.

It is NOT appropriate to use SQLIOSim as a performance tool.  Here are posts from the author of the tool explaining why:

SQLIOSim is "NOT" an I/O Performance Tuning Tool

http://blogs.msdn.com/psssql/archive/2008/04/05/sqliosim-is-not-an-i-o-performance-tuning-tool.aspx

This blog post is to again inform you "NOT" to consider SQLIOSim as an I/O performance evaluation tool!   Every other month or so I get a rash of questions about using SQLIOSim to validate I/O performance and I have to repeat the same information.

Don't consider SQLIOSim for performance benchmarking...

http://www.microsoft.com/downloads/details.aspx?familyid=9A8B005B-84E4-4F24-8D65-CB53442D9E19

It has been my experience that NetApp systems are not compatible with high performance & scalability (especially for active databases requiring maintenance leveraging NetApp NetApp snapshots).  I'm open to learning of tests which contradict my experience.

March 27, 2010 6:38 PM
 

KKline said:

Disks may indeed cause corruptions, but you'll rarely use SQLIOSIM to determine that.  Here's an article where the writer successfully used SQLIOSIM for that purpose: http://sqlserverpedia.com/wiki/Using_SQLIOSIM_to_isolate_DB_corruption.  

Typically, however, you will look for errors in the SQL Server (and very likely the Windows) Error Logs for disk IO errors and corruption errors.  For troubleshooting, you'll use DMVs and possibly PerfMon counters.

June 4, 2010 4:11 PM
 

Anonymous said:

It is very nice article with lots of infomation. Also it's very clear that SQLIOSIM is used to measure the new server trace and not the performance.

Now here my question is if I am getting "Running Average IO Duration (ms) = 138" compare to ideal 5-6 ms. How to bring down the "Running Average IO Duration"? What are the steps we can take to improve it?

As many of the above posted results also show Running Average IO Duration" > "Target IO Duration"

Any luck?

July 1, 2010 6:47 AM
 

frant101 said:

Although SQLIOSIM is running and return results it is also returning the error

Unable to get disk cache info for C:\

What can i do to resolve this

Thanks

Frant101

October 25, 2010 11:04 AM
 

BigDaddy said:

All,

Another note if you are looking at Latency would be to run direct to the array vs switch attached.  If you are running off the switch you are adding latency. JWM keeps looking for <5ms than I would direct attach you connections.  In most cases the switch will add that if not more latency.  Also look at your Multipathing software.  What algorithm are you using?  Round Robin is dead when matters of performance are an issue.  How many HBA's and how many LUNs are you communicating to?  What is the Queue depth on the HBA's?

Best wishes and remember design is never host only.

February 10, 2011 2:30 PM
 

krish said:

Hey Kevin,

Can I use SQLIOSIm to test data integrity? I have been reading around and figured that it is not the best tool to test data integrity. Is it true? If so, is there a way I can check integrity of the data during a long duration Stress test?

July 1, 2011 3:03 AM
 

KKline said:

That's correct, Krish. SQLIOSim does not create real data.  It simulates a SQL Server workload, but not real transactions inside of SQL Server.

July 1, 2011 2:03 PM
 

krish said:

Thanks for your response, Kevin. Can you think of any recommendations/tools to perform data integrity checks in a SQL db?

July 2, 2011 3:14 AM
 

Kevin Kline said:

Remote Desktop Services Component Architecture Poster Grab your own poster! A visual guide to key Remote

July 20, 2011 10:47 AM
 

ML said:

Hi Kevin,

I was wondering if you can help me claify: Number of times IO throttled indicator.

The description you have is: Number of times a request was removed because the duration was exceeded

I would like to know where this duration number is coming from and if there was a way for me to tweak the duration?  Is it somewhere in the config files that came w/ sqliosim?

Thanks.

September 21, 2011 5:54 PM
 

KKline said:

Hi ML,

I take it you saw the formula for "Number of times IO throttled" indicator?  It is   if (m_dwRunningAvgDuration > TargetIODuration && m_lRequestNodes > MinIOs) //MinIos is 1   .  

Make sure to read this one too: http://blogs.msdn.com/b/psssql/archive/2008/11/12/how-it-works-sqliosim-running-average-target-duration-discarded-buffers.aspx.

I hope this helps,

-Kevin

September 27, 2011 11:35 AM
 

Callum said:

Hi Kevin,

Would you use the results from SQLIOSim or SQL IO for tuning HBA Queue Depths ?

Cheers

Callum

September 29, 2011 7:49 AM
 

KKline said:

Hi Callum,

As Jimmy May of SQLCAT says earlier, SQLIOSIM should not be considered an IO performance tuning tool.  Read more at: http://blogs.msdn.com/psssql/archive/2008/04/05/sqliosim-is-not-an-i-o-performance-tuning-tool.aspx.

If it's not an IO performance tuning tool, then what is it?  It is an IO STRESS TEST tool, meaning it is intended to show where data problems might come up, not to show total throughput or performance.

October 5, 2011 5:07 PM
 

Orion Pax said:

Hi, we recently purchased a new SQL server and have two disk configurations in mind. Obviously the only way to determine which one would be better is to perform the same stress/load test on both configurations. Would SQLIOSim be the right tool for this?

April 24, 2012 3:47 PM
 

mk said:

hi, kevin

understanding  SQLIOSIm is an IO STRESS TEST tool, meaning it is intended to show where data problems might come up.

so if there is some data problem, the tool will indicate the problem in the result ? or we need to check the event log or perf counter for SQL error ourself ?

June 3, 2012 11:48 PM
 

KKline said:

Hi MK,

SQLIOSim is only a benchmarking tool.  You'll have to use the built-in instrumentation to determine if there are errors occurring, such as the Windows Error and Application Logs.

June 4, 2012 1:49 PM
 

EdCarden said:

Am I the only one not able to find SQLIO?  ALl of the links to download SQLIO from Microosft return a "The Page You are looking for is gone" msg and I can't find any info anywhere indicating that the SQLIO tool had been puled from use or what but its clear that everyone expects it to be at Microsofts website but its not there.

Kevin, do you know anything about what happened to SQLIO?  I sure do need to use it.

August 16, 2012 6:22 PM
 

Jimmy May @aspiringgeek said:

Ed, follow @aspiringgeek & I'll contact you.

August 16, 2012 8:53 PM
 

Jimmy D said:

Any ideas what would cause  Number of times IO throttled = 0 on all drives? The IO Request Blocks are all around 45 or so, and the Average IO Request duration is 1-5ms, but IO throttled repeatedly comes back as all 0's.

April 24, 2014 11:43 AM

Leave a Comment

(required) 
(required) 
Submit

About KKline

Kevin Kline is a well-known database industry expert, author, and speaker. Kevin is a long-time Microsoft MVP and was one of the founders of PASS, www.sqlpass.org.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement