|
|
|
|
-
I only recently discovered that SSMS will connect to different things. For instance, press the "New query" button. What were you connected to? The answer is the same server as your "current" server. But what is the current server? It is the server where you happened to have focus when the pressed the "New query" button. So, can you say whether you had focus in a query window, Object Exporer or Registered Servers?
This also applies to when you double-click a .sql file. And it doesn't stop there. Open the "Registered Servers" window. Now, click on a server group. Go to explorer and double-click a .sql file. What were you connected to? Yes, all the servers in that group. Now, don't get me wrong here; the ability to open the same query window against several servers can be a very useful thing. What I had no idea until just about now is how easily thsi can happen by mistake. Just by cklicking the New Query window, or even double-clicking an .sql file. So - be aware...
(FYI: SSMS 2005 doesn't seem to do this for clicking a file in explorer, and the functionality to have a query window against several server didn't exist in SSMS 2005...)
|
-
The story usually goes something like:
Q - How can I restore only this table? A - Put it on its own filegroup and you can do filegroup level backup and restore.
The problem with above answer is that it most likely misses the point. We need to ask ourselves: Why do you want to do a table level restore?
The answer to the question is very often that the table need to be reverted to an earlier point in time, possibly because some accident happened; like deleting all rows in the table by mistake. (See my minimizing data loss when accidents happens article for a more general discussion.) So, why is not filegroup backup that usable for this scenario?
SQL Server will not let you into a database where different data is from different points in time! (2005+, Enterprise Edition and Developer Edition, has online restore which allow you into the database but you wont be able to access the restored data until you make it current - so it doesn't really changes the basic issue here.)
Now, think about above. If we restore the filegroup backup containing the emptied table, but then need to restore all subsequent log backups up to "now", what good did this song-and-dance-act do us? No good at all (except for a learning experience, of course). We can of course restore the primary filegroup and the one with the damaged data into a new temp database - to the desired earlier point in time, and then copy the relevant data from this temp database into the production database. But this operation is certainly not as straight forward as just restoring the filegroup backup into the production/source database.
Now, about having data from different point in time (regardless of how you achieve it): Handle with care. Just think about relationship and dependencies you have inside a database. Reverting some table to an earlier point in time can cause havoc for those dependencies.
I won't get into details about how filegroup backups work, online restore, the PARTIAL option of the restore command etc. - you can read about all that in Books Online. The point about this blog is to have somewhere I can point to when I see the "put-the-table-on-its-own-filegroup-and-backup-that-filegroup" recommendation.
As usual, I have a TSQL script to display my points. If you happen to think that it *is* possible to restore part of the database to an earlier point in time into the production/source database - I'm all ears. You can post a comment here, I will be notified. Please use below script as a template, and modify so that we can execute it and re-execute it. The usual disclaimer is to not execute below if you don't understand what it is doing, etc.
--Drop and create the database USE master IF DB_ID('fgr') IS NOT NULL DROP DATABASE fgr GO --Three filegroups CREATE DATABASE fgr ON PRIMARY ( NAME = N'fgr', FILENAME = 'C:\fgr.mdf'), FILEGROUP fg1 ( NAME = N'fg1', FILENAME = 'C:\fg1.ndf'), FILEGROUP fg2 ( NAME = N'fg2', FILENAME = 'C:\fg2.ndf') LOG ON ( NAME = N'fgr_log', FILENAME = 'C:\fgr_log.ldf') GO ALTER DATABASE fgr SET RECOVERY FULL
--Base backup BACKUP DATABASE fgr TO DISK = 'C:\fgr.bak' WITH INIT GO
--One table on each filegroup CREATE TABLE fgr..t_primary(c1 INT) ON "PRIMARY" CREATE TABLE fgr..t_fg1(c1 INT) ON fg1 CREATE TABLE fgr..t_fg2(c1 INT) ON fg2
--Insert data into each table INSERT INTO fgr..t_primary(c1) VALUES(1) INSERT INTO fgr..t_fg1(c1) VALUES(1) INSERT INTO fgr..t_fg2(c1) VALUES(1)
BACKUP LOG fgr TO DISK = 'c:\fgr.trn' WITH INIT --1
--Filegroup backup of fg2 BACKUP DATABASE fgr FILEGROUP = 'fg2' TO DISK = 'C:\fgr_fg2.bak' WITH INIT
BACKUP LOG fgr TO DISK = 'c:\fgr.trn' WITH NOINIT --2
--Delete from t_fg2 --Ths is our accident which we want to rollback!!! DELETE FROM fgr..t_fg2
BACKUP LOG fgr TO DISK = 'c:\fgr.trn' WITH NOINIT --3
--Now, try to restore that filegroup to previos point in time RESTORE DATABASE fgr FILEGROUP = 'fg2' FROM DISK = 'C:\fgr_fg2.bak' GO
SELECT * FROM fgr..t_fg2 --error 8653 GO
--If we are on 2005+ and EE or Dev Ed, the restore can be online --This means that rest of the database is accessible during the restore INSERT INTO fgr..t_fg1(c1) VALUES(2) SELECT * FROM fgr..t_fg1
--We must restore *all* log backups since that db backup RESTORE LOG fgr FROM DISK = 'c:\fgr.trn' WITH FILE = 2 --out of 3 RESTORE LOG fgr FROM DISK = 'c:\fgr.trn' WITH FILE = 3 --out of 3 GO
SELECT * FROM fgr..t_fg2 --Success --We didn't get to the data before the accidental DELETE! GO
---------------------------------------------------------------------------- --What we can do is restore into a new database instead, --to an earlier point in time. --We need the PRIMARY filegroup and whatever more we want to access ---------------------------------------------------------------------------- IF DB_ID('fgr_tmp') IS NOT NULL DROP DATABASE fgr_tmp GO RESTORE DATABASE fgr_tmp FILEGROUP = 'PRIMARY' FROM DISK = 'C:\fgr.bak' WITH MOVE 'fgr' TO 'C:\fgr_tmp.mdf' ,MOVE 'fg2' TO 'C:\fg2_tmp.ndf' ,MOVE 'fgr_log' TO 'C:\fgr_tmp_log.ldf' ,PARTIAL, NORECOVERY
RESTORE DATABASE fgr_tmp FILEGROUP = 'fg2' FROM DISK = 'C:\fgr_fg2.bak'
RESTORE LOG fgr_tmp FROM DISK = 'c:\fgr.trn' WITH FILE = 1, NORECOVERY RESTORE LOG fgr_tmp FROM DISK = 'c:\fgr.trn' WITH FILE = 2, RECOVERY
--Now the data in PRIMARY and fg2 is accessible SELECT * FROM fgr_tmp..t_fg2
--We can use above to import to our production db: INSERT INTO fgr..t_fg2(c1) SELECT c1 FROM fgr_tmp..t_fg2
--And now the data is there again :-) SELECT * FROM fgr..t_fg2
|
-
The old sysindexes table (as of 2005 implemented as a compatibility view) has a useful column named keycnt. This is supposed to give us the number of columns (keys) in the index. However, to make heads and tails out of the numbers, we need to understand how a non-clustered index is constructed. For a heap, the pointer to a row is the physical file/page/row address (aka "rowid"). This is counted as a key in the keycnt column:
IF OBJECT_ID('t1') IS NOT NULL DROP TABLE t1 GO CREATE TABLE T1 (c1 INT, c2 datetime, c3 VARCHAR(3)) CREATE INDEX ix_T1_c1 ON T1 (c1) CREATE INDEX ix_T1_c1_c2 ON T1 (c1, c2) CREATE INDEX ix_T1_c1_c2_c3 ON T1 (c1, c2, c3) CREATE INDEX ix_T1_c2 ON T1 (c2) SELECT name, keycnt, indid, id FROM sys.sysindexes WHERE id = OBJECT_ID('T1')
For the index on column (c2), you see a keycnt of 2. This is the key in the index plus the rowid.
For a nonclustered index on a clustered table, the row locator is the clustering key. Note, though, that if the clustered index is not defined as unique (PK, UQ etc), then another "uniqueifier" key/column is added. Building on above example:
CREATE UNIQUE CLUSTERED INDEX ix_T1_c1 ON T1 (c1) WITH DROP_EXISTING SELECT name, keycnt, indid, id FROM sys.sysindexes WHERE id = OBJECT_ID('T1') GO CREATE CLUSTERED INDEX ix_T1_c1 ON T1 (c1) WITH DROP_EXISTING SELECT name, keycnt, indid, id FROM sys.sysindexes WHERE id = OBJECT_ID('T1')
Consider the (non-clustered) index on column c2. For the first one, where the table has a unique clustered index, we see a keycnt of 2, the column c2 plus the clustered key. But when we define the clustered index as non-unique, we see +1 for the keycnt column; the uniqueifier. The uniqueifier is 4 byte, and only populated for rows which are duplicates of an existing clustered key (i.e. no extra cost if no duplicates).
But we want to stay away from compatibility views, right? Since we no longer have a key count column in sys.indexes, we need to grab that from sys.index_columns. This do not, however include the internal columns, only the explicitly defined columns: SELECT i.name ,i.index_id ,(SELECT COUNT(*) FROM sys.index_columns AS ic WHERE i.OBJECT_ID = ic.OBJECT_ID AND i.index_id = ic.index_id) AS keycnt FROM sys.indexes AS i WHERE OBJECT_ID = OBJECT_ID('T1')
|
-
Say you happened to get too many transaction log (ldf) files. Can you remove log files from the database? Yes, but only if a file isn't in use, and you cannot remove the first ("primary") log file.
So, be prepared to investigate the virtual file layout, using DBCC LOGINFO, to see if a log file is in use or not. You can find information about how to investigate the virtual log file layout in my shrink article. The basic steps are a bit similar to shrinking a log file: Investigate virtual log file layout, backup log, possibly shrink file, try removing it. Do this again as many times as it takes (repeat, rinse and lather).
Below is a script that, if you take the time to study it and play with it, will prepare you to remove transaction log files from a database. As always, don't execute it if you don't understand what it does! USE master IF DB_ID('rDb') IS NOT NULL DROP DATABASE rDb GO
CREATE DATABASE rDb ON PRIMARY ( NAME = N'rDb', FILENAME = N'C:\rDb.mdf' , SIZE = 50MB , FILEGROWTH = 1024KB ) LOG ON (NAME = N'rDb_log2', FILENAME = N'C:\rDb_log2.ldf', SIZE = 3MB, FILEGROWTH = 2MB) ,(NAME = N'rDb_log3', FILENAME = N'C:\rDb_log3.ldf', SIZE = 3MB, FILEGROWTH = 2MB) ,(NAME = N'rDb_log4', FILENAME = N'C:\rDb_log4.ldf', SIZE = 3MB, FILEGROWTH = 2MB) GO
ALTER DATABASE rDb SET RECOVERY FULL BACKUP DATABASE rDb TO DISK = 'C:\rDb.bak' WITH INIT CREATE TABLE rDb..t(c1 INT IDENTITY, c2 CHAR(100))
INSERT INTO rDb..t SELECT TOP(15000) 'hello' FROM syscolumns AS a CROSS JOIN syscolumns AS b
--Log is now about 46% full DBCC SQLPERF(logspace)
--Check virtual log file layout DBCC LOGINFO(rDb) --See that file 4 isn't used at all (Status = 0 for all 4's rows)
--We can remove file 4, it isn't used ALTER DATABASE rDb REMOVE FILE rDb_log4
--Check virtual log file layout DBCC LOGINFO(rDb)
--Can't remove 3 since it is in use ALTER DATABASE rDb REMOVE FILE rDb_log3
--What if we backup log? BACKUP LOG rDb TO DISK = 'C:\rDb.bak'
--Check virtual log file layout DBCC LOGINFO(rDb) --3 is still in use (status = 2)
--Can't remove 3 since it is in use ALTER DATABASE rDb REMOVE FILE rDb_log3
--Shrink 3 USE rDb DBCC SHRINKFILE(rDb_log3) USE master
--... and backup log? BACKUP LOG rDb TO DISK = 'C:\rDb.bak'
--Check virtual log file layout DBCC LOGINFO(rDb) --3 is no longer in use
--Can now remove 3 since it is not in use ALTER DATABASE rDb REMOVE FILE rDb_log3
--Check explorer, we're down to 1 log file
--See what sys.database_files say? SELECT * FROM rDb.sys.database_files --Seems physical file is gone, but SQL Server consider the file offline
--Backup log does it: BACKUP LOG rDb TO DISK = 'C:\rDb.bak' SELECT * FROM rDb.sys.database_files
--Can never remove the first ("primary") log file ALTER DATABASE rDb REMOVE FILE rDb_log2 --Note error message from above
|
-
I see a trend towards more and more focusing on response time; and less and less on resource usage (resource consumption). I've even seen blanket statements such as the only thing that matters is response time. I do not agree. I feel that by being a good citizen and consume as few resources and possible, we contribute to the overall welfare of the system.
For instance, I'm fortunate to have some 8 km (5 miles) to my main client. I can take the car, which often is about 15 minutes or I can bicycle, which is about 20 minutes. For many reasons, I prefer to bicycle. The thing here is that I compromise a little bit and accept a few more minutes when going by bicycle, but I feel I'm a better citizen and contribute to a more sustainable society. But not only that: ever so often, the traffic is congested, and now the car takes some 40-45 minutes (bicycle still 20 minutes). By using the bicycle I both consume less resources and I also have a higher degree of predictability. Now, is this analogy appropriate to database performance? I don't know, perhaps to some extent... But let me give you a database example, from real life, followed by a TSQL example:
I have a client who had this query which used to be quick (enough) and suddenly was very slow. Been there before, we know this can be just about anything. Anyhow, it was pretty quick for me to find the reason. The query had an ORDER BY and a FAST hint. The FAST hint tells SQL Server to return rows to the client as fast as possible, but possibly with a higher overall cost. The developer who added that FAST hint didn't really think that hard about it, and just "threw it in there". It sounds good, doesn't it? There was a non-clustered index (non-covering) on the sort column and also some restrictions (WHERE clause).
With the FAST hint, SQL Server used the index on the sort column to drive the query and for each row it did a "bookmark lookup" to fetch the row. This means a page access for each row, but rows can be returned to the client application immediately (think streaming). Without the fast hint, SQL Server first sorted the relevant rows into a worktable and then returned the rows in sorted order from that worktable. So we have a tradeoff between reading a lot of pages (possibly some from cache) or doing some work up-front to sort data and then just read that sorted worktable sequentially.
The worrying part here is that with a small table, it will fit in cache and the difference between the two might not be that drastic (either way). But as table grew larger, it won't fit in cache anymore and as we see logical I/O turning into physical I/O things really go south for the query with the FAST hint. This is what happened to my client. Table grew and a query which had OK response time suddenly was a disaster. If that FAST hint wasn't there in the first place, my client wouldn't have this slowness in the application over the two weeks it took until I had time to look over it and remove the FAST hint (I also added a couple of indexes, but that is beside the point).
Seeing is believing, right? At the end of this blog post, you will find TSQL that pretty much mimics my client's case. It populates a table with 800,000 rows and there's a non-clustered index on the sort column. We then try some variations to check response time, CPU seconds usage, I/O and duration. I measured response time using TSQL (as seen in the script). I also measured response time and the other metrics using Profiler.
The size of the table (clustered index on identity column) is 133MB and the non-clustered index to on the sort column is 11MB. This is a small table, but that makes things more manageable; and by setting the max server memory to a low value (60MB), we can still see the effect of logical vs. physical I/O.
We first run the query and have a filter that restricts to 4,000 rows out of 800,000 rows. Note that there's no index on the filter column.
- The query without a FAST hint was very consistent. The response time was either 0.05 seconds (without clearing cache first) or 1.9 seconds (if we clear cache first). This was regardless of if we configured with 500MB or 50MB memory for sp_configure 'max server memory'.
- The query with FAST hint was OK with memory setting of 500MB, so the table would fit in cache: 1.6 seconds to 4.4 seconds (depending on whether we empty cache before execution). But when we lower memory setting (think "large table"), the execution time jumped up to 73 seconds. That is a factor of between 48 and 1460.
Things got a bit different when we removed the WHERE clause to return all rows:
- Query without FAST hint took between 10 seconds and 23 seconds (depending on whether we empty cache first) for a memory setting of 500MB. Lowering memory to 60MB made this one take between 23 and 41 seconds. Note that I here got some error messages from SQL Server regarding insufficient memory in the internal memory pool (possibly SQL Server now did some fall-back strategy for the query execution, which added to execution time).
- The query with the FAST hint outperformed the one without for a memory setting of 500MB, with execution time between 2.2 and 5.6 seconds. Note that I configured SSMS to discard results so there is no processing of the returned 800,000 rows included here. With a memory setting of 60MB, we again bumped up execution time to some 74 seconds.
Here are the full numbers:
| ms |
ms Profiler |
ms cpu |
io |
fast hint |
memconf |
cache clean |
rows returned |
| 1930 |
2023 |
202 |
18782 |
0 |
500 |
1 |
4000 |
| 53 |
60 |
110 |
18768 |
0 |
500 |
0 |
4000 |
| 4403 |
4497 |
2075 |
2695310 |
1 |
500 |
1 |
4000 |
| 1616 |
1622 |
1622 |
2551439 |
1 |
500 |
0 |
4000 |
| 1930 |
1977 |
171 |
18768 |
0 |
60 |
1 |
4000 |
| 56 |
59 |
94 |
18768 |
0 |
60 |
0 |
4000 |
| 72426 |
72479 |
10888 |
5513944 |
1 |
60 |
1 |
4000 |
| 72663 |
72728 |
10983 |
5521626 |
1 |
60 |
0 |
4000 |
| 23336 |
23391 |
2105 |
31738 |
0 |
500 |
1 |
800000 |
| 10263 |
10269 |
2559 |
31574 |
0 |
500 |
0 |
800000 |
| 5663 |
5703 |
2386 |
2695368 |
1 |
500 |
1 |
800000 |
| 2226 |
2235 |
2028 |
2551439 |
1 |
500 |
0 |
800000 |
| 40966 |
40975 |
2620 |
31654 |
0 |
60 |
1 |
800000 |
| 22906 |
22913 |
2714 |
31629 |
0 |
60 |
0 |
800000 |
| 73676 |
73709 |
11045 |
5512080 |
1 |
60 |
1 |
800000 |
| 74513 |
74557 |
11778 |
5522556 |
1 |
60 |
0 |
800000 |
For the sake of completeness, I should add that having a good supporting index for the restriction (for the queries that had a restriction) made the query go equally fast regardless of memory config or FAST hint (in fact the FAST hint was irrelevant with a good index).
Here's the T-SQL if you want to play with it. As always, don't execute anything if you don't understand the code and the consequences of executing it! EXEC sp_configure 'max server memory', 500 RECONFIGURE GO
USE master GO IF DB_ID('TestDb') IS NOT NULL DROP DATABASE TestDb GO CREATE DATABASE [TestDb] ON PRIMARY (NAME = N'TDb', FILENAME = N'C:\TDb.mdf' ,SIZE= 100MB, MAXSIZE = 200MB, FILEGROWTH = 30MB ) LOG ON (NAME = N'TDb_l', FILENAME = N'C:\TDb_l.ldf' ,SIZE = 200MB, MAXSIZE = 500MB, FILEGROWTH = 20MB ) GO
USE testDb
CREATE TABLE t(c1 INT IDENTITY PRIMARY KEY CLUSTERED, c2 INT, c3 INT, filler CHAR(150))
INSERT INTO t (c2, c3, filler) SELECT TOP(800000) 1, 1, 'hi' FROM sys.columns AS a CROSS JOIN sys.columns AS b CROSS JOIN sys.columns AS c
UPDATE t SET c2 = c1 % 20, c3 = c1 % 200
CREATE NONCLUSTERED INDEX x ON t(c2)
--Size of table and indexes EXEC sp_indexinfo t --Can be found at http://www.karaszi.com/SQLServer/util_sp_indexinfo.asp
IF OBJECT_ID('tmp') IS NOT NULL DROP TABLE tmp GO CREATE TABLE tmp (seq tinyint IDENTITY(1,1) PRIMARY KEY NOT NULL ,ms INT NOT NULL ,ms_profiler INT NULL ,ms_cpu INT NULL ,io_ INT NULL ,fast_hint bit NOT NULL ,memconf smallint NOT NULL ,cache_clean bit NOT NULL ,rows_returned INT NOT NULL) GO
----------------------------------------------------------------------------------------- --Create procedures IF OBJECT_ID('emptyCache') IS NOT NULL DROP PROC emptyCache GO CREATE PROC emptyCache AS BEGIN CHECKPOINT DBCC DROPCLEANBUFFERS END GO
IF OBJECT_ID('do_it') IS NOT NULL DROP PROC do_it GO CREATE PROC do_it @fast_hint bit, @memconf smallint, @cacheclean bit, @rows_returned INT WITH RECOMPILE AS BEGIN DECLARE @dt datetime SET @dt = GETDATE() IF @fast_hint = CAST(0 AS bit) IF @rows_returned = 4000 SELECT * FROM t WHERE c3 = 16 ORDER BY c2 ELSE --return all rows SELECT * FROM t ORDER BY c2 ELSE --add FAST hint IF @rows_returned = 4000 SELECT * FROM t WHERE c3 = 16 ORDER BY c2 OPTION(FAST 20) ELSE --return all rows SELECT * FROM t ORDER BY c2 OPTION(FAST 20)
INSERT INTO tmp(ms, fast_hint, memconf, cache_clean, rows_returned) VALUES(DATEDIFF(ms, @dt, GETDATE()), @fast_hint, @memconf, @cacheclean, @rows_returned) END GO
TRUNCATE TABLE tmp ----------------------------------------------------------------------------------------- --Return 4000 rows -----------------------------------------------------------------------------------------
--500 MB memory EXEC sp_configure 'max server memory', 500 RECONFIGURE GO --Without FAST EXEC emptyCache GO EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 1, @rows_returned = 4000 GO EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 0, @rows_returned = 4000 GO --... with FAST EXEC emptyCache GO EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 1, @rows_returned = 4000 GO EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 0, @rows_returned = 4000 GO
--50 MB memory EXEC sp_configure 'max server memory', 60 RECONFIGURE GO --Without FAST EXEC emptyCache GO EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 1, @rows_returned = 4000 GO EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 0, @rows_returned = 4000 GO --... with FAST EXEC emptyCache GO EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 1, @rows_returned = 4000 GO EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 0, @rows_returned = 4000 GO
------------------------------------------------------------------------------------ --Return all 800,000 rows ------------------------------------------------------------------------------------
--500 MB memory EXEC sp_configure 'max server memory', 500 RECONFIGURE GO --Without FAST EXEC emptyCache GO EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 1, @rows_returned = 800000 GO EXEC do_it @fast_hint = 0, @memconf = 500, @cacheclean = 0, @rows_returned = 800000 GO --... with FAST EXEC emptyCache GO EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 1, @rows_returneed = 800000 GO EXEC do_it @fast_hint = 1, @memconf = 500, @cacheclean = 0, @rows_returned = 800000 GO
--50 MB memory EXEC sp_configure 'max server memory', 60 RECONFIGURE GO --Without FAST EXEC emptyCache GO EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 1, @rows_returned = 800000 GO EXEC do_it @fast_hint = 0, @memconf = 60, @cacheclean = 0, @rows_returned = 800000 GO --... with FAST EXEC emptyCache GO EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 1, @rows_returned = 800000 GO EXEC do_it @fast_hint = 1, @memconf = 60, @cacheclean = 0, @rows_returned = 800000 GO
|
-
-
Just a quick note that we again can modify whether system messages are to go to eventlog/errorlog again. I.e., we can change the is_event_logged column in sys.messages. This is very valuable in general and specifically is you want to define Agent alerts (for which Agent polls the Eventlog). For instance:
SELECT * FROM sys.messages WHERE message_id = 1205 AND language_id = 1033
Notice the value for the is_event_logged column. Now, run below:
EXEC sp_altermessage @message_id = 1205 ,@parameter = 'WITH_LOG' ,@parameter_value = 'true'
Now, re-run the select statement and see that you modified the behavior for the system message. Now, re-run the sp_altermessage with 'false' to reset to default.
The ability to modify this behavior for system messages was available prior to SQL Server 2005, but some re-architecturing in 2005 removed the functionality. kozloski informed me in this blog post that 2005 sp3 re-introduced the functionality and obviously as of sp1 the functionlity is back in 2008 as well.
|
-
This is a recommendation I believe is worth repeating from time to time: Make sure you match data types when you write TSQL code. Else you in most cases end up with an implicit data type conversion. And in worst case, this conversion is performed at the column side - not the literal side of your query. What does that mean? Consider below:
WHERE Col = Val
Now, say that the types for above don't match. Val might be some parameter (to a stored procedure, for instance), a variable or a written value (literal). In any case, when SQL Server need to do some operation )like comparison like here) involving several values which aren't of the same type, then one of the values need to be converted so it is of the same type as the other. Which one is determined by the rules for "Data Type Precedence".
What we don't want is a conversion at the column side. Why? I would argue that an implicit conversion in the first place in many cases mean I misunderstood something about the types in my system and am on my way of producing a bug in my code. But having a more strict language is not the topic for this particular post (check out this). My point here is that it is bad for performance reasons. Just yesterday and today I was involved in a thread on the MSDN forum. Here's the repro script from that thread (slightly adjusted by me):
USE tempdb GO IF OBJECT_ID('t') IS NOT NULL DROP TABLE t GO CREATE TABLE t ( c1 int IDENTITY(10000001,1) NOT NULL ,c2 char(8) NULL ,c3 datetime NULL ) GO INSERT INTO t(c3) SELECT TOP 3000000 '20080203' FROM master..spt_values t1 CROSS JOIN master..spt_values t2 CROSS JOIN master..spt_values t3 UPDATE t SET c2 = CAST(c1 AS char(8)) CREATE UNIQUE CLUSTERED INDEX x ON t(c2) GO
SELECT * FROM t WHERE c2 = N'10000009' SELECT * FROM t WHERE c2 = N'10000008'
Don't bother too much about the first section. The relevant part here is that we have char(8) column with values which are unique (enforced by a unique index), on a table with three million rows.
Now, before we dig deeper: What is the problem with above queries? The type do not match! The type in the table is char(8), but literal is Unicode (the N before the string). Since nchar has higher precedence than char, then the column side needs to be converted to the value side. Not good. Try the code on SQL Server 2000 and you will see disaster. For me, STATISTICS IO gave 10755 logical reads for both queries, with Clustered Index Scan (which is the same as table scan). Hoover on the Clustered Index Scan icon and you will find "WHERE:(Convert([t].[c2])=[@1])". See how the conversion is done at the column side? This, my friend, mean that SQL Server cannot seek an index to find the rows. Ouch! That's just because we were a bit sloppy... Compare to below: SELECT * FROM t WHERE c2 = '10000009' SELECT * FROM t WHERE c2 = '10000008'
You will find for both two above we have Clustered Index Seek with only three logical reads. That is a good improvement.
OK, enough about SQL Server 2000, lets discuss 2008/2005. At first I couldn't see any difference between the two original queries (N'10000009' and N'10000008'). Both seemed to be instant, and the same execution plan for both. I even saved the execution plan as XML and used WinDiff to compare the xml files. No difference. But the poster said the "9" query was slower than the "8" query. I executed to text and had ctrl-E pressed for a while. And now I could see a difference. The flicker was quicker for the "8" version vs. the "9" version. I used Profiler and confirmed. The "9" version had consistently about 90000 microsecond duration where the "8" version had about 0 duration. This was also reflected in CPU usage, as shown by Profiler. I hoped to use Extended Events to find something deeper about this, but that didn't give me anything. Perhaps I didn't capture the relevant events, quite simply. X/Event experts are welcome to suggest event session configurations. :-)
My guess for the difference is in the execution plan. You will see an index seek for both. That is good, since SQL Server nowadays potentially can use an index even if you end up with an implicit conversion at the column side. But you will see a few more operators in addition to the index seek. See that Constant Scalar followed by a Compute Scalar? My guess is that the built-in internal function used inside the Compute Scalar operator is more expensive for N'10000009' compared to N'10000008'. You can see this function if you use SET STATISTICS PROFILE ON. See the GetRangeThroughConvert() function. My guess is that the answer is the cost for this function. (Search the Internet for things like "Constant Scan" and "GetRangeThroughConvert" for other blog post on this topic.)
The moral of the story? Here we have a cost which can be relevant in some cases, but we will only see this if we happen to execute our query with the right values and are very very observant. But we can eliminate all this mess this by using the right type for the column in the first place!
|
-
I do quite a bit of posting on forums, and I have noticed that in many cases when you take your time to explain how something work, or why you should do something perhaps in a not so obvious way, that answer tend to be ignored and never replied to.
Instead the original poster replies to the short answer which perhaps ignored things like best practices, performance, readability etc. And, very often I see endless discussion based on that reply, where the original poster never gets a satisfactory solution or even perhaps a working solution. And there I posted an elaboration which held the answer days ago! I also explained why solution X doesn't work and solution Y only work in some circumstances etc.
In many cases, we have taken the time to write articles or blog posts where we have predicted most misunderstandings and explain why you should do it this or that way and what drawbacks workaround A has, and also what drawbacks workaround B has. So, we post the URL to the article only to find that it apparently never was read because the following discussions in the thread goes on for days and waste time to elaborate on things that could be understood by spending some 10 - 20 minutes to actually read the article or blog post we pointed to.
I find this state mildly depressing. Have we reached the sad state where we don't tend to be interested in why something works, how it works, or even understand what we are actually doing? Quick fixes, anyone? Only 10 cents each!
Is this something new? I've been on forums for > 10 years now, let me try to think..... I believe that tendency has always been there, but perhaps it has been getting worse with time. What are your experiences? Am I only having a bad day?
|
-
You might ask yourself: What is Instant File Initialization and why does it matter?
What Instant File Initialization is: Instant File Initialization allow SQL Server to allocate storage (space from disks) very very quickly. As you probably know, when you delete files they are not physically deleted from the disk - they are only marked as deleted in the file system allocation structures. This is why various undelete programs can recover deleted files. So imagine a process requiring disk space. The disk area given to the process from the file system and operating system could potentially contains some secret information from deleted files. This is why the file system/OS insist to zero out the storage before the process can actually see it. (I'm not OS / file system expert so anyone is free to correct me here.) That is, unless the one allocating the storage has some special privileges.
When does the zeroing out of disk space occur? Whenever SQL Server need disk space, such as:
- Create database
- Add file to database
- Restore (if the restore process includes database creation)
- File growth (both manual and auto-grow)
- Backup
Can we avoid the slow zeroing out space? Yes, but only if you are on SQL Server 2005 or higher and for some operations: creation and allocation of data database files (mdf and ndf). The SQL Server service account need to have appropriate permissions from the OS. To be more precise, it need to have a privilege called SE_MANAGE_VOLUME_NAME. This is by default granted to Administrators. Do you run your SQL Server as an account being member of Administrators? I hope not. Did you grant this permission to the service account?
How do I grant this privilege? This is easy. Add the SQL Server service account to the "Perform Volume Maintenance Tasks" security policy.
Does it matter? You be the judge. Just to give you an idea, I created a database with a data file of various size (I had the log file at 1MB for all tests in order for it to influence the least). I timed it both with and without Instant File Initialization. I ran it on my desktop machine which has a RAID0 of two 7200RPM disks:
| Size |
without IFI |
with IFI
|
| 1GB |
10.3 s |
0.3 s |
| 10GB |
128 s |
1.3 s |
| 50GB |
663 s |
4.5 s |
The difference is roughly a factor of 100!
When does it hurt? Basically every time disk storage is allocated. But let us focus of the occasions where you can do anything about it, i.e., when you can have Instant File Initialization. Such occasions include:
- Whenever a database is created. Space need to be allocated for the data file(s).
- Whenever a data file is manually expanded in size.
- Whenever you have auto-grow for a data file. Note that potentially some poor user will now sit and wait for the auto-grow to complete.
- When you start SQL Server. Why? Tempdb need to be re-created.
- When you perform restore, if the destination database not already exists with matching database file structure.
How can I tell if I have Instant File Initialization? I find it easiest to just create a database and time it, using some reasonable size for your data file, like 5GB. Actually, run two test: One with 5GB data file and really small log file. And then another with very small data file and 5GB log file. Remember that we never have Instant File Initialization for log files. For instance, run below and you will see (adjust the file path for the database files). You need to adapt your code for file path name, possibly database name and the datetime handling if you are lower then SQL Server 2008: DECLARE @t time(3) = SYSDATETIME() CREATE DATABASE IFI_test_ld ON PRIMARY (NAME = N'IFI_test', FILENAME = N'C:\IFI_test\IFI_test_ld.mdf', SIZE = 5GB, FILEGROWTH = 100MB) LOG ON (NAME = N'IFI_test_log', FILENAME = N'C:\IFI_test\IFI_test_ld.ldf', SIZE = 1MB, FILEGROWTH = 10MB) SELECT DATEDIFF(ms, @t, CAST(SYSDATETIME() AS time(3))) AS LargeDataFile SET @t = SYSDATETIME() CREATE DATABASE IFI_test_ll ON PRIMARY (NAME = N'IFI_test', FILENAME = N'C:\IFI_test\IFI_test_ll.mdf', SIZE = 3MB, FILEGROWTH = 100MB) LOG ON (NAME = N'IFI_test_log', FILENAME = N'C:\IFI_test\IFI_test_ll.ldf', SIZE = 5GB, FILEGROWTH = 10MB) SELECT DATEDIFF(ms, @t, CAST(SYSDATETIME() AS time(3))) AS LargeLogFile
Are numbers for above two about the same? If yes, then you don't have Instant File Initialization. If the one with large data file is much quicker, then you do have Instant File Initialization. And now you also know approx how long it takes to allocate 1 GB with of data and log file for your SQL Server.
I'm curious: Did you have Instant File Initialization?
|
-
I've often been thinking that "somebody" should produce a list and perhaps some ready-made TSQL code of the SQL Server error messages we want to be alerted for. Something like a Management Pack, but not tied to any particular sys mgmt software. Essentially, what error messages would we typically want to be alerted for, and some ready made TSQL code which defines an alert (msdb.dbo.sp_add_alert, see also http://www.sqlskills.com/BLOGS/PAUL/post/Easy-monitoring-of-high-severity-errors-create-Agent-alerts.aspx) and operator for each of those.
Honestly, how many of you have been going through all SQL Server messages and determined which of those we want to be alerted for? Sure, we can decide to specify alerts for all high-severity level messages (like 19, 20, 21, 22, 23, 24 and 25). But unfortunately the severity classification isn't as coherent and we would hope.
For instance, Paul Randal just recently blogged about error 825, which has as low severity as 10! Do we want to be alerted for 825? You bet! Check out Paul's blog at http://www.sqlskills.com/BLOGS/PAUL/post/A-little-known-sign-of-impending-doom-error-825.aspx.
We can, however make it a little bit easier for us. Since Agent detects the message by sniffing the eventlog, we can immediately discard all messages which aren't written to the eventlog (there's no use defining alert for something which the alerting mechanism never detects, right?):
SELECT * FROM sys.messages WHERE language_id = 1033 AND is_event_logged = 1 ORDER BY severity
Now we are down to 699 messages. It used to be that we could configure whether a message should be written to eventlog (sp_altermessage), but this functionality disappeared in 2005. I haven't checked Connect, but my guess is that if there is such a wish/entry in the first place, it doesn't have too many votes.
My suggestion is that we create some small community project somewhere where we add what messages we want to be alerted for. This can be as simple as a table with error numbers and some insert, and then we use this to generate our sp_add_alert commands based on that. Question is where to do this, so it can be maintained and refined over time? Or if it has already been done by somebody...
|
-
For a long while, I've had a few things I wanted to add for sp_indexinfo (my procedure which returns bunch of information for indexes).
Dejan Sarka suggested adding XML indexes as well as making the column list look nicer. I've also had a request to add some documentation of what the procedure returns. Done. :-)
http://www.karaszi.com/SQLServer/util_sp_indexinfo.asp
|
-
I've been learning a few things about troubelshooting login problems over the last few months. I first want to say that I've learned most of this through other blogs, comments etc. At the end of this blog post you will find an URL list where you can look for more details.
It started a few months back where I somewhere read a blog stating that we can use the state in a "login failed" error message to determine why the login failed. As you can imagine, there can be several reasons for login failure (user doesn't exist, pwd doesn't match, windows login name isn't a trusted login etc). For some reason, I didn't save that URL and of course I needed it some time later and couldn't find it.
One place where you can find such a list of states is... Books Online. BOL 2008 documents every error number, so it is a matter of knowing what error number to search for: 18456. You can also search for "login failed" (pretty easy to remember) and the first hit is a different page (from above) but with similar information. I just tried a search in 2005 BOL for the same and had similar hits. This information might have been added to BOL 2005 in a more recent BOL 2005 release, though.
Now, don't be surprised if the client app only receive state 1 for the login failed messages. This is a good thing, and on purpose. We don't want the system to disclosure too much information to a malicous user (app) about why a login fails. So what we do is to look in the errorlog or for the "real" state (the EventLog doesn't seem to carry this information, for some reason). Below is an example from SQL Server 2005, sp2, with timestamps removed:
Error: 18456, Severity: 14, State: 5. Login failed for user 'gurka'. [CLIENT: <local machine>]
So, state 5 tell us, according to SQL Server 2005 BOL, that "User ID is not valid.".
Now, doing the same in SQL Server 2008 was interesting. Looking in the EventLog, we do see an explanation as to why the login failed. We still don't see the state in the EventLog, but we have an explanation so we might not need the state. And in the errorlog file, I had below message:
Error: 18456, Severity: 14, State: 5. Login failed for user 'gurka'. Reason: Could not find a login matching the name provided. [CLIENT: <local machine>]
So, not only do we have a state, we also have a decription for why the login failed. Is it pathetic that such a thing can make you happy? Perhaps it is... ;-)
Here are some other blog posts on the subject: http://blogs.msdn.com/sql_protocols/archive/2006/02/21/536201.aspx http://jmkehayias.blogspot.com/2008/03/understanding-login-failed-error-18456.html
|
-
-
Fire up your SQL Server 2008 Books Online (BOL) and check the date. You should find this in any article (except for the very start page).
RTM of SQL Server 2008 shipped with a BOL which has ... hmm,... doesn't have a date (just checked in a virtual machine). OK, so if you don't find a date (year month) you have the BOL which shipped with RTM.
The first BOL update was released 2008-09-30. You will see this as (August 2008). This was made available on the web as well as a download (http://www.microsoft.com/downloads/details.aspx?FamilyID=765433f7-0983-4d7a-b628-0a98145bcb97&DisplayLang=en). This BOL did not list what was changed from prior release since this was considered RTM of BOL and the BOL that shipped with RTM was considered RC0 (which explains why this BOL lists Aug 2008 when it was released late Sept 2008).
Today I learned that MS has released yet another release of Books Online. This is only available on the web, and you see this as (October 2008). Now we do have listing of what section are new and what were expended on. I admit that I prefer locally installed BOL, but I guess that MS has resource limitations the same was as all of us do. Check it out at http://msdn.microsoft.com/en-us/library/dd239407.aspx.
|
|
|
|
|
|