Recently I was again visited by my old friends TCP Chimney and SynAttackProtect. (Yeah, sometimes I feel like I mostly blog about 5-year-old problems, but many of us DBAs have to work on older versions or older systems, and so repeat older problems :-).
This has been written about before, but as I BinGoogled around I noticed you are more likely to find the documents if you search for the cause, and not the symptoms. Most people who face a problem, of course, know the symptoms but not the cause. So here's a recap, if only for the search engines:
If:
- You have a busy SQL Server
- Your OS is Windows Server 2003
- Busy is defined by many connections and a decent amount of network activity
And you see these types of network-related errors:
- SQL Agent jobs that run DBCC CheckDB intermittently fail with "TCP Provider: The semaphore timeout period has expired. [SQLSTATE 08S01] (Error 121) Communication link failure [SQLSTATE 08S01] (Error 121). The step failed."
- The SQL Agent log contains errors like "[298] SQLServer Error: 64, Communication link failure [SQLSTATE 08S01]"
- "[298] SQLServer Error: 258, TCP Provider: Timeout error [258]. [SQLSTATE 08001]"
- "[298] SQLServer Error: 64, TCP Provider: The specified network name is no longer available. [SQLSTATE 08S01]"
- Other long-running SQL processes die with network failures.
Then you might be visited by the dreaded features SynAttackProtect or TCP Offloading (aka Chimney). I know just enough about networking to be dangerous, but let me see if I can explain these and not embarrass myself:
SynAttackProtect is a feature that, as far as I can tell, became enabled by default in Windows Server 2003 with Service Pack 1, with the idea that the server would automatically protect itself against certain kinds of TCP flooding attacks (SYN floods). Unfortunately, clients making a lot of connections to SQL Server apparently look just like such an attack, so your server ends up protecting itself against your clients. You have to add a registry entry to suppress this protection and allow the large number of connections (provided your server is on a protected network behind a firewall, and is unlikely to actually be attacked in this way).
TCP Offloading is a neat-o, whizbang feature of some network cards that attempts to move part of the network processing off the CPU and onto the network card itself. Sounds nice, but unfortunately this wreaks havoc with a SQL Server 2005 server. Why, I don't exactly know, but it's common enough that many DBAs simply ban this feature from their servers. Luckily it's not difficult to suppress.
Changing these behaviors is fairly simple: you suppress SynAttackProtect by adding or adjusting a registry value, and you suppress TCP offloading with a netsh command or a registry change.
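For the record, here is a rough sketch of what those two changes might look like when scripted. It assumes the commonly cited settings for Windows Server 2003: the SynAttackProtect and EnableTCPChimney DWORD values under the Tcpip\Parameters registry key, and the "netsh int ip set chimney DISABLED" command. The helper names (reg_set_dword, build_commands, run_all) and the --apply flag are made up for this example. By default it only prints the commands, so you can check them against your own environment before running anything; a restart may be needed before the registry changes take effect.

```python
import subprocess
import sys

# Assumed location of the TCP/IP stack parameters on Windows Server 2003.
TCPIP_PARAMS = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def reg_set_dword(value_name, data):
    """Build a `reg add` command that sets a DWORD value under TCPIP_PARAMS."""
    return ["reg", "add", TCPIP_PARAMS, "/v", value_name,
            "/t", "REG_DWORD", "/d", str(data), "/f"]


def build_commands(use_netsh=True):
    """Return the commands that switch off both features (hypothetical helper)."""
    # 0 turns SYN attack protection off.
    commands = [reg_set_dword("SynAttackProtect", 0)]
    if use_netsh:
        # Disable TCP Chimney offload with netsh ...
        commands.append(["netsh", "int", "ip", "set", "chimney", "DISABLED"])
    else:
        # ... or, alternatively, with the registry value.
        commands.append(reg_set_dword("EnableTCPChimney", 0))
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    # --apply is a hypothetical flag for this sketch; without it nothing is changed.
    run_all(build_commands(), dry_run="--apply" not in sys.argv)
```

Run without arguments, the script just lists the commands it would issue; only run it with --apply from an administrative prompt on the server itself, and only if the server really does sit on a protected network.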