 SQL Freezes - Login Failed for NT AUTHORITY\SYSTEM

fairbro
Starting Member

4 Posts

Posted - 2010-09-08 : 05:29:56
Hi,

I've been having some freezing issues on SQL Server 2005 SP3, and the error I'm currently trying to troubleshoot is "Login failed for user 'NT AUTHORITY\SYSTEM'. [CLIENT: <local machine>]".

I'm running Windows Server 2008 x64 on a ProLiant DL360 G5.
I have tried switching the account SQL uses and then switching back to Network Service to get rid of this error, which appears to be caused by SQL using an account with incorrect credentials (a mismatch between NT AUTHORITY\SYSTEM and what SQL is expecting; see http://blogs.msdn.com/b/sql_protocols/archive/2006/02/21/536201.aspx).

The server has approximately 8 databases mirrored onto it, is SAN-attached, and was experiencing freezing.
I checked the hardware - no issue.
I re-applied SQL 2005 SP3.
I enabled tracing, which showed errors related to performance. Deadlocks were noted, so I enabled DBCC TRACEON (1204, 1222), and some stack dumps can be seen.
I have upgraded to Windows Server 2008 SP1 (and re-applied SQL 2005 SP3).

However, the only error in the log files is the one previously mentioned: "Login failed for user 'NT AUTHORITY\SYSTEM'. [CLIENT: <local machine>]".

I have been using SQLIO to test throughput to the SAN in case the latency or errors are related to this, and have also been using Iometer.

Next steps for my issue are to try and fix the Login Failed error.
If that doesn't help, I'm going to re-install SQL.
If that doesn't help either, I'll probably have to get new switches.

Can anyone help me with this issue?

Thanks,

Willie

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-09-08 : 11:42:48
The error you are seeing is not related to your freezing issues. That error just means that an attempt was made to connect to SQL Server, but SQL Server did not receive an account to authenticate. You can get that error when you attempt to query a linked server using Windows authentication and Kerberos isn't in place. We use SQL authentication for linked servers as a result. We also see that error when MOM (Microsoft Operations Manager, a Microsoft monitoring package) doesn't have access to SQL Server and is attempting to log in. We resolved the error by giving it access.

Could you explain "freezing"? Is the server locked up? Is performance just dreadful? What does PerfMon show for CPU, etc? How about SQL Profiler?

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/


fairbro
Starting Member

4 Posts

Posted - 2010-09-09 : 05:07:40
The server has plenty of RAM (16 GB, with max memory set to 14 GB) and ample CPU available. If that login error can be ignored, then the most probable cause is congestion between the server and the SAN, which is connected via iSCSI through a Cisco 3750 that is set up to run at 100Mb. I was reading that average disk queue length or current disk queue length should not be greater than 5. The queue lengths under a number of different scenarios I ran through with SQLIO were much higher than that, which was a cause for concern. I also monitored bandwidth consumption inside the storage (HP MSA 2020, over 3 disks), which seemed fine. However, network usage over the switch didn't seem fine: on the 100Mb link it appears to peak on a number of occasions (i.e. running at > 7.2Mb) whilst the higher-end tests were running.

My boss suggested a test: generate a 400 Gb file, load it with a SQL INSERT statement, and then monitor current and average disk queue length while doing a full index rebuild followed by a backup. What would you consider a valid test that might point towards the bottleneck? And do you think that uninstalling and re-installing SQL won't help? Have you ever seen a misconfigured mirror/witness setup trigger freezing?

See the dump of SQLIO results below (M:\ is the SAN Operational Database LUN).
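To put the 100Mb link in perspective, here is a rough sketch of the arithmetic. It assumes "100Mb" means 100 megabits per second and that the proposed test file is 400 gigabytes, and it ignores iSCSI and TCP overhead, so real throughput would be lower. The function names are illustrative only.

```python
def link_capacity_mb_per_sec(link_megabits_per_sec):
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return link_megabits_per_sec / 8.0


def transfer_hours(size_gb, link_megabits_per_sec):
    """Best-case hours to move size_gb gigabytes (1 GB = 1024 MB) over the link."""
    size_mb = size_gb * 1024
    seconds = size_mb / link_capacity_mb_per_sec(link_megabits_per_sec)
    return seconds / 3600.0


capacity = link_capacity_mb_per_sec(100)   # theoretical ceiling of a 100 Mb/s link
hours_100 = transfer_hours(400, 100)       # the link as currently configured
hours_1000 = transfer_hours(400, 1000)     # a gigabit link, for comparison
print(f"100 Mb/s link ceiling: {capacity} MB/s")
print(f"400 GB at 100 Mb/s: {hours_100:.1f} h; at 1 Gb/s: {hours_1000:.1f} h")
```

Under these assumptions the link tops out at 12.5 MB/s, so any test or backup that has to push hundreds of gigabytes across it is bounded by the network long before the disks are.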

C:\Program Files (x86)\SQLIO>sqlio -kW -t2 -s120 -dM -o1 -frandom -b32 -BH -LS Testfile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
2 threads writing for 120 secs to file M:Testfile.dat
using 32KB random IOs
enabling multiple I/Os per thread with 1 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 2048 MB for file: M:Testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 239.14
MBs/sec: 7.47
latency metrics:
Min_Latency(ms): 4
Avg_Latency(ms): 7
Max_Latency(ms): 169
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 7 28 45 18 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

C:\Program Files (x86)\SQLIO>sqlio -kW -t2 -s120 -dM -o2 -frandom -b32 -BH -LS Testfile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
2 threads writing for 120 secs to file M:Testfile.dat
using 32KB random IOs
enabling multiple I/Os per thread with 2 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 2048 MB for file: M:Testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 295.20
MBs/sec: 9.22
latency metrics:
Min_Latency(ms): 5
Avg_Latency(ms): 13
Max_Latency(ms): 1991
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 2 15 23 24 15 6 1 2 1 2 1 0 1 2 1 2

C:\Program Files (x86)\SQLIO>sqlio -kW -t2 -s120 -dM -o4 -frandom -b32 -BH -LS Testfile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
2 threads writing for 120 secs to file M:Testfile.dat
using 32KB random IOs
enabling multiple I/Os per thread with 4 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 2048 MB for file: M:Testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 87.63
MBs/sec: 2.73
latency metrics:
Min_Latency(ms): 9
Avg_Latency(ms): 90
Max_Latency(ms): 1515
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99

C:\Program Files (x86)\SQLIO>sqlio -kW -t2 -s120 -dM -o8 -frandom -b32 -BH -LS Testfile.dat
sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
2 threads writing for 120 secs to file M:Testfile.dat
using 32KB random IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 2048 MB for file: M:Testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 85.36
MBs/sec: 2.66
latency metrics:
Min_Latency(ms): 8
Avg_Latency(ms): 186
Max_Latency(ms): 10670
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100
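
One way to read the four runs above is through Little's Law: the average number of I/Os in flight is roughly IOs/sec multiplied by average latency. The sketch below plugs in the figures from the output (2 threads, -o1 through -o8) and compares the result with the configured queue depth. It also cross-checks MBs/sec as IOs/sec times the 32 KB block size. The names and structure are illustrative and are not part of SQLIO.

```python
# (outstanding I/Os per thread, IOs/sec, MBs/sec, average latency in ms)
# Values copied from the four SQLIO runs above; each run used 2 threads.
RUNS = [
    (1, 239.14, 7.47, 7),
    (2, 295.20, 9.22, 13),
    (4, 87.63, 2.73, 90),
    (8, 85.36, 2.66, 186),
]
THREADS = 2
BLOCK_KB = 32


def in_flight(iops, avg_latency_ms):
    """Little's Law: mean I/Os in flight = I/O rate * mean time per I/O."""
    return iops * avg_latency_ms / 1000.0


def mb_per_sec(iops, block_kb=BLOCK_KB):
    """Throughput implied by the I/O rate and the block size."""
    return iops * block_kb / 1024.0


for per_thread, iops, reported_mb, latency_ms in RUNS:
    configured = THREADS * per_thread
    print(
        f"-o{per_thread}: configured depth {configured:2d}, "
        f"measured in flight {in_flight(iops, latency_ms):5.2f}, "
        f"implied {mb_per_sec(iops):.2f} MB/s (reported {reported_mb})"
    )
```

The measured in-flight counts track the configured depth closely, so the latency figures are internally consistent. Throughput peaks at a total depth of 4 and then drops to about a third while latency climbs, which suggests the extra outstanding requests only add waiting time. The numbers alone do not say whether the link, the switch, or the array is the limit.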

Thanks, Willie



tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-09-09 : 11:28:47
You haven't explained what you mean by freezing yet.

Uninstalling is almost never a solution.

Have you run SQL Profiler to check for long-running queries and high reads? If by freezing you mean the CPU is pegged, then it's likely you are missing indexes. Missing indexes can cause huge IO issues. Or if by freezing you mean the server locks up, well then I'd suggest opening a case with Microsoft as you'll need to get a memory dump generated when it occurs and then analyzed by them.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/


fairbro
Starting Member

4 Posts

Posted - 2010-09-13 : 05:58:02
Hi Tara,

That's really helpful, thanks. When I said freezing, I meant that the SQL engine became unresponsive. CPU remained fine and there was memory available, so it seems more likely (to me) that it's related to IO on the SAN disks that hold the databases. The servers themselves are pretty new and under little stress.

For example, creating backups of the databases with Backup Exec (which tried to read all the databases quickly for a snapshot) caused them to freeze: the backups would fail, SQL would become unresponsive or crash, and the only way to get it back was to restart SQL Server. That's why I started troubleshooting with DBCC traces, which revealed some stack dumps.

I haven't used SQL Profiler yet. Can you provide some details if you think it would be helpful for troubleshooting this issue?

Thanks,

Willie




tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-09-13 : 10:58:27
If it's an IO issue, you'd see a perf problem in Performance Monitor. What do Avg. Disk sec/Read and Avg. Disk sec/Write show? They should be under 12 ms.
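
As a rough illustration of applying that guideline, the sketch below checks the average write latencies from the SQLIO runs earlier in the thread against 12 ms. SQLIO is a synthetic test, so these figures are not the same thing as PerfMon samples taken from the live workload. The constant and function name are illustrative.

```python
GUIDELINE_MS = 12  # threshold quoted in the post above

# Average latency (ms) per SQLIO run, keyed by outstanding I/Os per thread.
SQLIO_AVG_LATENCY_MS = {1: 7, 2: 13, 4: 90, 8: 186}


def over_guideline(latencies_ms, limit_ms=GUIDELINE_MS):
    """Return runs at or above the limit, mapped to latency / limit (1 decimal)."""
    return {
        depth: round(ms / limit_ms, 1)
        for depth, ms in sorted(latencies_ms.items())
        if ms >= limit_ms
    }


flagged = over_guideline(SQLIO_AVG_LATENCY_MS)
for depth, ratio in flagged.items():
    print(f"-o{depth}: average latency is {ratio}x the {GUIDELINE_MS} ms guideline")
```

Only the run with a single outstanding I/O per thread stays under the guideline; every deeper queue exceeds it.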

For SQL Profiler, I'd look for long-running queries and high reads.

If you are seeing stack dumps, then you are likely encountering a big issue that Microsoft will need to help with. What CU are you running for SP3? I'd get on the latest before calling MS, as stack dumps can be caused by SQL bugs.



Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/


fairbro
Starting Member

4 Posts

Posted - 2010-09-13 : 11:21:36
Hi Tara,

What do you mean by CU for SP3? I'll check the disk counters and see what times they come in at whilst running some SQLIO tests or a backup. I had been checking current disk queue length and average disk queue length, which were > 100 at some points.
Thanks, Willie


tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-09-13 : 12:53:16
CU stands for cumulative update package. After SP3, there have been hundreds/thousands of bugs fixed, and they've been rolled into a CU. I'm running CU8 for SP3 on my production systems. I believe CU11 is available though.

SP3 will be build 4035 (9.00.4035); cumulative updates have higher build numbers. Check SELECT @@VERSION.
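
For anyone scripting this check, here is a minimal sketch that pulls the build number out of the text returned by SELECT @@VERSION and compares it with a baseline. The baseline is a parameter; 9.00.4035 is used as the SP3 RTM build, which is an assumption worth verifying against Microsoft's published build list. The sample string is illustrative and was not taken from the poster's server.

```python
import re

SP3_BASELINE = (9, 0, 4035)  # assumed SP3 RTM build; verify against Microsoft's build list


def parse_build(version_text):
    """Extract (major, minor, build) from text such as '... - 9.00.4053.00 (X64)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)(?:\.\d+)?", version_text)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.groups())


def patched_beyond(version_text, baseline=SP3_BASELINE):
    """True when the reported build is newer than the baseline build."""
    return parse_build(version_text) > baseline


# Illustrative first line of @@VERSION output (hypothetical sample).
sample = "Microsoft SQL Server 2005 - 9.00.4053.00 (X64)"
print(parse_build(sample), patched_beyond(sample))
```

A build above the baseline only shows that something newer than the service pack has been applied; which update it corresponds to still has to be looked up in the build list.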

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
