Author |
Topic |
xyxoxy
Starting Member
4 Posts |
Posted - 2010-07-02 : 12:10:05
|
Hi folks.

We have a 2 server setup where Server1 acts as publisher and distributor of transactional replication of our production DB (which is also on Server1). Server2 is the subscriber and pulls from the distributor to the replicated DB on Server2, which is used only for reporting. The DB is just over 4GB if that is relevant, and we replicate just about every table and view. Both servers have plenty of resources and are running the latest service packs of SQL2005 and Windows Server 2003.

Yesterday, for the second time in 6 months, the subscription basically stopped pulling transactions from the distributor for no apparent reason. I could see that the publisher was sending transactions to the distributor, but the subscriber was not applying them to the replicated DB. This happened in the middle of the day, so no maintenance jobs or updates were running... just normal business. The ONLY thing I saw to possibly explain it was a request to the replicated DB which took over 2 minutes and was timed out by the application server. This happens occasionally with large reports but doesn't impact replication.

I looked at every log file I could find as well as Replication Monitor and could see no reason why replication was stopped... but I confirmed changes were not carrying over and the TEMPDB on Server2 was growing to about 1GB. I finally killed and recreated the replication from scratch and everything is fine now.

Can anyone provide any tips on troubleshooting this type of thing should it happen again? Is there any place I should look that I haven't already? I looked at the SQL Server and Agent logs as well as the Application and System logs on both DB servers plus our application web server and saw nothing out of the ordinary. |
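One place that is not covered by the logs listed above is the live session state on the subscriber. The query below is only a sketch of how one might check, while replication appears stalled, whether the session applying replicated commands on Server2 is blocked (for example by a long-running report). It assumes it is run on Server2 from a database at compatibility level 90, such as master, and that blocking is a possible cause; nothing in the thread confirms that it was.

```sql
-- Sketch only: run on the subscriber (Server2) while replication looks stalled.
-- Run from a database at compatibility level 90 (e.g. master) so that
-- OUTER APPLY with a column argument is accepted on SQL Server 2005.
USE master;

SELECT r.session_id,
       s.program_name,
       s.login_name,
       r.status,
       r.command,
       r.blocking_session_id,      -- non-zero means this request is blocked
       r.wait_type,
       r.wait_time,                -- current wait, in milliseconds
       r.open_transaction_count,
       DB_NAME(r.database_id) AS database_name,
       t.text AS sql_text
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
     ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE s.is_user_process = 1
ORDER BY r.wait_time DESC;
```

If the request running the replication procedures (by default named like sp_MSins_, sp_MSupd_ or sp_MSdel_) shows a non-zero blocking_session_id, the blocking session is the next thing to look at.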
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2010-07-07 : 11:24:48
|
was the distribution agent stopped? why are you using pull instead of push? what is the schedule for the agent? |
|
|
xyxoxy
Starting Member
4 Posts |
Posted - 2010-07-12 : 09:07:59
|
quote: Originally posted by russell: was the distribution agent stopped? why are you using pull instead of push? what is the schedule for the agent?
The distribution agent was not stopped. We are using pull to put any load on the subscriber server instead of the publication server. The schedule is continuous. |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2010-07-12 : 09:21:54
|
see anything in distribution..msrepl_errors? is distributor on the publisher, or dedicated machine? |
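For anyone following along, a minimal sketch of that check might look like the query below. It assumes the distribution database uses the default name "distribution" and that the last 24 hours is the window of interest; both are assumptions to adjust.

```sql
-- Sketch only: run on the distributor (Server1 in this thread).
-- Assumes the distribution database has the default name "distribution".
USE distribution;

SELECT TOP (50)
       e.id,
       e.[time],          -- when the error was logged
       e.source_name,
       e.error_code,
       e.error_text,
       e.xact_seqno,      -- transaction the agent was processing, if any
       e.command_id
FROM dbo.MSrepl_errors AS e
WHERE e.[time] >= DATEADD(HOUR, -24, GETDATE())   -- assumed window; adjust
ORDER BY e.[time] DESC;
```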
|
|
xyxoxy
Starting Member
4 Posts |
Posted - 2010-07-12 : 17:55:51
|
The distributor is on the publisher.

I did not know to look in that table... this is the kind of info I'm looking for, so thanks for that.

Unfortunately I see no errors that explicitly correspond to this problem. I became aware of the issue at 3:02PM (via a scheduled task that manually tests replication every hour). The first error I see is at 3:22PM - "if @@trancount > 0 rollback tran" and immediately after that is "Query timeout expired" with an error code of "HYT00".

So these may be related, but they were logged well after the problem started and may be around the time I started troubleshooting. |
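To see what the distribution agent itself was reporting before and after 3:02PM, the agent history on the distributor can be lined up against the error table. This is a sketch under assumptions: the distribution database has the default name, and the date and time window below are placeholders to replace with the actual incident window.

```sql
-- Sketch only: run on the distributor (Server1).
-- The window values are placeholders, not taken from the logs.
USE distribution;

DECLARE @from datetime, @to datetime;   -- SQL Server 2005: no inline initializers
SET @from = '2010-07-01T14:00:00';      -- assumed start of window
SET @to   = '2010-07-01T16:00:00';      -- assumed end of window

SELECT a.name AS agent_name,
       a.subscriber_db,
       h.[time],
       h.runstatus,                 -- 1 Start, 2 Succeed, 3 In progress,
                                    -- 4 Idle, 5 Retry, 6 Fail
       h.delivered_commands,
       h.current_delivery_latency,  -- milliseconds
       h.comments,
       e.error_code,
       e.error_text
FROM dbo.MSdistribution_history AS h
JOIN dbo.MSdistribution_agents AS a
     ON a.id = h.agent_id
LEFT JOIN dbo.MSrepl_errors AS e
     ON e.id = h.error_id
WHERE h.[time] BETWEEN @from AND @to
ORDER BY h.[time];
```

A gap in the history rows, or a run of Retry rows starting before the first logged error, would help narrow down when the agent actually stopped delivering.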
|