THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Maria Zakourdaev

Transactional replication: are you sure your data is totally synchronized?

 

There are those rare times, when your replication solution is working perfectly. No consistency errors, no errors at all.
Or so it seems.

Then, all of a sudden, you catch one application which is writing to the read-only subscriber. Next, a developer complains that some data is missing from the subscriber.

Similarly, you always wonder, after the consistency errors have appeared and solved, whether the data is synchronized. In some specific cases, the publication can be dropped by mistake and re-created with replication continuing without reporting any errors.

Is there a native way to make sure all data is synchronized between the publisher and the subscriber?

Validate subscriptions using Replication Monitor

If you open the Replication Monitor application and right click on your publication, you will notice the “Validate subscriptions…” feature.

image

Click on it, choose the subscription that requires validation and select the way validation will be performed.

 image

These options are quite critical since they directly impact the speed of the validation process. Here you can choose whether the tables themselves will be queried or row number information will be extracted from the sysindexes system view.

The default option is to compare the numbers from sysindexes and compare actual row counts only in case the differences were found.

You can also choose to compare the checksums of the data. In this case, the validation process will take a long time.

The result of the comparison will appear in the distribution service details:

clip_image005

You will see all tables that were scanned by the validation process and the row counts.

clip_image007

If a table is out of sync you will see the error as below

clip_image009

 

What’s going on behind the scenes?

Here is the query that is being executed against each table in the database if you choose to compare actual count and the checksum:

SELECT count_big(*), sum (convert(numeric, binary_checksum(*) ) )

FROM <tablename>  WITH (TABLOCK HOLDLOCK)

 

All the above is nice when you have only one publication to validate. What if you have hundreds of publications?

As you might have expected, this UI is executing the stored procedure on the publisher. The stored procedure is called sp_publication_validation and it’s main parameters are

  • @rowcount_only

§  1 - Perform a rowcount check only  ( default )

§  2 - Perform a rowcount and binary checksum

  • @full_or_fast

§  0 – Full count using count(*)

§  1 – Fast count using sysindexes view

§  2 – Compare the subscriber and publisher using the fast count and, if the results are not equal, uses count on table. If the rows field in sysindexes is NULL full count will be used. ( default )

 

USE <published database>

GO

EXEC sp_publication_validation  @publication =  'PublicationName'

             ,  @rowcount_only =  2

             ,  @full_or_fast =  2

You can execute this stored procedure over all publications on the server.

I have put together a simple script to do this, in order to use it you need to define a linked server to the Distributor server in order to get a list of publications from the MSpublications table which is located in the Distribution database.

 

CREATE PROCEDURE ValidateSubscriptions

                        @PrintMode  int = 0

AS

BEGIN

     

      DECLARE @SQLCmd  varchar(max);

     

      SET @SQLCmd= ''

 

      SELECT @SQLCmd = @SQLCmd + 'EXEC ['+publisher_db+'].dbo.sp_publication_validation 

      @publication = '''+publication+''',

      @rowcount_only =  2,

      @full_or_fast =  2;' + CHAR(10) + CHAR(13) +

      ' WAITFOR DELAY ''00:01:00'';' + CHAR(10) + CHAR(13)

      FROM DISTRIBUTOR.DistributionDB.dbo.MSpublications p

            JOIN DISTRIBUTOR.master.sys.servers s

                  ON p.publisher_id = s.server_id 

      WHERE s.name =  @@servername

      IF @PrintMode = 1 BEGIN

            PRINT  @SQLCmd

      END

     

      EXEC (@SQLCmd)

 

END

All the validation results (for all publications) will be inserted into the MSdistribution_history table which is located in the Distribution database on the Distributor server.

Viewing it can be done using the following query:

 

USE DistributionDB

GO

 

SELECT      time              Time,

            s.Name            PublisherServer,

            a.publication     Publication,

            s2.name           SubscriberServer,

SUBSTRING(comments,8,CHARINDEX('''',comments,8)-8) TableName,

            comments          Comments

      FROM dbo.MSdistribution_history  h

       JOIN dbo.MSdistribution_agents   a

         ON h.agent_id = a.id

                  JOIN master.sys.servers s

                    ON a.publisher_id = s.server_id

                        JOIN master.sys.servers s2

                          ON a.subscriber_id = s2.server_id        

      WHERE comments like 'Table %'

The best part is that if any article fails validation you will immediately see it in the error log

 

EXEC master..xp_readerrorlog 0,1,'failed data validation'

clip_image011

The above means that if you have some usual routine that notifies you in case there are any unusual errors in the SQL Server Error Log, it will show validation failures as well.

 

May all your articles pass their validation successfully.

Yours, Maria

Published Tuesday, March 06, 2012 5:47 PM by Maria Zakourdaev

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Mike said:

Have you looked into database sharding techniques? This is a great video explaining sharding http://www.dbshards.com/rightscaleuser2010/

Hope this helps out!

Mike

March 7, 2012 2:25 PM
 

Maria Zakourdaev said:

Thank you Mike, interesting link. Just I'm not quite sure how it is connected to the topic of this blog post.

March 8, 2012 4:44 AM
 

Nick said:

/*

Hi Maria!

Just starting reading your blogs, lots of fun~~~

While this can beat-up your servers pretty bad, depending upon what you pass it as a parameter its a nifty tool to use: tablediff.exe

Its built right into SQL, normally in the C:\Program Files\Microsoft SQL Server\(version)\COM  folder.

By adjusting the below query you can run this against multiple publications/subscriptions (by adding in another loop *gasps at horror at myself for using loops* :P )

The cool thing about this, is that it does not need replication. Think of it as a built in table comparison/synchronization tool.

for articles, trying to fix a few data errors can be annoying and tricky to track down... but this will do it for you ;-)

*/

--execute the output from the COM folder. ie:  C:\Program Files\Microsoft SQL Server\90\COM or whatever version of SQL.

--fill in the relevant variables and have fun!

SET NOCOUNT ON

USE [] --database hosting publication.

DECLARE

@vcSourceServer VARCHAR(50)

,@vcDestinationServer VARCHAR(50)

,@vcSourceDatabase VARCHAR(200)

,@vcDestinationDatabase VARCHAR(200)

,@vcTableName VARCHAR(200)

,@cTableExtension CHAR(5)

SET @vcSourceServer = ''

SET @vcDestinationServer = ''

SET @vcSourceDatabase = ''

SET @vcDestinationDatabase = ''

SET @cTableExtension = ''

DECLARE file_cursor CURSOR FOR

SELECT [name] FROM sysobjects

WHERE xtype = 'U'

AND [name] IN

(

SELECT article from distribution.dbo.MSarticles

WHERE publisher_db = ''

)

ORDER BY [name] ASC

OPEN file_cursor

  FETCH NEXT FROM file_cursor

  INTO @vcTableName

WHILE @@FETCH_STATUS = 0

BEGIN  

SET @cTableExtension = (SELECT SUBSTRING(@vcTableName, 1, 5))

PRINT 'tablediff.exe -sourceserver '+@vcSourceServer+' -sourcedatabase '+@vcSourceDatabase+' -sourceschema dbo -sourcetable '+@vcTableName+' -destinationserver '+@vcDestinationServer+' -destinationdatabase '+@vcDestinationDatabase+' -destinationschema dbo -destinationtable '+@vcTableName+' -f C:\Fix_'+@vcTableName+'.txt -o C:\Error_'+@vcTableName+'.txt

'

  FETCH NEXT FROM file_cursor

  INTO @vcTableName

END

CLOSE file_cursor

DEALLOCATE file_cursor

SET NOCOUNT OFF

October 2, 2013 10:09 AM

Leave a Comment

(required) 
(required) 
Submit

About Maria Zakourdaev

The shortest word in the English language that contains the letters: abcdef is… feedback! Let me know if I have touched something that is alive in the cosmos.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement