|
Anticipating Primary Key ViolationsBy Guest Authors on 23 November 2003 | Tags: INSERT This article comes to us from Karen Gayda. She introduces a script she's written to catch primary key violations before they are inserted into a table. IntroductionYears ago when I was just breaking into the programming field I worked with Paradox databases. While Paradox pales in comparison to a database engine like SQL Server, it did have one interesting feature that has not yet been duplicated by Microsoft. When performing an insert into a table, any key violation records would be shunted to a violation table while the good records would be inserted into the intended table. This feature allowed for graceful error handling and for tracking of bad data. Recently I encountered a situation at one of my clients that required that I gracefully handle duplicate data being sent by customers. The goal was to allow the good data to be processed while capturing the bad data so that the customers could be notified of their data issues thus enabling them to correct transmission problems. Having a Paradox-style key violation table would be ideal to handle this problem. Since SQL Server does not include this functionality I decided I needed to write a generic routine to somewhat mimic Paradox. Stored Procedure ScriptThis script creates a stored procedure that evaluates whether a select statement will cause Primary Key violations if data is inserted into a given target table. Key violation rows are saved to the designated violation table in the target database. (You can download the source code for this script and the required functions here.) Calling this stored procedure prior to actually executing a select statement will cause the key violation keys to be saved to a table. Doing this allows for excluding those rows from being inserted if a where clause is added that checks for the existence of the key violation records. SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO
IF EXISTS(SELECT * FROM dbo.sysobjects
WHERE id = object_id(N'[dbo].[sp_GetKeyViolations]')
AND OBJECTPROPERTY(id, N'IsProcedure') = 1
)
DROP PROCEDURE [dbo].[sp_GetKeyViolations]
GO
CREATE PROCEDURE dbo.sp_GetKeyViolations
@vcSelectStatement varchar (8000),
@sysTargetTable sysname,
@sysViolationTable sysname,
@sysTargetDatabase sysname,
@bitEmptyViolationTable bit = 0
AS
SET NOCOUNT ON
DECLARE @vcSQL varchar(8000),
@vcColumnList varchar(300),
@sysColName sysname
--Get the primary key columns in the target table
SET @vcSQL = 'DECLARE crCols CURSOR STATIC FOR SELECT DISTINCT
name FROM ' + @sysTargetDatabase +
'.dbo.PrimaryKeyColumns(''' + @sysTargetTable + ''')'
EXEC(@vcSQL)
OPEN crCols
--For Each Column in PrimaryKey, concatenate to column list
SET @vcColumnList = ''
FETCH NEXT FROM crCols INTO @sysColName
WHILE(@@FETCH_STATUS = 0)
BEGIN
SET @vcColumnList = @vcColumnList + @sysColName + ','
FETCH NEXT FROM crCols INTO @sysColName
END --Get next column
SET @vcColumnList = SUBSTRING(@vcColumnList,1,LEN(@vcColumnList) -1) --remove trailing comma
CLOSE crCols
--Get selected rows and save to a temporary table
IF EXISTS (select * from sysobjects where id = object_id('##tempSourceRecords')
and OBJECTPROPERTY(id, N'IsTable') = 1)
DROP TABLE ##tempSourceRecords
SET @vcSQL = (SELECT Element FROM master.dbo.Split(@vcSelectStatement, 'FROM')
WHERE ElementID = 1)
+ ' INTO ##tempSourceRecords FROM '
+ (SELECT Element FROM master.dbo.Split(@vcSelectStatement, 'FROM')
WHERE ElementID = 2)
PRINT(@vcSQL)
EXEC (@vcSQL)
--If key violation table does not already exist, create it
DECLARE @nvcSQL nvarchar(4000),
@nvcParmDef nvarchar(500),
@bitExists bit
SET @nvcSQL = 'IF EXISTS (select * from '+ @sysTargetDatabase+ '.dbo.sysobjects
where name = ''' + @sysViolationTable + ''' AND
xtype=''U'') ' +
'SET @bitExistsOUT = 1 ' +
'ELSE ' +
'SET @bitExistsOUT = 0'
SET @nvcParmDef = N'@bitExistsOUT bit OUTPUT'
EXEC sp_executesql @nvcSQL, @nvcParmDef, @bitExistsOUT=@bitExists OUTPUT
IF @bitExists = 0
BEGIN
--Create new table based upon supplied select list if needed
SET @vcSQL = 'SELECT TOP 0 ' + @vcColumnList +
+ ' INTO ' + @sysTargetDatabase + '.dbo.'+ @sysViolationTable +
' FROM ##tempSourceRecords '
EXEC (@vcSQL)
END
--Empty violation table if caller indicates it should be cleared
IF @bitEmptyViolationTable = 1
BEGIN
SET @vcSQL = 'DELETE FROM ' + @sysTargetDatabase + '.dbo.' + @sysViolationTable
PRINT @vcSQL
EXEC (@vcSQL)
END
--Insert duplicate rows from target into key violation table
SET @vcSQL = 'INSERT INTO ' + @sysTargetDatabase + '.dbo.' + @sysViolationTable +
' SELECT a.' + REPLACE(@vcColumnList, ',', ',a.') +
' FROM ##tempSourceRecords a' +
' INNER JOIN ' + @sysTargetDatabase + '.dbo.' + @sysTargetTable + ' b ' +
' ON '
OPEN crCols
FETCH NEXT FROM crCols
INTO @sysColName --get inner join columns to add to SQL insert statement
WHILE(@@FETCH_STATUS = 0)
BEGIN
SET @vcSQL = @vcSQL + 'a.' + @sysColName + '=b.' + @sysColName + ' AND '
FETCH NEXT FROM crCols INTO @sysColName
END --Get next column
SET @vcSQL = SUBSTRING(@vcSQL,1,LEN(@vcSQL) -4) --remove trailing AND
--Add rows that are duplicate in the source select that may cause violation
SET @vcSQL = @vcSQL + ' UNION SELECT ' + @vcColumnList + ' FROM ##TempSourceRecords ' +
'GROUP BY ' + @vcColumnList + ' HAVING COUNT(*) > 1'
CLOSE crCOLS
DEALLOCATE crCols
PRINT @vcSQL
EXEC (@vcSQL)
DROP TABLE ##tempSourceRecords
RETURN (0)
GO
Listing 1 – sp_GetKeyViolations Code Parameters
UsageThis stored procedure should be called prior to actually executing a SELECT statement. For example, run the following statement from the master database (once the procedure and supporting functions have been created): exec sp_GetKeyViolations 'SELECT TOP 10 CustomerID as CustomerID FROM Northwind.dbo.Customers WHERE ContactTitle LIKE ''Sales%''', 'Customers', --target table 'DupCustomers', -- table to store violation rows 'NORTHWIND', -- target database 0 -- do not clear violation table prior to insert Now the duplicate records can be excluded by adding a check to the key violation table as follows: SELECT TOP 10 CustomerID as CustomerID FROM Northwind.dbo.Customers C WHERE ContactTitle LIKE 'Sales%' AND NOT EXISTS( SELECT 1 FROM Northwind.dbo.DupCustomers D WHERE C.CustomerID = D.CustomerID) DependenciesSp_GetKeyViolations calls 2 user-defined functions, dbo.Split and dbo.PrimaryKeyColumns. dbo.Split should be installed in the master database. dbo.PrimaryKeyColumns should be installed in any database that will be used as a target database. If this sp will be used extensively then I recommend adding dbo.PrimaryKeyColumns to the model database to reduce administration. The code for these functions is included in the source code I created sp_GetKeyViolation in the master database. It is written so that it can be called from any DB for which a user has SP execute permissions. ConclusionSince duplicate data is a common problem, this procedure addresses the need to identify bad data without hampering the insert process in a very generic fashion.
|
- Advertisement - |