Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Speed of sproc to trim columns in a table

Author  Topic 

Abu-Dina
Posting Yak Master

206 Posts

Posted - 2012-08-07 : 12:17:32
Hi there,

I have the following query which basically takes a table name then removed leading/trailing white spaces in all columns.


CREATE procedure [dbo].[usp_TrimColsTest]( @TName varchar(500))
as

declare currentColumn cursor for
select column_name
from information_schema.columns
where table_name=@Tname
order by ordinal_position asc

open currentColumn

declare @cName varchar(500)
declare @sqlstring nvarchar(255)
declare @Tcnt int
DECLARE @ParmDefinition nvarchar(500)


fetch next from currentColumn into @cName
while @@fetch_status=0
begin

set @sqlstring = 'update ['+@Tname+'] set ['+@cname+']=ltrim(rtrim(['+@cname+']))'
SET @ParmDefinition = N'@TcntOUT varchar(30) OUTPUT';
EXECUTE sp_executesql @SQLString , @ParmDefinition, @TcntOut=@Tcnt output

fetch next from currentColumn into @CName
end
close currentColumn
deallocate currentColumn


The above sproc took 20 seconds to run on a table of 34 columns and 250k rows.

If anyone can suggest a "better" way of achieving the same I'd be very much interested.

Thanks for your help in advance.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-08-07 : 12:24:53
i would try to do this at front end application if possible unless you're using this data as join criterias in queries

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2012-08-07 : 13:01:27
I agree with Visakh's suggestion, but if you absolutely need it:
CREATE PROCEDURE [dbo].[usp_TrimColsTest]( @TName VARCHAR(500)) AS
DECLARE @sql NVARCHAR(MAX)='UPDATE ' + QUOTENAME(ISNULL(PARSENAME(@Tname,2),'dbo')) + '.' + QUOTENAME(PARSENAME(@Tname,1)) + ' SET '

SELECT @sql=@sql+REPLACE('?=LTRIM(RTRIM(?)),', '?', QUOTENAME(column_name))
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME=PARSENAME(@Tname,1) AND TABLE_SCHEMA=ISNULL(PARSENAME(@Tname,2),'dbo')
AND DATA_TYPE LIKE '%char' AND COLUMNPROPERTY(OBJECT_ID(@Tname),COLUMN_NAME,'IsComputed')=0

SET @sql=LEFT(@sql,LEN(@sql)-1) -- trim trailing comma from SQL statement

EXEC(@sql)
This updates all character columns in a table with one UPDATE statement, except text/ntext data types (which you should not be using anymore anyway).
Go to Top of Page

Abu-Dina
Posting Yak Master

206 Posts

Posted - 2012-08-07 : 16:46:00
quote:
Originally posted by robvolk

I agree with Visakh's suggestion, but if you absolutely need it


I agree with you and Viskah but this data cleansing is required because the table will be used in a record linkage process.

Unless I'm missing something fundamental in my design but I'm not working on a front end application. I need to

1) Cleanse my data set
2) Standardize the data.
3) Apply record linkage algorithm to generate match scores
4) Produce reports for quality checking
5) Commit the merge to the Single Customer View database.

There will not be a front end initially. The user will call a stored procedure and supply the source table. The outcome will be an email with a report. Another stored procedure will commit the changes.

I hope I'm making sense.

Thanks.
Go to Top of Page

Abu-Dina
Posting Yak Master

206 Posts

Posted - 2012-08-08 : 04:10:59
Nicely done robvolk. Your version runs much MUCH faster!
Go to Top of Page
   

- Advertisement -