Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Dealing with bad input data

Author  Topic 

Tomb
Starting Member

10 Posts

Posted - 2012-07-16 : 22:28:11
Hello,

I'm trying to sum a column that should be an integer, but there appears to be some bad data where the values are text instead (specifically "a0"), so that when I execute my query I get ERROR: invalid input syntax for double precision: "a0".

I've tried using a cast statement, a case when statement, and excluding these rows with a where statement but I always get the same error. How can I exclude these rows and have the query execute properly? I have billions of rows so cannot manually delete them.

Thanks!

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-16 : 22:30:07
use

WHERE field NOT LIKE '%[^0-9]%'

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

Tomb
Starting Member

10 Posts

Posted - 2012-07-17 : 00:40:20
Still get the same error when adding that to the where clause. I think what I'm going to do is first have a subquery where I cast the field as a varchar and exclude the non-integers, then cast it as an integer again in the main query where I do the sum.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-17 : 09:44:11
quote:
Originally posted by Tomb

Still get the same error when adding that to the where clause. I think what I'm going to do is first have a subquery where I cast the field as a varchar and exclude the non-integers, then cast it as an integer again in the main query where I do the sum.


dont cast it until you filter out bad data

so it should be like

SELECT CAST(field as...)
FROM
(
select ...
from table
where field NOT LIKE '%[^0-9]%'
)t


------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

Tomb
Starting Member

10 Posts

Posted - 2012-07-17 : 11:16:21
thanks, this worked
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-17 : 11:56:17
welcome

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2012-07-17 : 14:56:07
quote:
Originally posted by Tomb
I have billions of rows




How long does this query take to run?

Can you post the query itself?

Go to Top of Page

Tomb
Starting Member

10 Posts

Posted - 2012-07-18 : 12:22:50
Essentially a modified version of this (field is formatted as a varchar in schema.table):
select x, y, z, sum(cast(field as int)) from
(select x, y, z, field from schema.table where field not like '%[^0-9]%') as a
group by x, y, z order by x, y, z;

I'm actually using Vertica for my data warehouse so given proper indexing/projections and an appropriate where clause it takes maybe 5-10 minutes depending on resources at the time.
Go to Top of Page
   

- Advertisement -