| Author |
Topic |
|
shiyam198
Yak Posting Veteran
94 Posts |
Posted - 2011-03-15 : 13:59:47
|
| Hi,I got a dataset from a client and I need to find all the records that does not contain phone numbers in the right format.Valid Phone numbers in my scenario are:i. All numbers. (4161111111111)ii. Brackets in between (416) 1111111iii. Hyphens in between 416-111-1111 or(416-111-1111) iv. Periods in between 416.123.1234 or(416).123.1234.After googling I found that I need to have a Regex DLL load to my server to check this using regex.Is there any way I can just use SQL to determine if a column has data that contains something other than numbers, periods, and brackets?Thanks for your time.Regards,Shiyam |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2011-03-15 : 14:08:30
|
Tara has the right answer. But, if you need to parse these strings one method is to use a LIKE comparison. Maybe this will get you started:INSERT @T (PhoneNumber) VALUES('4161111111'),('(416)1111111'),('416-111-1111'),('416.123.1234'),('1234'),('asdafg')SELECT *FROM @TWHERE NOT ( PhoneNumber LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' -- i. All numbers. (4161111111111) OR PhoneNumber LIKE '([0-9][0-9][0-9])[0-9][0-9][0-9][0-9][0-9][0-9][0-9]' --ii. Brackets in between (416) 1111111 OR PhoneNumber LIKE '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]' --iii. Hyphens in between 416-111-1111 OR PhoneNumber LIKE '[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9][0-9]' --iv. Periods in between 416.123.1234 ) |
 |
|
|
shiyam198
Yak Posting Veteran
94 Posts |
Posted - 2011-03-15 : 14:09:16
|
| Thanks for your reply Tara.This is just ETL. I am in the stage, where I need to have a look at the data set and let the client know which fields contain bad data and client has the opportunity to clean up the data.Our App by default does not allow any phone number other than the format specified. Please let me know if there is a way to find the records that contains phone numbers that are not in the mentioned format.Thanks again for your timeRegards,Shiyam |
 |
|
|
shiyam198
Yak Posting Veteran
94 Posts |
Posted - 2011-03-15 : 14:18:12
|
| Thanks Lamprey. I just saw your reply.My problem is -i. Any amount of numbers in the phone numbers is valid. So, 416123444555566543939849348394843 is a valid phone numberii. Brackets can be any where. So, 416(123)12(23)(1234)iii. Periods can be any where. So, 416.123.12.23.1234.At a time, we might be loading more than a million customer records, several will fail for different reasons. So get rid of records with obvious bad data. yes. This is crazy. I guess I should get the R&D team do better validation or allow more flexibility.But, if you have an idea to deal with it for now, please let me know.Again, thanks for your time.Regards,Shiyam |
 |
|
|
X002548
Not Just a Number
15586 Posts |
|
|
jimf
Master Smack Fu Yak Hacker
2875 Posts |
Posted - 2011-03-15 : 14:33:37
|
| SELECT *FROM @tWHERE PATINDEX( '%[^0-9]%',PhoneNumber) > 0JimEveryday I learn something that somebody else already knew |
 |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2011-03-15 : 14:46:29
|
quote: Originally posted by shiyam198 Thanks Lamprey. I just saw your reply.My problem is -i. Any amount of numbers in the phone numbers is valid. So, 416123444555566543939849348394843 is a valid phone numberii. Brackets can be any where. So, 416(123)12(23)(1234)iii. Periods can be any where. So, 416.123.12.23.1234.<snip>
Not to be flip, but to get you thinking... What are your actual requirements?Are you saying the requirement is to allow any numerical digit (0-9), parenthsis ("(" and ")"), periods (".") and dashes ("-") anywhere in a string? If not, can you state the actual requirement? Here is a sample based on my requirement question above. But, I still thikt he real answer is to strip everything out: SELECT *FROM @TWHERE PhoneNumber NOT LIKE '%[^0-9().-]%' |
 |
|
|
jcelko
Esteemed SQL Purist
547 Posts |
Posted - 2011-03-15 : 23:25:42
|
quote: Originally posted by shiyam198 Thanks Lamprey. I just saw your reply.My problem is -i. Any amount of numbers in the phone numbers is valid. So, 416123444555566543939849348394843 is a valid phone numberii. Brackets can be any where. So, 416(123)12(23)(1234)iii. Periods can be any where. So, 416.123.12.23.1234.At a time, we might be loading more than a million customer records, several will fail for different reasons. So get rid of records with obvious bad data. yes. This is crazy. I guess I should get the R&D team do better validation or allow more flexibility.But, if you have an idea to deal with it for now, please let me know.
Strip out ALL punctuation, then decide to use only the International telephone number format -- I think it is 18 digits, but I am drunk and cannot remember You use an external validation program from several companies and from Open Source regular expression tools. If you do this in SQL, all you can test it for all digits. If you want local display formats, they are in another table. --CELKO--Books in Celko Series for Morgan-Kaufmann PublishingAnalytics and OLAP in SQLData and Databases: Concepts in Practice Data, Measurements and Standards in SQLSQL for SmartiesSQL Programming Style SQL Puzzles and Answers Thinking in SetsTrees and Hierarchies in SQL |
 |
|
|
|