Author |
Topic |
rockstar283
Yak Posting Veteran
96 Posts |
Posted - 2013-09-17 : 15:55:15
|
Hey everyone..I would like to start a discussion to understand what are the different quickest and most efficient ways to handle the following scenarios:1) How to move trillion recordsa) From one SQL Server table to another SQL Server tableb) From SQL Server table to Excel or CSV filec) From Excel or csv file into SQL server table2) How to count trillion records in SQL server Please provide any details if you can..Thank you!! |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2013-09-17 : 16:15:22
|
1) DELETE FROM dbo.Table1 OUTPUT deleted.*a) DELETE FROM dbo.Table1 OUTPUT deleted.* INTO dbo.Table2b) Excel cannot support 1 trillion rows.c) OPENROWSET2) SELECT COUNT(*) Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA |
|
|
James K
Master Smack Fu Yak Hacker
3873 Posts |
Posted - 2013-09-17 : 17:16:10
|
My answers.1) SSIS2) SSIS3) SSIS (Excel cannot support 1 trillion rows).4) SSIS 5)SELECT SUM (row_count) FROM sys.dm_db_partition_stats where (index_id=0 or index_id=1)and object_name(object_id) = 'YourTableName' |
|
|
sigmas
Posting Yak Master
172 Posts |
Posted - 2013-09-17 : 17:22:43
|
quote: Originally posted by James K My answers.1) SSIS2) SSIS3) SSIS (Excel cannot support 1 trillion rows).4) SSIS 5)SELECT SUM (row_count) FROM sys.dm_db_partition_stats where (index_id=0 or index_id=1)and object_name(object_id) = 'YourTableName'
I agree with you for how counting the rows in a table by statistics. |
|
|
James K
Master Smack Fu Yak Hacker
3873 Posts |
Posted - 2013-09-17 : 17:26:28
|
quote: Originally posted by sigmas
quote: Originally posted by James K My answers.1) SSIS2) SSIS3) SSIS (Excel cannot support 1 trillion rows).4) SSIS 5)SELECT SUM (row_count) FROM sys.dm_db_partition_stats where (index_id=0 or index_id=1)and object_name(object_id) = 'YourTableName'
I agree with you for how counting the rows in a table by statistics.
Jeez, sigmas, you are agreeing with me only on 20% of my answers. Does this mean you are diasgreeing with me on the other answers? Some interesting reading: http://technet.microsoft.com/en-us/library/dd537533(v=sql.100).aspx |
|
|
sigmas
Posting Yak Master
172 Posts |
Posted - 2013-09-17 : 17:36:21
|
" Does this mean you are disagreeing with me on the other answers? "No, I am not. I am not disagreeing with you on the others.It means I have not any opinion about others. just this :) |
|
|
rockstar283
Yak Posting Veteran
96 Posts |
Posted - 2013-09-18 : 04:58:48
|
Answers I have found so far:Quickest way to move the data:1) SSIS2) DELETE FROM dbo.Table1 OUTPUT deleted.* INTO dbo.Table23) SELECT * INTO dbo.Table2 from dbo.Table2 where 1=0Quickest way to count the number of records: 1) SP_Spaceused Displays the number of rows, disk space reserved, and disk space used by a table, indexed view, or Service Broker queue in the current database, or displays the disk space reserved and used by the whole database. USE AdventureWorks2012;GOEXEC sp_spaceused N'Purchasing.Vendor';GO 2) Count(*) 3) select SCHEMA_NAME(ST.schema_id)+'.'+ST.name TableName,SP.rows RowCntfromsys.tables STinner join sys.partitions SPon (ST.object_id = SP.object_idand SP.index_id in (0,1))order byRowCnt desc 4) SELECT SUM (row_count) FROM sys.dm_db_partition_stats where (index_id=0 or index_id=1) and object_name(object_id) = 'YourTableName' 5) select Count_Big(*) from HumanResources.EmployeeThanks to all who helped me to formulate above answersif anyone else has any other thoughts..please pitch in:-) |
|
|
LoztInSpace
Aged Yak Warrior
940 Posts |
Posted - 2013-09-18 : 08:45:59
|
quote: Originally posted by sigmas
quote: Originally posted by James K My answers.1) SSIS2) SSIS3) SSIS (Excel cannot support 1 trillion rows).4) SSIS 5)SELECT SUM (row_count) FROM sys.dm_db_partition_stats where (index_id=0 or index_id=1)and object_name(object_id) = 'YourTableName'
I agree with you for how counting the rows in a table by statistics.
I disagree with this if you want a consistent and accurate answer.- If it's not transactionally accurate/consistent then it's not the proper count and therefore irrelevant (or at least suspect).- If it is accurate, then COUNT(*) would surely be implemented in terms of that anyway. If so, then count(*) would therefore be better because it's going to give you the correct answer in a standard way.If you want a "it might have had something like this many rows at some point in time, maybe and I don't care if I freak out future developers" then go for it. |
|
|
James K
Master Smack Fu Yak Hacker
3873 Posts |
Posted - 2013-09-18 : 10:39:49
|
quote: Originally posted by LoztInSpace
quote: Originally posted by sigmas
quote: Originally posted by James K My answers.1) SSIS2) SSIS3) SSIS (Excel cannot support 1 trillion rows).4) SSIS 5)SELECT SUM (row_count) FROM sys.dm_db_partition_stats where (index_id=0 or index_id=1)and object_name(object_id) = 'YourTableName'
I agree with you for how counting the rows in a table by statistics.
I disagree with this if you want a consistent and accurate answer.- If it's not transactionally accurate/consistent then it's not the proper count and therefore irrelevant (or at least suspect).- If it is accurate, then COUNT(*) would surely be implemented in terms of that anyway. If so, then count(*) would therefore be better because it's going to give you the correct answer in a standard way.If you want a "it might have had something like this many rows at some point in time, maybe and I don't care if I freak out future developers" then go for it.
Can you show an example of where sys.dm_db_partition_stats gives the "wrong" results?I am aware of the scenario Kalen Delaney describes in this article. http://sqlblog.com/blogs/kalen_delaney/archive/2009/12/07/how-many-rows.aspxHowever, one thing that she does not mention is that if you were to do a count(*) within the SAME transaction as where you are inserting the rows, you will see that the count includes the uncommitted rows - see the example below:CREATE TABLE xyz (id INT);INSERT INTO xyz VALUES (1),(2);-- should return 2SELECT COUNT(*) FROM xyz;BEGIN TRANINSERT INTO xyz VALUES (3);-- should return 3SELECT COUNT(*) FROM xyz;ROLLBACK;-- should return 2SELECT COUNT(*) FROM xyz;DROP TABLE xyz; So what is accurate sort of depends on your view of the world and where you are looking at it from, doesn't it? If developers get "freaked out" over that, some education is in order, I guess. |
|
|
|