Quite often, when writing a stored procedure, we need some kind of empty work table to perform intermediate calculations. The table is needed only for the calculation process itself and serves no other purpose. In most cases I choose a temporary table for this, but I must admit I've never given it much thought in terms of performance. It has always seemed like the natural thing to do, but is it the best thing to do?
First, let's consider our options:
Staging table (dbo.MyTable)
Temp table (#MyTable)
Table variable (@MyTable)
To find out which one is the best choice, I ran a little test. I executed the same kind of procedure with each of the three table types and measured the execution time. I did this for two kinds of procedures (a simple and a complex calculation) and for record sets ranging from 10 to 1,000,000 records. I also cleared the plan cache to force the procedures to recompile, to see how recompilation affected performance for each table type.
The test:
I first created some test data to be used throughout the test. The table held 1,000,000 records of random values from 1 to 10,000, indexed by a clustered primary key.
Transact-SQL
-- Test data table
CREATE TABLE
dbo.DataTable
(
Id INT NOT NULL IDENTITY (1,1) ,
Value INT NOT NULL ,
CONSTRAINT pk_DataTable PRIMARY KEY CLUSTERED (Id)
);
-- 1,000,000 records of test data
INSERT INTO
dbo.DataTable
(
Value
)
SELECT TOP 1000000
ABS(CHECKSUM(NEWID())%10000) + 1
FROM
sys.all_columns AS T1
CROSS JOIN
sys.all_columns AS T2;
GO
I then created the Staging table. One column, no indexes, no constraints.
Transact-SQL
-- The Staging table
CREATE TABLE
dbo.StgTable
(
Value INT NOT NULL
);
GO
For the test itself I used three procedures, all doing exactly the same thing:
Populating the Staging / Temp / Variable table with a certain number of records.
For the simple procedure – counting all of the values that are divisible by 10.
For the complex procedure – joining with DataTable and then counting the data values that are divisible by 10.
Staging Table:
Transact-SQL
CREATE PROCEDURE
dbo.usp_StgTable
@Top INT ,
@Complex BIT
AS
BEGIN
-- Insert X amount of records into the table
INSERT INTO dbo.StgTable
(
Value
)
SELECT Value
FROM dbo.DataTable
WHERE Id <= @Top;
IF @Complex = 0 -- Simple
BEGIN
SELECT COUNT(*)
FROM dbo.StgTable
WHERE Value%10 = 0;
END;
ELSE -- Complex
BEGIN
SELECT COUNT(*)
FROM dbo.StgTable AS Stg
INNER JOIN dbo.DataTable AS Tbl
ON Stg.Value = Tbl.Id
WHERE Tbl.Value%10 = 0;
END;
END;
GO
Temp Table:
Transact-SQL
CREATE PROCEDURE
dbo.usp_TmpTable
@Top INT ,
@Complex BIT
AS
BEGIN
-- Insert X amount of records into the table
SELECT Value
INTO #Table
FROM dbo.DataTable
WHERE Id <= @Top;
IF @Complex = 0 -- Simple
BEGIN
SELECT COUNT(*)
FROM #Table
WHERE Value%10 = 0;
END;
ELSE -- Complex
BEGIN
SELECT COUNT(*)
FROM #Table AS Tmp
INNER JOIN dbo.DataTable AS Tbl
ON Tmp.Value = Tbl.Id
WHERE Tbl.Value%10 = 0;
END;
DROP TABLE #Table;
END;
GO
Table Variable:
Transact-SQL
CREATE PROCEDURE
dbo.usp_VarTable
@Top INT ,
@Complex BIT
AS
BEGIN
-- Insert X amount of records into the table
DECLARE @Table TABLE
(
Value INT NOT NULL
);
INSERT INTO @Table
(
Value
)
SELECT Value
FROM dbo.DataTable
WHERE Id <= @Top;
IF @Complex = 0 -- Simple
BEGIN
SELECT COUNT(*)
FROM @Table
WHERE Value%10 = 0;
END;
ELSE -- Complex
BEGIN
SELECT COUNT(*)
FROM @Table AS Var
INNER JOIN dbo.DataTable AS Tbl
ON Var.Value = Tbl.Id
WHERE Tbl.Value%10 = 0;
END;
END;
GO
I used the following script to create the table where the results are saved, and the script after it to run the test.
Transact-SQL
CREATE TABLE
dbo.TableTypeTest
(
TableType VARCHAR(10) NOT NULL ,
Complex BIT NOT NULL ,
RowCnt INT NOT NULL ,
UnCompiledRun INT NOT NULL ,
PreCompiledRun INT NOT NULL
);
GO
The test script was run twice for every table type, once for the simple calculation and once for the complex one (six times in all). It calculates the average execution time of the procedure over 20 runs. This is measured twice every cycle: once with the cache freed before every run, forcing the procedure to recompile every time, and once with the execution plan already in cache.
The number of records populating the table was multiplied by 10 after every cycle (10, 100, 1,000, …, 1,000,000).
Transact-SQL
DBCC FREEPROCCACHE;
DECLARE
@Top INT = 10 ,
@Complex BIT = 0 , -- 1
@Type VARCHAR(10) = 'Stg' , -- 'Tmp', 'Var'
@SQL NVARCHAR(MAX);
WHILE @Top <= 1000000
BEGIN
SET @SQL = '
DECLARE
@i INT ,
@n INT = 20 ,
@Time0 DATETIME2 ,
@Time1 DATETIME2 ,
@UnCompTime INT = 0 ,
@CompTime INT = 0;
SET @i = 1;
WHILE @i <= @n
BEGIN
TRUNCATE TABLE dbo.StgTable;
DBCC FREEPROCCACHE;
SET @Time0 = SYSDATETIME();
EXECUTE dbo.usp_'+@Type+'Table
'+CAST(@Top AS VARCHAR(10))+' ,
'+CAST(@Complex AS VARCHAR(10))+';
SET @Time1 = SYSDATETIME();
SET @UnCompTime += DATEDIFF(ms,@Time0,@Time1);
SET @i += 1;
END;
SET @i = 1;
WHILE @i <= @n
BEGIN
TRUNCATE TABLE dbo.StgTable;
SET @Time0 = SYSDATETIME();
EXECUTE dbo.usp_'+@Type+'Table
'+CAST(@Top AS VARCHAR(10))+' ,
'+CAST(@Complex AS VARCHAR(10))+';
SET @Time1 = SYSDATETIME();
SET @CompTime += DATEDIFF(ms,@Time0,@Time1);
SET @i += 1;
END;
INSERT INTO
dbo.TableTypeTest
(
TableType ,
Complex ,
RowCnt ,
UnCompiledRun ,
PreCompiledRun
)
SELECT
TableType = '''+@Type+''' ,
Complex = '+CAST(@Complex AS VARCHAR(10))+' ,
RowCnt = '+CAST(@Top AS VARCHAR(10))+' ,
UnCompiledRun = @UnCompTime/@n ,
PreCompiledRun = @CompTime/@n;';
EXECUTE sp_executesql @SQL;
SET @Top *= 10;
END;
GO
The Results:
The “simple” procedure produced the following results (run time is shown in ms):
It's clear that the Var table is the "winner" for the UnCompiled runs: it's about two to three times as fast as the Staging and Temp tables. For the PreCompiled runs the Var table has only a slight advantage over the Temp table, but it's still about two to three times as fast as the Staging table.
The “complex” procedure produced the following results:
Now the winner depends on the type of procedure (simple or complex).
For the UnCompiled runs the Var table is still 1.5-2 times faster than the Staging and Temp tables. But for the PreCompiled runs the Temp table takes the lead and is about 3 times as fast as the Var and Staging tables.
So what’s going on?
Let’s have a closer look at the results and try to find out.
We’ll start with the simple procedure.
In this procedure we count how many values are divisible by 10. The execution plan for this kind of query is the same (a table scan) no matter how many records the table holds.
Since the plan is always the same, statistics have no meaning in this case, and updating them would be useless overhead.
This is where the table variable has an advantage over the other kinds of tables. It has no statistics, so the overhead for this kind of execution is kept to a minimum. In addition, SQL Server cannot build an estimated plan for the Temp table procedure in advance, but it can for the Var table: it never has to "wait" for statistics to be updated on the Var table (there will never be any), while it does for the Temp table. When a Temp table is part of the procedure, the plan is recompiled every time the procedure is executed. Again, useless overhead.
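You can see the statistics difference for yourself. Here is a minimal sketch (not part of the original test, and it assumes the database default of AUTO_CREATE_STATISTICS being on): querying a temp table with a selective predicate makes SQL Server auto-create column statistics in tempdb, while a table variable never gets any.
Transact-SQL
CREATE TABLE
#T
(
Value INT NOT NULL
);
INSERT INTO
#T
(
Value
)
SELECT TOP 1000
ABS(CHECKSUM(NEWID())%10000) + 1
FROM
sys.all_columns;
-- A selective predicate makes the optimizer auto-create statistics on Value
SELECT COUNT(*)
FROM #T
WHERE Value > 5000;
-- The auto-created statistics are visible in tempdb;
-- a table variable would have no entry here at all
SELECT s.name , s.auto_created
FROM tempdb.sys.stats AS s
WHERE s.object_id = OBJECT_ID('tempdb..#T');
DROP TABLE #T;
GO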
Now for the more complex procedure.
Here we join our table with another big table (1,000,000 records) before performing the count. Now statistics have a huge influence on the execution plan. For a small number of records, SQL Server chooses to perform the join using the "Nested Loops" method; for a large record count it chooses the "Hash Match" method.
When there are no statistics (the Var table), SQL Server "guesses" when choosing the execution plan, expecting only one record to be returned. For low record counts that guess is close enough; for higher record counts it becomes more and more incorrect.
This is why the Temp table procedure performs more or less the same as the Var table procedure up to 100,000 records, but is much more efficient for higher record counts.
Here are the execution plans for 100,000 records:
Temp table:
Variable table:
This is true for both UnCompiled and PreCompiled runs, but the overhead of recompiling is still great for the Staging and Temp tables.
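As a side note (my addition, not something the test above measured): a common way to help the optimizer when you must use a Var table is a statement-level recompile. Adding OPTION (RECOMPILE) lets SQL Server see the actual number of records in the table variable at execution time instead of guessing one record, at the cost of compiling the statement on every run. A minimal sketch against our test data:
Transact-SQL
DECLARE @Table TABLE
(
Value INT NOT NULL
);
INSERT INTO @Table
(
Value
)
SELECT Value
FROM dbo.DataTable
WHERE Id <= 1000000;
-- OPTION (RECOMPILE) lets the optimizer use the actual row count of @Table
-- (there are still no column statistics, but the cardinality is now correct)
SELECT COUNT(*)
FROM @Table AS Var
INNER JOIN dbo.DataTable AS Tbl
ON Var.Value = Tbl.Id
WHERE Tbl.Value%10 = 0
OPTION (RECOMPILE);
GO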
Conclusion
So… which is best?
The Staging table is clearly not the best choice. Having to keep statistics updated all the time is too big an overhead for every record count and every kind of procedure. In addition, there is the cost of truncating the table before (or after) every calculation. I did not measure this in my test, in order to keep the procedures as identical as possible; naturally, it makes the Staging table even less efficient.
The Var table seems to be the best choice for simple calculations, where there is only one reasonable way to execute the procedure. It is also the best choice when, for some reason, there is no execution plan in the cache, but that is a rare event for most procedures.
The Temp table is the best choice when the procedure can be executed in more than one way. Here, updating the statistics is an overhead worth paying.
Bottom Line
As a “rule of thumb” I would use the Temp table. Usually we already have the plan in cache, so we don’t have to worry about recompilation. The advantage the Var table has over the Temp table when it comes to simple calculations is minor compared to the huge advantage the Temp table has for complex ones.
Use the Var table only if you are sure SQL Server will choose the right execution plan. In any other case a Temp table is preferable.
The Catch
When a table variable is used inside a transaction, it behaves differently than a regular or temporary table does.
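As a small teaser, here is a minimal sketch of that difference: insert a row into each kind of table inside a transaction, roll the transaction back, and compare what is left.
Transact-SQL
CREATE TABLE
#Tmp
(
Value INT NOT NULL
);
DECLARE @Var TABLE
(
Value INT NOT NULL
);
BEGIN TRANSACTION;
INSERT INTO #Tmp (Value) VALUES (1);
INSERT INTO @Var (Value) VALUES (1);
ROLLBACK TRANSACTION;
-- The temp table insert is rolled back, but the table variable keeps its row
SELECT
TmpRows = (SELECT COUNT(*) FROM #Tmp) , -- 0
VarRows = (SELECT COUNT(*) FROM @Var); -- 1
DROP TABLE #Tmp;
GO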
Curious?? Wait for my next post…