top of page

Staging, temp and variable tables – performance, recompilation and what’s in between

A lot of times, when creating a procedure, we need to use some kind of an empty table in order to perform all kinds of calculations. The table is only needed for the calculation process itself, and has no other functionality otherwise. In most cases I choose to use a temporary table in order to do this, but I must admit that I’ve never given it much thought in terms of performance. It always seemed like the natural thing to do, but is it the best thing to do?

First, let consider our options:

  1. Staging table (dbo.MyTable)

  2. Temp table (#MyTable)

  3. Table variable (@MyTable)

In order to find out which one is the best choice, I ran a little test. I used all 3 types of tables in order to execute the same kind of procedure and measured the time it took to execute. I did this for 2 kinds of procedures (simple and complex calculation) and for various record sets from 10 records to 1,000,000 records. I’ve also cleaned the cache in order to make the procedures recompile and see how it affected performance for every type of table.

The test:

I first created some test data to be used throughout the test. The tables held 1,000,000 records of random values from 1 to 10,000. The table was indexed using a clustered primary key.

Transact-SQL

-- Test data table CREATE TABLE dbo.DataTable ( Id INT NOT NULL IDENTITY (1,1) , Value INT NOT NULL , CONSTRAINT pk_DataTable PRIMARY KEY CLUSTERED (Id) ); -- 1,000,000 recordes of test data INSERT INTO dbo.DataTable ( Value ) SELECT TOP 1000000 ABS(CHECKSUM(NEWID())%10000) + 1 FROM sys.all_columns AS T1 CROSS JOIN sys.all_columns AS T2; GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

-- Test data table

CREATE TABLE

dbo.DataTable

(

Id INT NOT NULL IDENTITY (1,1) ,

Value INT NOT NULL ,

CONSTRAINT pk_DataTable PRIMARY KEY CLUSTERED (Id)

);

-- 1,000,000 recordes of test data

INSERT INTO

dbo.DataTable

(

Value

)

SELECT TOP 1000000

ABS(CHECKSUM(NEWID())%10000) + 1

FROM

sys.all_columns AS T1

CROSS JOIN

sys.all_columns AS T2;

GO

I then created the Staging table. One column, no indexes, no constraints.

Transact-SQL

-- The Staging table CREATE TABLE dbo.StgTable ( Value INT NOT NULL ); GO

1

2

3

4

5

6

7

-- The Staging table

CREATE TABLE

dbo.StgTable

(

Value INT NOT NULL

);

GO

For the test itself I used three procedures, all doing exactly the same:

  1. Populating the Staging / Temp / Variable table with a certain amount of records.

  2. For the simple procedure – Count all of the values that are divided by 10.

  3. For the complex procedure – Join with the DataTable and then count the data values that are divided by 10.

Staging Table:

Transact-SQL

CREATE PROCEDURE dbo.usp_StgTable @Top INT , @Complex BIT AS BEGIN -- Insert X amount of records into the table INSERT INTO dbo.StgTable ( Value ) SELECT Value FROM dbo.DataTable WHERE Id <= @Top; IF @Complex = 0 -- Simple BEGIN SELECT COUNT(*) FROM dbo.StgTable WHERE Value%10 = 0; END; ELSE -- Complex BEGIN SELECT COUNT(*) FROM dbo.StgTable AS Stg INNER JOIN dbo.DataTable AS Tbl ON Stg.Value = Tbl.Id WHERE Tbl.Value%10 = 0; END; END; GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

CREATE PROCEDURE

dbo.usp_StgTable

@Top INT ,

@Complex BIT

AS

BEGIN

-- Insert X amount of records into the table

INSERT INTO dbo.StgTable

(

Value

)

SELECT Value

FROM dbo.DataTable

WHERE Id <= @Top;

IF @Complex = 0 -- Simple

BEGIN

SELECT COUNT(*)

FROM dbo.StgTable

WHERE Value%10 = 0;

END;

ELSE -- Complex

BEGIN

SELECT COUNT(*)

FROM dbo.StgTable AS Stg

INNER JOIN dbo.DataTable AS Tbl

ON Stg.Value = Tbl.Id

WHERE Tbl.Value%10 = 0;

END;

END;

GO

Temp Table:

Transact-SQL

CREATE PROCEDURE dbo.usp_TmpTable @Top INT , @Complex BIT AS BEGIN -- Insert X amount of records into the table SELECT Value INTO #Table FROM dbo.DataTable WHERE Id <= @Top; IF @Complex = 0 -- Simple BEGIN SELECT COUNT(*) FROM #Table WHERE Value%10 = 0; END; ELSE -- Complex BEGIN SELECT COUNT(*) FROM #Table AS Tmp INNER JOIN dbo.DataTable AS Tbl ON Tmp.Value = Tbl.Id WHERE Tbl.Value%10 = 0; END; DROP TABLE #Table; END; GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

CREATE PROCEDURE

dbo.usp_TmpTable

@Top INT ,

@Complex BIT

AS

BEGIN

-- Insert X amount of records into the table

SELECT Value

INTO #Table

FROM dbo.DataTable

WHERE Id <= @Top;

IF @Complex = 0 -- Simple

BEGIN

SELECT COUNT(*)

FROM #Table

WHERE Value%10 = 0;

END;

ELSE -- Complex

BEGIN

SELECT COUNT(*)

FROM #Table AS Tmp

INNER JOIN dbo.DataTable AS Tbl

ON Tmp.Value = Tbl.Id

WHERE Tbl.Value%10 = 0;

END;

DROP TABLE #Table;

END;

GO

Table Variable:

Transact-SQL

CREATE PROCEDURE dbo.usp_VarTable @Top INT , @Complex BIT AS BEGIN -- Insert X amount of records into the table DECLARE @Table TABLE ( Value INT NOT NULL ); INSERT INTO @Table ( Value ) SELECT Value FROM dbo.DataTable WHERE Id <= @Top; IF @Complex = 0 -- Simple BEGIN SELECT COUNT(*) FROM @Table WHERE Value%10 = 0; END; ELSE -- Complex BEGIN SELECT COUNT(*) FROM @Table AS Var INNER JOIN dbo.DataTable AS Tbl ON Var.Value = Tbl.Id WHERE Tbl.Value%10 = 0; END; END; GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

CREATE PROCEDURE

dbo.usp_VarTable

@Top INT ,

@Complex BIT

AS

BEGIN

-- Insert X amount of records into the table

DECLARE @Table TABLE

(

Value INT NOT NULL

);

INSERT INTO @Table

(

Value

)

SELECT Value

FROM dbo.DataTable

WHERE Id <= @Top;

IF @Complex = 0 -- Simple

BEGIN

SELECT COUNT(*)

FROM @Table

WHERE Value%10 = 0;

END;

ELSE -- Complex

BEGIN

SELECT COUNT(*)

FROM @Table AS Var

INNER JOIN dbo.DataTable AS Tbl

ON Var.Value = Tbl.Id

WHERE Tbl.Value%10 = 0;

END;

END;

GO

I used the following scripts to run the test and to create the table where the results were saved.

Transact-SQL

CREATE TABLE dbo.TableTypeTest ( TableType VARCHAR(10) NOT NULL , Complex BIT NOT NULL , RowCnt INT NOT NULL , UnCompiledRun INT NOT NULL , CompiledRun INT NOT NULL ); GO

1

2

3

4

5

6

7

8

9

10

CREATE TABLE

dbo.TableTypeTest

(

TableType VARCHAR(10) NOT NULL ,

Complex BIT NOT NULL ,

RowCnt INT NOT NULL ,

UnCompiledRun INT NOT NULL ,

CompiledRun INT NOT NULL

);

GO

The script was run twice for every table type, once for simple and the other for complex (six times in all). The script calculates the average time it took for the procedure to execute (average over 20 runs per procedure). This was done twice every cycle, once when the cache is freed before every run to make the procedure recompile every time, and the other when the execution plan is in cache.

The number of records populating the table was multiplied by 10 after every cycle (10, 100, 1000,…, 1000000) .

Transact-SQL

DBCC FREEPROCCACHE; DECLARE @Top INT = 10 , @Complex BIT = 0 , -- 1 @Type VARCHAR(10) = 'Stg' , -- 'Tmp', 'Var' @SQL NVARCHAR(MAX); WHILE @Top <= 1000000 BEGIN SET @SQL = ' DECLARE @i INT , @n INT = 20 , @Time0 DATETIME2 , @Time1 DATETIME2 , @UnCompTime INT = 0 , @CompTime INT = 0; SET @i = 1; WHILE @i <= @n BEGIN TRUNCATE TABLE dbo.StgTable; DBCC FREEPROCCACHE; SET @Time0 = SYSDATETIME(); EXECUTE dbo.usp_'+@Type+'Table '+CAST(@Top AS VARCHAR(10))+' , '+CAST(@Complex AS VARCHAR(10))+'; SET @Time1 = SYSDATETIME(); SET @UnCompTime += DATEDIFF(ms,@Time0,@Time1) SET @i += 1; END; SET @i = 1; WHILE @i <= @n BEGIN TRUNCATE TABLE dbo.StgTable; SET @Time0 = SYSDATETIME(); EXECUTE dbo.usp_'+@Type+'Table '+CAST(@Top AS VARCHAR(10))+' , '+CAST(@Complex AS VARCHAR(10))+'; SET @Time1 = SYSDATETIME(); SET @CompTime += DATEDIFF(ms,@Time0,@Time1) SET @i += 1; END; INSERT INTO dbo.TableTypeTest ( TableType , Complex , RowCnt , UnCompiledRun , PreCompiledRun ) SELECT TableType = '''+@Type+''' , Complex = '+CAST(@Complex AS VARCHAR(10))+' , RowCnt = '+CAST(@Top AS VARCHAR(10))+' , UnCompiledRun = @UnCompTime/@n , PreCompiledRun = @CompTime/@n;' EXECUTE sp_executesql @SQL; SET @Top *= 10; END; GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

DBCC FREEPROCCACHE;

DECLARE

@Top INT = 10 ,

@Complex BIT = 0 , -- 1

@Type VARCHAR(10) = 'Stg' , -- 'Tmp', 'Var'

@SQL NVARCHAR(MAX);

WHILE @Top <= 1000000

BEGIN

SET @SQL = '

DECLARE

@i INT ,

@n INT = 20 ,

@Time0 DATETIME2 ,

@Time1 DATETIME2 ,

@UnCompTime INT = 0 ,

@CompTime INT = 0;

SET @i = 1;

WHILE @i <= @n

BEGIN

TRUNCATE TABLE dbo.StgTable;

DBCC FREEPROCCACHE;

SET @Time0 = SYSDATETIME();

EXECUTE dbo.usp_'+@Type+'Table

'+CAST(@Top AS VARCHAR(10))+' ,

'+CAST(@Complex AS VARCHAR(10))+';

SET @Time1 = SYSDATETIME();

SET @UnCompTime += DATEDIFF(ms,@Time0,@Time1)

SET @i += 1;

END;

SET @i = 1;

WHILE @i <= @n

BEGIN

TRUNCATE TABLE dbo.StgTable;

SET @Time0 = SYSDATETIME();

EXECUTE dbo.usp_'+@Type+'Table

'+CAST(@Top AS VARCHAR(10))+' ,

'+CAST(@Complex AS VARCHAR(10))+';

SET @Time1 = SYSDATETIME();

SET @CompTime += DATEDIFF(ms,@Time0,@Time1)

SET @i += 1;

END;

INSERT INTO

dbo.TableTypeTest

(

TableType ,

Complex ,

RowCnt ,

UnCompiledRun ,

PreCompiledRun

)

SELECT

TableType = '''+@Type+''' ,

Complex = '+CAST(@Complex AS VARCHAR(10))+' ,

RowCnt = '+CAST(@Top AS VARCHAR(10))+' ,

UnCompiledRun = @UnCompTime/@n ,

PreCompiledRun = @CompTime/@n;'

EXECUTE sp_executesql @SQL;

SET @Top *= 10;

END;

GO

The Results:

The “simple” procedure produced the following results (run time is shown in ms):


It’s clear that the variable table is the “winner” for the UnCompiled runs. It’s about twice to three times as fast as the Staging and Temp table. For the PreCompiled runs the Var table has a slight advantage over the Temp table, but it’s still about twice to three times as fast as the Staging table.

The “complex” procedure produced the following results:


Now the winner depends on the type of procedure (simple or complex).

For the UnCompiled runs the Var table is still 1.5-2 times faster than the Staging and Temp tables. But for the PreCompiled runs the Temp table takes the lead and is about 3 times as fast as the Var and Staging tables.

So what’s going on?

Let’s have a closer look at the results and try to find out.

We’ll start with the simple procedure.

In this procedure we count how many values are divided by 10. The execution plan for this kind of query would be the same (Table scan) no matter how many records the table holds.


Since the plan is always the same, statistics have no meaning in this case and updating it would be a useless overhead.

This is where the table variable has an advantage over other kinds of tables. It has no statistics, and the overhead for this kind of execution is kept minimal. In addition, SQL Server cannot make an estimated plan to the Temp table procedure, but is able to do so for the Var table. It doesn’t have to “wait” for the statistics to be updated for the Var table (there will never be any), but it needs to do so for the Temp table. When a Temp table is part of the procedure, the plan is recompiled every time the procedure is executed. Again, a useless overhead.

Now for the more complex procedure.

Here we join our table with another big table (1,000,000 records), before the count is performed. Now statistics have a huge influence on the execution plan. For a small amount of records, SQL Server chooses to perform the join using the “Nested Loops” method. However, for a large record count it chooses the “Hash Match” method.

When there are no statistics (Var table) SQL Server is “guessing” when it comes to choosing the right execution plan. For lower record counts, it guesses correctly having expecting only 1 record to return. For higher record counts its guess is becoming more and more incorrect.

This is why the Temp table procedure is more or less the same as the Var table procedure up to 100,000 records, but is much more efficient for higher record counts.

Here are the execution plans for 100,000 records:

Temp table:


Variable table:


This is true for both UnCompiled and PreCompiled runs, but the overhead of recompiling is still great for the Staging and Temp tables.

Conclusion

So… which is best?

The Staging table is clearly not the best choice. Having to keep statistics updated all the time is too big of an overhead for every kind of record count and every kind of procedure. In addition, there is the cost of truncating the table before (or after) every calculation. This was not measured in my little test to make the procedure as identical as possible. Naturally, this will make using a Staging table even more inefficient.

The Var table seems to be the best choice for simple calculation when there is only one way to execute the procedure. It is also the best choice when for some reason, there is no execution plan in the cache, but this is a rare event for most procedures.

The Temp table is the best choice when the procedure can be executed in more than one way. Here, updating the statistics is an overhead worth paying.

Bottom Line

As a “rule of thumb” I would use the Temp table. Usually we already have the plan in cache, so we don’t have to worry about recompilation. The advantage the Var table has over the Temp table when it comes to simple calculations is minor compared to the huge advantage the Temp table has for complex ones.

Only if you are sure SQL will choose the right execution plan, use the Var table. In any other case a Temp table is preferable.

The Catch

When using a table variable inside a transaction, its behavior is different than a regular or a temporary table.

Curious?? Wait for my next post…

0 comments

Comments


STAY IN TOUCH

Get New posts delivered straight to your inbox

Thank you for subscribing!

bottom of page