Madeira Team

Aug 19, 2012

In Case You Really Have To Use Cursors

Cursors (and, to a lesser extent, while loops) have a really bad reputation in the SQL Server world.

Basically, there are three good reasons that justify dealing with cursors:

  1. You are faced with one of the (very) rare problems where cursors are the best solution *

  2. You are dragged into a debate about which is better: a cursor or a while loop

  3. The code is already written and you don’t feel like changing it

* Starting from SQL Server 2012, even this problem can be solved better with a set-based solution using the new window functions.
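
For reference, here is a minimal sketch of that set-based approach: a windowed SUM with a ROWS frame computes the running total in a single statement. It assumes the myTable1 table that we create in the demo below.

[code lang="sql"]
-- Set-based running total (SQL Server 2012 and later).
-- Assumes the myTable1 table created later in this post.
SELECT id,
       SUM(number) OVER (ORDER BY id
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sumUpToId
FROM myTable1
ORDER BY id;
[/code]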

Here's a cool trick I learnt from Kevin Cox, a member of the SQLCat team, while he was demonstrating how to trace the system using XEvents:

To illustrate it, we’ll use the running totals problem (the problem from bullet #1).

First, let's create two tables and populate the first one:

[code lang="sql"]
CREATE TABLE myTable1 (id int IDENTITY PRIMARY KEY, number bigint)
CREATE TABLE myTable2 (id int IDENTITY PRIMARY KEY, sumUpToId bigint)
GO

INSERT INTO myTable1 (number)
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY o1.object_id)
FROM master.sys.objects AS o1
CROSS JOIN master.sys.objects AS o2
CROSS JOIN master.sys.objects AS o3
[/code]

Now examine the following two batches:

First one:

[code lang="sql"]
TRUNCATE TABLE myTable2

DECLARE @id int
SET @id = 1

DECLARE @sum bigint
SET @sum = 0

WHILE @id <= 100000
BEGIN
    SELECT @sum = @sum + number FROM myTable1 WHERE id = @id
    INSERT myTable2 (sumUpToId) SELECT @sum
    SET @id = @id + 1
END
[/code]

Second one:

[code lang="sql"]
TRUNCATE TABLE myTable2

BEGIN TRANSACTION

DECLARE @id int
SET @id = 1

DECLARE @sum bigint
SET @sum = 0

WHILE @id <= 100000
BEGIN
    SELECT @sum = @sum + number FROM myTable1 WHERE id = @id
    INSERT myTable2 (sumUpToId) SELECT @sum
    SET @id = @id + 1
END

COMMIT
[/code]

On my machine, the first batch took 21 seconds to execute, while the second one took only 2 seconds!

What is the explanation for the performance difference?

When not working with an explicit transaction, SQL Server works in autocommit mode, which means a separate transaction is automatically opened and committed for each one of the inserts in the loop. Every one of those commits forces SQL Server to flush the transaction log to disk, which adds a lot of overhead.

On the other hand, when we open an explicit transaction, we are in charge of handling the transaction: all of the inserts become part of a single transaction, so SQL Server doesn't have to flush the log after every single row, which allows it to work much faster.

Amit Banerjee checked this issue and found big differences between the two batches in the number of log flushes, the amount of log bytes flushed, and the number of log flush waits. Kendra Little also presented this and other interesting transaction log issues in this webcast.
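
If you want to see the difference on your own machine, one rough way to do it is to snapshot the cumulative log counters for the database before and after each batch and compare the deltas. This is just a sketch, and it assumes the log counters exposed by the Databases performance object:

[code lang="sql"]
-- Snapshot these cumulative counters before and after each batch,
-- then compare the deltas (the "/sec" names are raw cumulative values in this DMV).
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%:Databases%'
  AND instance_name = DB_NAME()
  AND counter_name IN ('Log Flushes/sec', 'Log Bytes Flushed/sec', 'Log Flush Waits/sec');
[/code]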

Before we get to cursors, here’s another small tweak:

In order to prevent the transaction from accumulating too many locks and triggering lock escalation, we will commit the transaction every 5,000 iterations and open a new one immediately:

[code lang="sql"]
TRUNCATE TABLE myTable2

BEGIN TRANSACTION

DECLARE @id int
SET @id = 1

DECLARE @sum bigint
SET @sum = 0

WHILE @id <= 100000
BEGIN
    SELECT @sum = @sum + number FROM myTable1 WHERE id = @id
    INSERT myTable2 (sumUpToId) SELECT @sum

    IF @id % 5000 = 0
    BEGIN
        COMMIT TRANSACTION
        BEGIN TRANSACTION
    END

    SET @id = @id + 1
END

COMMIT
[/code]

That way we increase the concurrency while keeping the performance boost we got.
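
If you're curious whether the lock count really stays low, you can peek at the locks held by the looping session from another connection while the batch runs. A quick sketch, where 55 is just a placeholder for the session id of the connection running the loop:

[code lang="sql"]
-- Run from another connection; replace 55 with the session id of the loop.
-- The count should climb toward ~5000 and then drop after each intermediate commit.
SELECT COUNT(*) AS locks_held
FROM sys.dm_tran_locks
WHERE request_session_id = 55;
[/code]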

So does it work with cursors?

Let’s check!

First batch, without an explicit transaction:

[code lang="sql"]
TRUNCATE TABLE myTable2

DECLARE @currentNumber bigint

DECLARE myCursor CURSOR FORWARD_ONLY READ_ONLY
FOR SELECT number FROM myTable1

DECLARE @sum bigint
SET @sum = 0

OPEN myCursor

FETCH NEXT FROM myCursor INTO @currentNumber

WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT @sum = @sum + @currentNumber
    INSERT myTable2 (sumUpToId) SELECT @sum
    FETCH NEXT FROM myCursor INTO @currentNumber
END

CLOSE myCursor
DEALLOCATE myCursor
[/code]

Execution time: 24 seconds

Second batch, with an explicit transaction:

[code lang="sql"]
BEGIN TRANSACTION

TRUNCATE TABLE myTable2

DECLARE @currentNumber bigint

DECLARE @rowsProcessed int
SET @rowsProcessed = 0

DECLARE myCursor CURSOR FORWARD_ONLY READ_ONLY
FOR SELECT number FROM myTable1

DECLARE @sum bigint
SET @sum = 0

OPEN myCursor

FETCH NEXT FROM myCursor INTO @currentNumber

WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT @sum = @sum + @currentNumber
    INSERT myTable2 (sumUpToId) SELECT @sum

    SET @rowsProcessed = @rowsProcessed + 1

    IF @rowsProcessed % 5000 = 0
    BEGIN
        COMMIT TRANSACTION
        BEGIN TRANSACTION
    END

    FETCH NEXT FROM myCursor INTO @currentNumber
END

CLOSE myCursor
DEALLOCATE myCursor

COMMIT
[/code]

Execution time: 3 seconds!

Remember this trick the next time you see a loop or a cursor.

#cursors #performance #While #transactionlog
