top of page

Foreign keys impact on inserts

In my previous post, I’ve demonstrated how Foreign keys can help the optimizer make better decisions when creating an execution plan. However, I’ve also mentioned that FKs make inserts and updates run slower. In this short post I’ll show the actual impact the FK has on these actions.

Why does the insert run slower ?

A foreign key is basically a constraint. It limits the data in the referencing column to a list of values found in the referenced table. This makes SQL Server check the data before inserting it, to make sure the constraint is not violated. This check takes time, but how much time ?

To demonstrate I’ve created 2 tables: NameList (populated with 100 Ids and names) and DataTable that will reference the NameList table.

At first I’ve created the DataTable without the FK.

Transact-SQL

-- Create the list table CREATE TABLE dbo.NameList ( Id INT NOT NULL , Name VARCHAR(10) NOT NULL , CONSTRAINT pk_NameId_c PRIMARY KEY CLUSTERED (Id) ); INSERT INTO dbo.NameList(Id, Name) SELECT TOP 100 ROW_NUMBER()OVER(ORDER BY T1.name) , 'Name_'+CAST(ROW_NUMBER()OVER(ORDER BY T1.name) AS VARCHAR(4)) FROM sys.all_columns AS T1 CROSS JOIN sys.all_columns AS T2; GO -- Create data table CREATE TABLE dbo.DataTable ( NameId INT NOT NULL ); GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

-- Create the list table

CREATE TABLE

dbo.NameList

(

Id INT NOT NULL ,

Name VARCHAR(10) NOT NULL ,

CONSTRAINT pk_NameId_c PRIMARY KEY CLUSTERED (Id)

);

INSERT INTO

dbo.NameList(Id, Name)

SELECT TOP 100

ROW_NUMBER()OVER(ORDER BY T1.name) ,

'Name_'+CAST(ROW_NUMBER()OVER(ORDER BY T1.name) AS VARCHAR(4))

FROM

sys.all_columns AS T1

CROSS JOIN

sys.all_columns AS T2;

GO

-- Create data table

CREATE TABLE

dbo.DataTable

(

NameId INT NOT NULL

);

GO

I’ve also created a third table (RandomData) from which data will be selected into the DataTable. This table contains 1,000,000 records that match the values in the list table.

Transact-SQL

-- Create random data SELECT TOP 1000000 ABS(CHECKSUM(NEWID())%100)+1 AS RandId INTO dbo.RandomData FROM sys.all_columns AS T1 CROSS JOIN sys.all_columns AS T2; GO

1

2

3

4

5

6

7

8

9

10

-- Create random data

SELECT TOP 1000000

ABS(CHECKSUM(NEWID())%100)+1 AS RandId

INTO

dbo.RandomData

FROM

sys.all_columns AS T1

CROSS JOIN

sys.all_columns AS T2;

GO

Inserting data without a FK:

First lets see how long it takes to insert 1,000,000 records into a table without a FK. To test this, I’ve run an insert statement 10 times and measured the AVG run time.

Transact-SQL

-- Insert with out a FK and check time IF EXISTS (SELECT * FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'dbo.fk_DataList') AND parent_object_id = OBJECT_ID(N'dbo.DataTable')) BEGIN ALTER TABLE dbo.DataTable DROP CONSTRAINT fk_DataList; END; GO DECLARE @StartTime DATETIME2 , @EndTime DATETIME2 , @i INT = 1, @n INT = 10; TRUNCATE TABLE dbo.DataTable; SET @StartTime = SYSDATETIME(); WHILE @i <= @n BEGIN INSERT INTO dbo.DataTable (NameId) SELECT RandId FROM dbo.RandomData; SET @i+=1; END; SET @EndTime = SYSDATETIME(); SELECT Avg_Insert_Time = CAST(CAST(DATEDIFF(MS,@StartTime,@EndTime) AS DECIMAL)/1000/@n AS DECIMAL(7,3)); GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

-- Insert with out a FK and check time

IF EXISTS (SELECT * FROM sys.foreign_keys

WHERE object_id = OBJECT_ID(N'dbo.fk_DataList')

AND parent_object_id = OBJECT_ID(N'dbo.DataTable'))

BEGIN

ALTER TABLE dbo.DataTable

DROP CONSTRAINT fk_DataList;

END;

GO

DECLARE

@StartTime DATETIME2 ,

@EndTime DATETIME2 ,

@i INT = 1,

@n INT = 10;

TRUNCATE TABLE dbo.DataTable;

SET @StartTime = SYSDATETIME();

WHILE @i <= @n

BEGIN

INSERT INTO dbo.DataTable (NameId)

SELECT RandId FROM dbo.RandomData;

SET @i+=1;

END;

SET @EndTime = SYSDATETIME();

SELECT Avg_Insert_Time = CAST(CAST(DATEDIFF(MS,@StartTime,@EndTime) AS DECIMAL)/1000/@n AS DECIMAL(7,3));

GO

I ran this test 12 times, and eliminated the best and worst results. The average run time was 4.68 sec.

The execution plan for the insert was very simple.


Inserting with a FK:

I run the same test, only this time I’ve created a FK before.

Transact-SQL

-- Add the FK TRUNCATE TABLE dbo.DataTable; ALTER TABLE dbo.DataTable ADD CONSTRAINT fk_DataList FOREIGN KEY (NameId) REFERENCES dbo.NameList (Id); GO -- Now insert with a FK and check time DECLARE @StartTime DATETIME2 , @EndTime DATETIME2 , @i INT = 1, @n INT = 10; TRUNCATE TABLE dbo.DataTable; SET @StartTime = SYSDATETIME(); WHILE @i <= @n BEGIN INSERT INTO dbo.DataTable (NameId) SELECT RandId FROM dbo.RandomData; SET @i+=1; END; SET @EndTime = SYSDATETIME(); SELECT Avg_Insert_Time = CAST(CAST(DATEDIFF(MS,@StartTime,@EndTime) AS DECIMAL)/1000/@n AS DECIMAL(7,3)); GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

-- Add the FK

TRUNCATE TABLE dbo.DataTable;

ALTER TABLE dbo.DataTable

ADD CONSTRAINT fk_DataList FOREIGN KEY (NameId)

REFERENCES dbo.NameList (Id);

GO

-- Now insert with a FK and check time

DECLARE

@StartTime DATETIME2 ,

@EndTime DATETIME2 ,

@i INT = 1,

@n INT = 10;

TRUNCATE TABLE dbo.DataTable;

SET @StartTime = SYSDATETIME();

WHILE @i <= @n

BEGIN

INSERT INTO dbo.DataTable (NameId)

SELECT RandId FROM dbo.RandomData;

SET @i+=1;

END;

SET @EndTime = SYSDATETIME();

SELECT Avg_Insert_Time = CAST(CAST(DATEDIFF(MS,@StartTime,@EndTime) AS DECIMAL)/1000/@n AS DECIMAL(7,3));

GO

This time the average run time was 6.30sec. Almost 35% more. I must say I didn’t think the impact is that significant.

The execution plan this time was very different. The data is sorted, and the list table is accessed as well.


Enjoying both worlds:

As we just saw, a FK can have significant impact on performance during inserts. Sometimes it’s a price we are willing to pay, but sometimes our system is overloaded as it is. What can we do to minimize the impact on inserts while still keeping the FK ?

One solution is to drop the FK right before the insert and then rebuild it. This way the insert itself is performed as fast as possible while we still keep our data consistent.

Transact-SQL

-- Dropping the FK before the insert and recrating it after it's finished DECLARE @StartTime DATETIME2 , @EndTime DATETIME2 , @i INT = 1, @n INT = 10; TRUNCATE TABLE dbo.DataTable; SET @StartTime = SYSDATETIME(); WHILE @i <= @n BEGIN ALTER TABLE dbo.DataTable DROP CONSTRAINT fk_DataList; INSERT INTO dbo.DataTable (NameId) SELECT RandId FROM dbo.RandomData; ALTER TABLE dbo.DataTable ADD CONSTRAINT fk_DataList FOREIGN KEY (NameId) REFERENCES dbo.NameList (Id); SET @i+=1; END; SET @EndTime = SYSDATETIME(); SELECT Avg_Insert_Time = CAST(CAST(DATEDIFF(MS,@StartTime,@EndTime) AS DECIMAL)/1000/@n AS DECIMAL(7,3)); GO

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

-- Dropping the FK before the insert and recrating it after it's finished

DECLARE

@StartTime DATETIME2 ,

@EndTime DATETIME2 ,

@i INT = 1,

@n INT = 10;

TRUNCATE TABLE dbo.DataTable;

SET @StartTime = SYSDATETIME();

WHILE @i <= @n

BEGIN

ALTER TABLE dbo.DataTable

DROP CONSTRAINT fk_DataList;

INSERT INTO dbo.DataTable (NameId)

SELECT RandId FROM dbo.RandomData;

ALTER TABLE dbo.DataTable

ADD CONSTRAINT fk_DataList FOREIGN KEY (NameId)

REFERENCES dbo.NameList (Id);

SET @i+=1;

END;

SET @EndTime = SYSDATETIME();

SELECT Avg_Insert_Time = CAST(CAST(DATEDIFF(MS,@StartTime,@EndTime) AS DECIMAL)/1000/@n AS DECIMAL(7,3));

GO

This time, the average run time was 5.03 sec, only 2.4% more than without using a FK. The only “danger” is this way there is a possibility that data that violates the FK can be inserted and we won’t be able to rebuild the FK. In order to avoid this we could simply use a try-catch and rollback in case to FK cannot be rebuilt.

0 comments

Comentários


STAY IN TOUCH

Get New posts delivered straight to your inbox

Thank you for subscribing!

bottom of page