We are opening another class of the “Advanced Programming in SQL Server” course on 28/08. You can find all the details here.
Here is an interesting “advanced programming” puzzle. The first one to solve the puzzle will win a free ticket to the course.
Are you ready? Let’s go…
Let’s first create a table and populate some data in it:
Transact-SQL
CREATE SCHEMA Sales; GO CREATE TABLE Sales.Orders ( Id INT NOT NULL IDENTITY(1,1) , DateAndTime DATETIME2(7) NOT NULL , OrderStatusId TINYINT NOT NULL , CONSTRAINT pk_Orders_nc_Id PRIMARY KEY NONCLUSTERED (Id ASC) ) GO CREATE NONCLUSTERED INDEX ix_Orders_nc_nu_DateAndTime ON Sales.Orders (DateAndTime ASC); GO INSERT INTO Sales.Orders WITH (TABLOCK) ( DateAndTime, OrderStatusId ) SELECT TOP (10000) DateAndTime = DATEADD (MINUTE , - ABS (CHECKSUM (NEWID ())) % (60 * 24 * 365 * 5) , SYSDATETIME ()), OrderStatusId = ABS (CHECKSUM (NEWID ())) % 8 + 1 FROM sys.all_columns t1 CROSS JOIN sys.all_columns t2 ORDER BY NEWID () ASC; GO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
CREATE SCHEMA
Sales;
GO
CREATE TABLE
Sales.Orders
(
Id INT NOT NULL IDENTITY(1,1) ,
DateAndTime DATETIME2(7) NOT NULL ,
OrderStatusId TINYINT NOT NULL ,
CONSTRAINT
pk_Orders_nc_Id
PRIMARY KEY NONCLUSTERED
(Id ASC)
)
GO
CREATE NONCLUSTERED INDEX
ix_Orders_nc_nu_DateAndTime
ON
Sales.Orders (DateAndTime ASC);
GO
INSERT INTO
Sales.Orders WITH (TABLOCK)
(
DateAndTime,
OrderStatusId
)
SELECT TOP (10000)
DateAndTime = DATEADD (MINUTE , - ABS (CHECKSUM (NEWID ())) % (60 * 24 * 365 * 5) , SYSDATETIME ()),
OrderStatusId = ABS (CHECKSUM (NEWID ())) % 8 + 1
FROM
sys.all_columns t1
CROSS JOIN
sys.all_columns t2
ORDER BY
NEWID () ASC;
GO
Let’s do a very common task as returning a subset of the data filtered according to the date. In these 2 examples I selected all the rows from the past 2 years. The only difference between the two SELECT statements is that the first uses SYSDATETIME whereas the second uses GETDATE.
Transact-SQL
SELECT Orders.Id , Orders.DateAndTime , OrderStatusId FROM Sales.Orders AS Orders WHERE Orders.DateAndTime > DATEADD (YEAR , -2 , SYSDATETIME ()); GO SELECT Orders.Id , Orders.DateAndTime , OrderStatusId FROM Sales.Orders AS Orders WHERE Orders.DateAndTime > DATEADD (YEAR , -2 , GETDATE ()); GO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SELECT
Orders.Id ,
Orders.DateAndTime ,
OrderStatusId
FROM
Sales.Orders AS Orders
WHERE
Orders.DateAndTime > DATEADD (YEAR , -2 , SYSDATETIME ());
GO
SELECT
Orders.Id ,
Orders.DateAndTime ,
OrderStatusId
FROM
Sales.Orders AS Orders
WHERE
Orders.DateAndTime > DATEADD (YEAR , -2 , GETDATE ());
GO
Now check out the actual execution plan for these two statements. The optimizer estimates the first query to return only one row:
While in the second query, the optimizer has no problem estimating a very close amount to the actual number of rows:
The question is: why the hell is there a difference between the estimated number of rows in the two statements?
You are invited to send me your answers either by replying to this post or by emailing to (noam AT madeira.co.il). The first one to send the correct answer will win a free ticket to our Advanced Programming in SQL Server course. In the course we are going to solve this puzzle as well as many other interesting programming techniques.
Good luck!
Comments