Earlier this week, I delivered a presentation called “Query Progress Tracking in SQL Server” to my local user group.
The issue is that using this techniques, we have to jump back and forth between the execution plan and the information we have in the DMVs. What if we had a visual indication of where things are stuck on top of the execution plan?
Brent Ozar thankfully gave a shout out about Stackoverflow’s Data Explorer, which allows us to query their database. In addition to getting the data back, we can also see the execution plan of the query. Here’s how it looks:
StackExchange Execution Plan
And so I did. I wanted a way to mark the relevant nodes of a running query that currently incur a CXPacket wait, which in most cases implies that their underlying subtrees are currently being processed. This would help me focus on the parts that are relevant to those subtrees. I used my (pretty sloppy) programming skills and wrote a small utility that did that. I ran it and directed it to catch a session that queried a big version of AdventureWorks (produced by Adam Machanic). Here’s the outcome:
And if we dive in:
What we can see here is that the subtrees under the red-marked nodes are currently being processed. In plain English, we currently read data from the bigTransactionHistory table, while we already finished reading data from the bigProduct table. If this query does not perform good enough, the bigTransactionHistory part can be a good starting point.
Obviously this is a simple plan, but this technique can be insanely useful when you look at huge plans and have trouble figuring out where the problem is.
Here’s your part:
I can share the code, but I prefer not to The code of the utility can be found here. If you download it, please make sure to read the Readme file first. I’m not very proud of it, as it’s very sloppy and written just for the purpose of being a brain teaser. That’s where you come in: Write a decent application that accepts a session id of an active query and returns a graphical execution plan, marking the nodes that currently incur CXPacket waits, by either using the Data Explorer tool or any other method.
Taking it to the next level:
There’s a lot more that can be done. A proficient programmer can take the information from sys.dm_exec_query_profiles and show a visual indication of:
Pipes where the estimated number of rows is way off the actual
Uneven distribution of rows between threads
Most expensive part in terms of duration, number of rows, etc
By persisting the data, we can:
Analyse an execution in retrospect
Build a heat map of the most expensive parts in terms of rows, execution time, etc.
Show how the rows flow between the operators, visually.
So what do you say? Challenge accepted?