Parameterization Part 1: Plan Caching
Many times, when I perform query tuning, the problem that causes the query to perform badly is related, one way or another, to parameterization.
On one hand, parameterization is a fundamental subject in the heart of the query processor. It can affect query performance dramatically. On the other hand, it seems to me that this important subject is not covered enough and not understood well enough.
So I decided to start a series about parameterization. This is the first post in the series, and in this post I would like to write about plan caching. Understanding how execution plans are cached is essential in order to understand parameterization.
So please fasten your seat belt. Here we go…
SQL Server reserves a certain amount of memory for the plan cache. This is where execution plans (and a few other structures) are cached for future reuse. One thing that sometimes causes confusion in this regard is the distinction between a query (or statement) and a batch. The former is a single command that the query performs as an atomic unit, one at a time. The latter is a unit of execution, which can include any number of statements. SQL Server always works with batches. This is the unit that gets parsed, simplified, optimized, compiled, and finally – executed. This is the unit for which the optimizer generates an execution plan. So the plan cache stores execution plans – each one representing a batch.
You can view the contents of the plan cache by using the sys.dm_exec_cached_plans DMV. If you want to view also the batch text and the plan XML, then you can use the following query:
SELECT BatchText = BatchTexts.text , QueryPlan = BatchPlans.query_plan , ExecutionCount = CachedPlans.usecounts , ObjectType = CachedPlans.objtype , Size_KB = CachedPlans.size_in_bytes / 1024 FROM sys.dm_exec_cached_plans AS CachedPlans CROSS APPLY sys.dm_exec_query_plan (plan_handle) AS BatchPlans CROSS APPLY sys.dm_exec_sql_text (plan_handle) AS BatchTexts;
BatchText = BatchTexts.text ,
QueryPlan = BatchPlans.query_plan ,
ExecutionCount = CachedPlans.usecounts ,
ObjectType = CachedPlans.objtype ,
Size_KB = CachedPlans.size_in_bytes / 1024
sys.dm_exec_cached_plans AS CachedPlans
sys.dm_exec_query_plan (plan_handle) AS BatchPlans
sys.dm_exec_sql_text (plan_handle) AS BatchTexts;
Now that we cleared that up, let’s move on…
Whenever the query processor encounters a batch, it first checks whether it already has a plan in cache that can be reused. This is because the cost of analyzing and compiling the batch can be quite expensive. If there is already an execution plan for this exact batch that can be executed immediately, then it can save a lot of resources and time. How does the query processor search for the batch in cache? It doesn’t sound like a good idea to search and compare the text of all the batches in cache each time, does it? This is why a hash value is calculated for each batch and stored with the execution plan in cache. When a new batch is encountered, its hash value is calculated and compared to all the hash values already in cache. If a match is found, then the batch text itself is compared in order to verify that this is indeed the same batch (this is because multiple text values can potentially result in the same hash value).
Some other factors need to be compared as well. One of them is the set of values of several set options, such as ANSI_NULLS and ARITHABORT. There are 7 such set options that affect how queries are executed in the current session. Two sessions executing the exact same batch with different set options might produce different results, and this is why they have to go through distinct optimization processes and produce distinct execution plans. So when the query processor finds a plan in cache for a batch that is an exact match to the current batch it needs to execute, it still needs to compare the values of all the 7 set options between the current session and the cached plan (each cached plan is stored along with several attributes, among them are the set options).
Only if an exact match is found including all the relevant attributes, then the query processor can potentially use the cached plan. If not, then it has to go through the process of parsing, simplifying, optimizing and compiling a plan. This newly generated plan will then be placed in cache for future reuse (in most cases).
You probably noticed that I wrote “potentially”. Well, if you didn’t notice, then you’re too tired. Go make yourself a cup of coffee and then read the previous paragraph again… OK?
Are you with me? Good!
Once the query processor finds an execution plan in cache, it still has to verify that the plan is still valid. Why would a plan be invalid, you might ask? Well, there are many reasons for that. One reason, for example, is that the schema of the underlying tables and indexes has changed since the plan was generated. Maybe a column was added to the table; maybe an index was dropped… If there was a schema change of any type since the plan was generated, then the plan is n