Troubleshooting tempdb Space Usage
Tempdb is a critical resource in SQL Server. It is used internally by the database engine for many operations, and it might consume a lot of disk space.
In the past two weeks I encountered 3 different scenarios in which tempdb has grown very large, so I decided to write about troubleshooting such scenarios.
Before I describe the methods for troubleshooting tempdb space usage, let’s begin with an overview of the types of objects that consume space in tempdb.
There are 3 types of objects stored in tempdb:
A user object can be a temporary table, a table variable or a table returned by a table-valued function. It can also be a regular table created in the tempdb database. A common misconception is that table variables (@) do not consume space in tempdb, as opposed to temporary tables (#), because they are only stored in memory. This is not true. But there are two important differences between temporary tables and table variables, when it comes to space usage:
Indexes and statistics on temporary tables also consume space in tempdb, while indexes and statistics on table variables don’t. This is simply because you cannot create indexes or statistics on table variables. Well, you can create indexes as part of the table declaration, but this is not common.
The scope of a temporary table is the session in which it has been created, while the scope of a table variable is the batch in which it has been created. This means that a temporary table consumes space in tempdb as long as the session is still open (or until the table is explicitly dropped), while a table variable’s space in tempdb is deallocated as soon as the batch is ended.
Internal objects are created and managed by SQL Server internally. Their data or metadata cannot be accessed. Here are some examples of internal objects in tempdb:
Query Intermediate Results for Hash Operations
Sort Intermediate Results
Contents of LOB Data Types
Query Result of a Static Cursor
Unlike user objects, operations on internal objects in tempdb are not logged, since they do not need to be rolled back. But internal objects do consume space in tempdb. Each internal object occupies at least 9 pages (one IAM page and 8 data pages). Tempdb can grow substantially due to internal objects when queries that process large amounts of data are executed on the instance, depending on the nature of the queries.
Version stores are used for storing row versions generated by transactions in any database on the instance. The row versions are required by features such as snapshot isolation, after triggers and online index build. Only when row versioning is required, the row versions will be stored in tempdb. As long as there are row versions to be stored, a new version store is created in tempdb approximately every minute. These version stores are similar to internal objects in many ways. Their data and metadata cannot be accessed, and operations on them are not logged. The difference is, of-course, the data that is stored in them.
When a transaction that needs to store row versions begins, it stores its row versions in the current version store (the one that has been created in the last minute). This transaction will continue to store row versions in the same version store as long as it’s running, even if it will run for 10 minutes. So the size of each version store is determined by the number and duration of transactions that began in the relevant minute, and also by the amount of data modified by those transactions.
Version stores that are not needed anymore are deallocated periodically by a background process. This process deallocates complete version stores, not individual row versions. So, in some cases, it might take a while till some version store is deallocated. There are two types of version stores. One type is used to store row versions for tables that undergo online index build operations. The second type is used for all other scenarios.
There are 3 dynamic management views, which make the task of troubleshooting tempdb space usage quite easy.
The views are:
All 3 views return a column named “database_id”, so you might think that they return information for all the databases in the instance, right? Wrong!
The first view returns information about the current database only, whether it's tempdb or any other database. The other two return information for the tempdb database only, no matter from which database you reference them, so the value returned in the "database_id" column is always “2” (the database ID of tempdb). We will focus only on tempdb, anyway.
The first view (sys.dm_db_file_space_usage) returns space usage information for each data file in tempdb. It gives a high level distribution of the space occupied by tempdb. For example, when I ran the following query on my local instance of SQL Server 2019:
SELECT * FROM sys.dm_db_file_space_usage;
I received the following results:
As you can see, I have 8 data files for tempdb, each has 1,024 pages of 8 KB each, or a total size of 8 MB. For example, the first file contains 352 allocated pages and 672 unallocated pages, and out of the 352 allocated pages, 144 pages (in uniform extents) are reserved for user objects. The rest of the pages (216) are allocated from mixed extents.
In the last file (#9), for example, we can see that a single extent (8 pages) is allocated to user objects, and another extent is allocated to internal objects. All other extents are unallocated. We can also see that no version stores exist in tempdb.
When tempdb becomes unreasonably large, the first thing you will want to do is to figure out whether space is mainly occupied by user objects, internal objects or version stores. This view will give you the answer, plain and simple.
If you found out that the problem is related to user objects or internal objects, then the next step in troubleshooting will be to figure out who is occupying the space for these objects in tempdb. To answer this question, you will use the two other views mentioned above (sys.dm_db_session_space_usage and sys.dm_db_task_space_usage).
The first view returns the number of pages allocated and deallocated for user objects and internal objects by each session. The second view returns the same information for currently running tasks. For example, I created a new connection with session ID = 52. So far this session hasn’t performed any work, so the following query: