SQL Server DBA Tips that Can Save Your Ass
On November 9th I presented a session in the PASS Database Administration Virtual Chapter. The title of the session was “SQL Server DBA Tips that Can Save Your Ass”. Sharon Dooley, the virtual chapter leader, kindly modified the last word to “A$$”, but the important thing is what’s in it for the DBA – 10 useful tips that can help you do your DBA work better.
There were 298 attendees, and I was told that this may be a record for a session in the PASS DBA Virtual Chapter. I didn’t expect so many attendees, and it was exciting for me to see the number of attendees going up, hitting 100, then 200, and then almost reaching 300. And there were also many great questions from the audience. Unfortunately, because there were so many great questions, I didn’t have enough time to go over all the materials, and also to answer all the questions. This session could easily span two hours, as many people wrote in the feedback. I spoke with Sharon about it, and hopefully I’ll be back for part 2 of this session sometime soon.
In the meanwhile, I would like to answer the questions that were asked during the session. I will order the questions according to the tips presented in the session…
Tip #1 – SQL Server Setup Best Practices
When you say “Install SQL Server on a Dedicated Machine” – do you mean dedicated physical server or dedicated virtual machine?
In a production environment, it is recommended to install SQL Server on a dedicated machine, which is intended only to run the SQL Server instance. Don’t install other application, such as SharePoint or TFS on the same server. This will allow SQL Server to max resource utilization and perform at its best. Of course, it’s important to estimate the correct amount of resources (CPU, memory and IO), and install SQL Server on an appropriate machine. If you have the appropriate amount of resources, then it doesn’t matter if it’s a physical machine or a virtual machine. If you use a virtual machine, then you do need to know how to configure the hypervisor in such a way that your virtual machine does not share resources with other virtual machines.
Do you recommend to use domain accounts or managed service accounts for SQL Server services?
Well, I got this one all wrong. My slide says you should use domain accounts. This is an old slide, and I haven’t updated the recommendation for quite a while. Sorry about that. But it’s never too late to fix it…
So before SQL Server 2012 and before Windows 2008 R2, the recommended approach was to use domain accounts. This allows for a central management of all SQL Server services across the domain, and it was more secured, because we had the ability to create a dedicated account for each service with only the required permissions. The alternative was to use local accounts (no central management) or built-in accounts (shared by many services and thus require too many privileges).
Windows 2008 R2 introduced Managed Service Accounts, and the first version of SQL Server to support it is SQL Server 2012. With Managed Service Accounts (MSA), you get automatic password management, which means strong passwords are set and replaced automatically, and you don’t need to deal with it. The other advantage of MSA is that it can automatically register an SPN for Kerberos authentication.
A single MSA can only be used on a single host, so it is not suitable for clustered environments, such as Failover Cluster Instances or Availability Groups. But if you’re running SQL Server 2016 on Windows 2012 and above, then you can use Group Managed Service Accounts (gMSAs), which can be used across multiple hosts.
So here is a summary of the service accounts best practice:
Before SQL Server 2012 or before Windows 2008 R2 – Use Domain Accounts
Standalone SQL Server 2012 and above on Windows 2008 R2 and above – Use MSAs
Clustered SQL Server (before 2016) – Use Domain Accounts
Clustered SQL Server 2016 on Windows 2012 and above – Use gMSAs
Is Instant File Initialization (IFI) necessary with flash storage?
Yes. It’s true that the impact of IFI is much lower with flash storage than with magnetic storage, but there is still an impact. And if you need to restore a very large database to a flash storage, then IFI can still save you a few precious minutes of downtime.
How important is it to set the min server memory as well as the max server memory?
It is important to distinguish between two scenarios. The first is when your SQL Server instance is the only meaningful process running on the server (the first tip on the slide). The second scenario is when you have multiple SQL Server instances or multiple processes running side by side on the same host.
In the former scenario, you still need to set the max server memory to a lower value than the total amount of physical memory. SQL Server is supposed to leave enough memory for the operating system, but unfortunately it doesn’t do a good job, and when the operating system suffers, everybody suffers, including your SQL Server instance. But there is no point in modifying the min server memory in this scenario.
In the latter scenario (multiple processes), you should set the max server memory for each SQL Server instance in order to balance the memory usage by the various processes. You can also set the min server memory to make sure that your instance gets to keep at least a minimum amount of memory. It’s a process of trial and error until you get to the point of balance between the various processes.
Tip #2 – Contention on Allocation Pages in tempdb
Does each data file in tempdb need to be on its own drive?
If you can afford to have enough drives for tempdb, then placing each data file on a separate drive will definitely improve IO performance, as multiple drives will be used in parallel. But even if all data files are located on the same drive, you will still get the benefit of multiple worker threads running in parallel, and you will also reduce contention on allocation pages, because those allocation pages are split between the files, regardless of the drive they are sitting on.
What is the recommended size of each tempdb data file?
Unfortunately, there is no easy answer to this question. tempdb is a critical resource in SQL Server, and it is used for so many things, such as temporary user objects, internal objects, row versions, cursors, index operations, and more. The recommended size of tempdb heavily depends on the size of the user databases, the workloads, the features used in the instance, and more.
But the one thing I can say is that you should create all data files with the same initial size and the same auto-growth configuration, and you should also enable trace flag 1117 (before SQL Server 2016) or use the AUTOGROW_ALL_FILES database option (SQL Server 2016), in order to make sure that all the data files continue to have the same size. This ensures optimal distribution of the data and load between the files.
What is the recommended RAID level for tempdb?
Since tempdb is a temporary database, storage redundancy is not as important as with user databases. Still, placing tempdb files on a RAID array, such as RAID 1, will ensure that SQL Server continues to operate even when a single disk fails.
tempdb is a heavy-write database, so it is recommended to place its files on RAID levels, which are optimized for write performance. These RAID levels are either RAID 0 (no redundancy) or RAID 10 (stripe of mirrors).
Does the RAID level for tempdb files matter when flash storage is used?
RAID 0 or RAID 10 configurations are optimized for heavy-write workloads, and they improve performance of such workloads, regardless of the type of disks. So even with flash storage, you will most probably get better performance from tempdb if it uses RAID 0 rather than RAID 5.
Tip #4 – The ARITHABORT Problem
Isn’t it lighter on SQL Server to use Extended Events instead of Profiler?
Absolutely. Extended Events has less overhead on SQL Server than SQL Trace of Profiler. It also has more functionality, such as global fields (actions) and multiple targets. Microsoft states that Profiler is deprecated, and that you should use Extended Events from now on.
I usually use Extended Events, including in my demos. I decided to use Profiler this time, because many DBAs are still not familiar with Extended Events, and I didn’t want to waste time on explaining what I’m doing with Extended Events, as this wasn’t the purpose of the session.
But you should definitely start using Extended Events and learn how to leverage its powerful features.
What are the best practices for the SET options?
The recommended settings, which are also required when dealing with indexes on computed columns or indexed views, are:
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT ON
SET QUOTED_IDENTIFIER ON
SET CONCAT_NULL_YIELDS_NULL ON
SET NUMERIC_ROUNDABORT OFF
How do we get the application to run as fast as SSMS?
In this tip, I demonstrated how the different setting of ARITHABORT between the application and SSMS causes each one to get a different execution plan, which in some cases can lead to a huge difference in performance. The setting of ARITHABORT has nothing to do with the different execution plan. It only caused a new execution plan to be generated. The fact that SSMS used a much better plan than the application in our demo was just a matter of luck, and this is what happens in most cases. The new plan that was generated for SSMS was based on some other factors, which were different than when the plan was first generated for the application. In most cases the difference is caused by parameter sniffing, where different parameter values were used by the application and by SSMS.
So we need to investigate and understand the reason why SSMS got a better plan. As soon as we understand that, we can think of ways to make sure that the application gets the same plan. For example, if it’s a matter of parameter sniffing, then we can use the OPTIMIZE FOR or the RECOMPILE query options.
Tip #8 – Delayed Durability
Is delayed durability set by table or database?
First, you need to set delayed durability at the database level. There are 3 options: DISABLED, ALLOWED and FORCED. When set to DISABLED (the default setting), all transactions in the database are fully durable, regardless of the transaction level setting. With FORCED, all transactions in the database are delayed durable, regardless of the transaction level setting. When set to ALLOWED, you can specify the durability of each transaction individually as part of the COMMIT statement.
There is no option to set durability level for a table.
Is it recommended to use Windows Authentication or SQL Server Authentication for the application?
Generally speaking, it is considered a best practice to use Windows Authentication over SQL Server Authentication, because it is more secured and it simplifies account management.
From the application perspective, it is considered a best practice to separate between the security context of the users and the security context in which the application server connects to the database. It provides more flexibility and ease of maintenance.
So the common approach is to create a dedicated account for the application. This can be a domain account (recommended) or a SQL Server account (common practice). Security within the application is then managed at the application level.
Is trace flag 4199 still required in SQL Server 2016?
Trace flag 4199 is used since SQL Server 2005 to collect query optimizer hotfixes that were intended to become on-by-default in a future release. There are many such hotfixes included in trace flag 4199 prior to SQL Server 2016. Before SQL Server 2016, you had to enable this trace flag in order to apply all the hotfixes. This was a risky operation, because it could produce unexpected results.
Microsoft changed the servicing model of trace flag 4199, and starting with compatibility level 130 (SQL Server 2016), all the hotfixes previously included in the trace flag are now enabled by default. The same trace flag is now used to collect new query optimizer hotfixes, which will be enabled by default in a future compatibility level.
For more information, see this Microsoft KB article.
Should I use multiple filegroups for user databases?
There are several reasons why you might want to consider using multiple filegroups for your user database, rather than just putting everything in the PRIMARY filegroup.
The first reason is performance. If you have multiple disks or multiple disk arrays, and you want to increase IO throughput, then you can split the tables and indexes between multiple filegroups, and place the data files within each filegroup on separate disks. You can also optimize the disk layout by placing objects that are commonly used together, such as frequently joined tables, or tables (heaps or clustered) and their corresponding non-clustered indexes, on different filegroups.
The second reason is availability. When a disaster strikes, and you need to restore a very large database, you can perform the restore in stages. As soon as you restore the PRIMARY filegroup, the database becomes online, and then you can continue to restore single filegroups, one of after the other. As soon as each filegroup restore completes successfully, the data on that filegroup becomes available.
The third reason for using multiple filegroups is storage tiering. If you have different types of storage, such as SSD and SATA, then you probably want to optimize the use of each storage type – put your hot data on SSD and the cold data on SATA. This method can be achieved by using multiple filegroups.
But if your user database performs well, not very large, and you only have a single type of storage, then you don’t necessarily need to use multiple filegroups. Putting everything in the PRIMARY filegroup is a good method for many user databases.
Should we separate non-clustered indexes and clustered indexes to different disks?
See the first reason (performance) for using multiple filegroups in the previous question…
Any guideline for preventing issues with parameter sniffing?
There are many guidelines for dealing with parameter sniffing issues. In fact, I wrote a whole series of blog posts about parameterization in SQL Server, so instead of answering here, I will direct you to the series. I also added a link to this series in the slide deck for your convenience.