Updated: Feb 2
This series of articles will cover the first steps you should take when entering a new environment, aiming to improve overall performance.
Since there are endless things and approaches, we will present the tools, methods, and tips and divide the possible scenarios you might run into and how to deal with any of them.
Each chapter will focus on a different aspect you should be familiar with when facing a performance issue, either a current crisis or a general "slowness." This guide is aimed at junior-intermediate experience level DBAs but might be interesting for experts as well.
In our previous post, we've discussed the analysis phase before actually doing anything to get a better perspective and plan your road.
This chapter will focus at:
Past, Present, Future – Different Strategies
The approaches of handling a scenario depend, first of all, on whether the problem happened in the past (with no guarantee it will happen again in a reasonable time from now)/ is happening now/ expected to happen again in the near future. Depending on that, you shall pick different tools that can provide you with the relevant information you lack. Past – The most challenging scenario to deal with. Unless there is any existing monitoring (3rd parties/ SQL-Server based), it will be hard to extract relevant data. In such cases, the main goal will be to find correlations with events that might result in slower performance/ timeouts.
Use DMVs that might reveal information regarding what happened. For example, sys.dm_os_wait_stats collects aggregative wait stats since server’s uptime. Another example - msdb.dbo.suspect_pages documents corruptions. If it happened not too long ago, maybe examining the current status will help too – use sys.dm_exec_requests to see any session that is currently active (request).
For deadlocks and events that are logged, you can use Extended Events' default sessions: "system_health" and "AlwaysOn_health." The former is on and collecting data by default once the server starts (unless someone changed it), and the latter is on once Always On is configured. These two save data, but not forever. Also, check SQL Server's logs.
Another tool, outside of SQL Server, for events analysis will be "Event Viewer." You might find "events" that might correlate with a performance issue that happened.
Links for DMVs, Extended Events, Logs, and Event Viewer below.
Present – When there is a performance issue *now*, the goal is to find the process at the database side, which is considered problematic at the app side. Once you detect the relatively slow session/s at the database side, you should seek the root cause – why is it running slower than usual. Before we dive into tools for analysis, it is also important to remind ourselves about the questioning phase – the analysis will differ depending on the conclusions from the questions asked. For example, if it is a sudden dramatic downgrade of general performance ("everything is stuck/slower"), then the analysis flow differs from a case of a specific procedure that takes 20 seconds instead of 5. It is also important to point out that even when the app suffers from performance, it is not necessarily because of the database.
Close Future – When we are lucky enough and can assume (or maybe even know) when the issue will happen again, we can use monitoring tools to catch it and analyze it precisely. Depending on the situation, we will always prefer to "be there" when it happens and see live information, as described in the "present" section. However, it is not always possible – predicting/knowing the exact time, out-of-work-hours, and more.
Tools and methods for monitoring:
SQL Server Profiler – An events-based monitoring tool, built-in with SQL server. It can monitor a wide variety of events. It can filter events by many methods such as plain text, objects, duration, and more. It can save the collected data to a table, file that can be read either by GUI or script and more. And can be executed by GUI or script. Its main downgrade is its performance. By executing through GUI, it can crash a production server. The performance cost reduces dramatically by executing via script and saving the data to a table instead of by GUI. However, it is still something that should be taken into consideration.
Extended Events, a newer tool for an events-based monitoring tool, is built-in with SQL Server. Its performance impact is minor compared to SQL Server Profiler, which has a greater variety of events to monitor with vaster functionalities. That is also its downside. It is harder to configure, read and analyze, though not too complex to get after a few experiences.
3rd Party monitoring tools – Not every organization has a monitoring product. When there is such a tool, it might help to understand its utilities and functionalities. It might be enough without needing to configure any other monitoring tool.
Self-configured monitor – Putting 3rd parties aside, since they cost money that not every organization has, there are things that the built-in monitoring tools won't provide you. For example, wait-stats of a request during its execution, its io performance stats progress, and more. Such data can be easily received by joining DMVs. Those queries can be stored as a procedure that can be executed periodically by a job that will insert data to a table dedicated to that, every 5 seconds, for example (querying on DMVs has minimum impact), and another job will handle history deletions. With minimal effort, you get a basic monitor that can provide you time slices of what happened every X seconds. It can be beneficial for basic or even deeper analysis. There are pr