SQL Server Error Log Management
Recently, while I was working at 888.com, I noticed the DBAs there manage SQL Server logs on their instances in a different way than the default behavior. I think that this log management is a good practice, so I wanted to share it with you.
But, first, let’s start with some background…
SQL Server maintains its own log, also called “SQL Server Error Log”. This log contains messages describing informational and error events, similar to messages that you can find in Windows logs. In fact, many of the messages found in the SQL Server Error Log can also be found in the Windows Application Log. The SQL Server Error Log is a great place to find information about what’s happening on your database server.
SQL Server uses 7 log files to store these messages. One file serves as the current log file, and every new message is written to that file. The other 6 files are archived files, and they contain previous messages. Each time SQL Server is restarted, it recycles the files. What does it mean? First, it means that a new log file is created and becomes the new current log file. Second, the oldest log file (“Archive #6”) is deleted. And third, all the other log files are pushed back one step. The previous current log file becomes “Archive #1”, the previous “Archive #1” log file becomes “Archive #2”, and so on.
The log files are stored in the log folder of the instance. This folder is located by default in “Program FilesMicrosoft SQL ServerMSSQL10_50.MSSQLSERVERMSSQLLog” (for a SQL Server 2008 R2 instance). You can configure the location of the log folder during installation, and you can also change it using the “-e” startup parameter of the SQL Server service.
So far I described the default behavior of SQL Server. There are three problems with the default behavior of SQL Server. The first problem is that you have no control over the size of the log files. One file can be very large while another file can be very small. It would be easier to manage these files and work with them, if they had a more controlled and predictable size. The second problem is that you have no control over the recycling process. One file can contain two months of messages, while another file can contain only a few hours of messages. It’s difficult to find a specific message from a specific point in time. The third problem with the default behavior of SQL Server is that there are only 7 log files, and if recycling happens too often, then you might not have enough history.
In order to solve the first two problems mentioned above, all we need to do is to control the recycling process. The sys.sp_cycle_errorlog system stored procedure will do the job. Each time you call this stored procedure, SQL Server performs the recycling process as mentioned above. So what you can do is create a job that runs on a fixed schedule (let’s say – daily), which calls this stored procedure. This way, each log file will contain messages for a single day. The result is a more predictable size for the log files, and it’s also easier to locate messages from a specific day.
In order to solve the third problem (not enough history), all we need to do is to increase the number of log files. You can determine how many archive log files to keep (in addition to the current log file). The default is 6, but you can change it to any number between 6 and 99. One way to do it is by right-clicking the “SQL Server Logs” folder in SSMS and choosing “Configure”. You will be presented with the following dialog:
Check the checkbox and specify the number of archive log files you would like to have. For example, 888.com uses 30 log files, one for each day, so at any point in time they have a log history of 30 days.
Instead of using the GUI, you can also use the following statement:
EXECUTE sys.xp_instance_regwrite N'HKEY_LOCAL_MACHINE' , N'SoftwareMicrosoftMSSQLServerMSSQLServer' , N'NumErrorLogs' , REG_DWORD , 30; GO
Everything I wrote here is related to the SQL Server Error Log. Similarly, SQL Server Agent also maintains its own log in the same way. The only differences are:
There are 10 files instead of 7.
You can’t change the number of log files.
The log files are recycled each time SQL Server Agent is restarted.
You use the dbo.sp_cycle_agent_errorlog system stored procedure, located in msdb, to force a recycle.
You can configure the properties of the SQL Server Agent Error Log by expanding “SQL Server Agent” in SSMS, then right-clicking “Error Logs”, and then choosing “Configure”. You will be presented with the following dialog:
Here, you can specify the location of the SQL Server Agent error log files. By default, they are located in the same folder as the SQL Server error log files. You can also specify the format of the error logs (Unicode or not). And finally, you can choose the types of messages to be sent to the error log based on message severity. You can choose any combination of errors, warnings and informational messages. As you can see, there is no option to determine the number of log files. There are always 10 files.
So in order to summarize this post, I created a script that implements the recommended practice as described here.
The script does the following:
Sets the number of SQL Server log files to 30
Creates a job that runs every midnight and recycles the log files. The job contains two steps – the first recycles the SQL Server log files, and the second recycles the SQL Server Agent log files.
Next time you need to investigate a database failure that occurred 12 days ago between 04:00 and 06:00 in the morning, at least you know where to search for the error messages…
Another thing that the DBAs at 888.com implement on all their instances is trace flag 3226. Without this trace flag, every time a backup on the instance is completed successfully, an entry is written in the SQL Server Error Log. If you have frequent log backups of multiple databases, these entries can add up quickly and consume a lot of space in the logs. In this case, it would be a good idea to use trace flag 3226 as a startup parameter. This trace flag suppresses those successful backup entries.
I would like to thank Guy Hochman from 888.com for sharing this information with me.