Troubleshooting SQL Server: A Guide for the Accidental DBA

Overview

imageFrom the moment I heard that Jonathan Kehayias and Ted Krueger were writing a book together (and that Gail Shaw was going to be doing the Technical Editing) I knew that I would be grabbing it as soon as possible.  As it happens, they released this one at the Pass Summit, so I was able to talk to all three after its’ release (albeit it briefly with Jonathan and Gail).  It did, however, allow me to act quite silly and run up to Ted like a gushing fan-boy asking for his autograph (this was based on advice of Jes Borland-Shultz); his response was great in that he turned bright red and started to groan and ask “are you kidding me?”.  Quite fun.

All kidding aside though, this book was a good primer on some of the more important topics that a DBA will face.  I have already started to recommend it to some of my co-workers and to the local user group.  One of the only things I have to nit-pick about the book is the title.  I don’t know that I would have put in “Accidental DBA” as this will surely help out lots of people who have willingly taken on the title, and it might cause some people who would get lots of good info here to steer clear.

Anther thing to note about the book is that you can grab an electronic copy for free! (http://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/).  Also, note the zip location for the corresponding code (http://www.simple-talk.com/RedGateBooks/JonathanKehayias/TroubleshootingSQLServer_Code.zip )

Chapter 1: A Performance Troubleshooting Methodology

In this chapter the reader is exposed to a series of steps that can be used to drill down into what problems the instance might currently be experiencing.  This starts by looking at a combination of the wait statistics, the virtual file statistics, several performance counters and the plan cache to get a feel for the state of the instance.  All of these areas have an explanation as to what the results will signify, and, more importantly, where we might dig in a little deeper to expose the root of the problems.

Chapter 2: Disk I/O Configuration

In this chapter, the authors present lots of great background information on storage.  There is a very good section presenting the basic information on the different RAID levels, the differences between direct attached storage and SAN, as well as some basic SAN terminology.  We are further exposed to some basic information on disk speeds, seek time, Random versus Sequential reads and writes, the patterns for data files and log files (as well as the special patterns for tempdb) and the concept of IOPS.  In addition, the authors mention a few different tools that might be helpful in measuring I/O Performance.  Finally, the chapter concludes with a few sections on diagnosing disk I/O issues and some common disk performance problems. 

Chapter 3: High CPU Utilization

Ted does a good job at giving us some quick hints for checking on high CPU utilization such as using Performance Monitor, capturing trace information and examining the dynamic management views.  However, the real meat of the chapter is the extensive list of topics that are covered which might be the cause of high CPU: Missing Indexes, Outdated Statistics, Search Argument usability, Implicit conversions, Parameter sniffing, non-parameterized queries, inappropriate parallelism, and Windows / BIOS Power saving mode.  In addition there are a lot of great links to get more background information on these and other items. 

Chapter 4: Memory Management

Jonathan goes into a good amount of detail explaining the concepts of how the SQL Engine allocates memory and more importantly how it deals with memory pressure.  He then goes into insane detail on the differences in 32 bit memory versus 64 bit memory and what the implications of each entail.  In fact, I would have to say that he perhaps goes a bit too far into the 32 bit memory as I hope that most people these days have left their 32 bit processors behind sometime in the mid 2000’s. 

After this explanation we are shown a few pointers for memory relating to trace flags and operating system settings before diving into various performance counters and DMVs that will  be useful in diagnosing memory problems.  Finally, we are shown several common memory related problems as well as their root causes (or it is noted that the symptoms reported are not really a problem).

Chapter 5: Missing Indexes

The chapter starts out with some basic information on indexes, key column ordering and the use of include columns before moving on to the tools that one might find useful in identifying indexes which will help the workload of the server.  This includes information on running traces, using the Data Tuning Advisor, looking at the missing index DMVs and examining query plans and the plan cache.  With that said, this chapter should have perhaps just been named “Indexes” as in addition to lots of great information on determining which indexes may be missing, we are also shown how we can identify where we have indexes which might not be necessary. 

Chapter 6: Blocking

The second of Ted’s chapters is a good introduction to some of the elements that every DBA will need to learn about: Isolation levels, locking and blocking.  After we have gone through these basics we are next shown how we can monitor for blocking within the database.  We are first shown how we might accomplish this using backwards compatible processes for SQL 2000 before moving on to the dynamic management views (do we sense a pattern with the DMVs here?  Yeah, they are extremely helpful in administering your systems…) before once again moving on to performance monitor.  Next we see some automated methods for capturing this information from using trace flags, to traces to event notifications and extended events. 

Finally, we get to the heart of the matter: How do we resolve these issues? Unfortunately, this book must succumb, like all of the others, to the fact that there are no silver bullets. Instead, we come to the understanding that resolving these issues will more than likely take a combination of many items: from database design (or redesign), to index tuning, to watching our isolation levels, and using query hints.  And, since every database is built just a little bit different on a different hardware configuration with differing front end applications, your experiences at tuning them will be different every time.  Which is part of the fun of being a DBA, right?

Chapter 7: Handling Deadlocks

This chapter builds on the previous chapter relating to blocking particularly in light of the fact that a deadlock is really just a special form of blocking.  As such, a lot of the techniques for detecting and dealing with these issues are going to be the same.  However, we do get a good explanation of what deadlocks are along with some great explanations of the methods that we use to gather information which is specific to deadlocks, in particular obtaining the deadlock graphs (and how to read them).  Finally, we look at some of the common causes of deadlocks and how we can alleviate them.

One of the key takeaways from this chapter, and possibly from this book, is in the source code that is available as a download for the book: The event notifications that can be used to capture deadlock information.  This is similar to one of the sessions that I have seen Jonathan give at an SQL Saturday event a few years back, which itself is probably worth your time to look over the slides and code (http://sqlsaturday.com/viewsession.aspx?sat=32&sessionid=1069). 

Chapter 8: Large or Full Transaction Log

This chapter once again starts with the basics; this time we learn about the transaction log, Virtual Log Files, sizing and growing of your log files, and understanding log truncation.  We look into why the transaction log might be growing, from rebuilding or reorganizing your logs, to long running and uncommitted transactions. Finally we look at how we can handle the “transaction log full” errors, and, perhaps more importantly, how we SHOULDN’T handle these errors.

Chapter 9: Truncated Tables, Dropped Objects and Other Accidents Waiting to Happen

This chapter walks through a few different methods of restoring databases using marked transactions, point in time and full along with some of the various options that the restore process can use to aid in examining the contents of the database.  These are all important as they relate to recovering data or dropped objects.  The chapter also briefly mentions source control, log recovery tools and determining who might have been the cause of the issues at hand.  Finally there is a brief mention of security, along with the principal of least privileges, and implementation of triggers to help prevent inadvertent data and / or schema modifications.

Appendix: What to Do When All Else Fails

A very quick look through forums, blogs, the Microsoft CSS, and Consultants.  There are probably some people that will not have heard of some of the fine blogs or forums listed, particularly if they are new to the DBA area.

I read this book between 2011-11-05 and 2011-11-25.

About these ads
This entry was posted in Books, SQL and tagged . Bookmark the permalink.

4 Responses to Troubleshooting SQL Server: A Guide for the Accidental DBA

  1. Wow! Thanks for the awesome review, Eric. And I still feel all red in the face from that autograph request :) Will probably never get used to that. I’m still simply amazed Jonathan asked me to work on this with him. I agree on the title also. I’ll be using this book myself for years to come. It has content that is valuable to even the most seasoned DBA.
    Thanks again. Made my night!

  2. DV says:

    Excellent review. Thank you. I’ll grab a copy.

  3. Pingback: Book Review: Troubleshooting SQL Server A Guide for the Accidental DBA | Salvo(z)

  4. SDdesh says:

    Well reviewed. I find myself reading every word in the book, having complete value. Wish I could fit the whole book in my memory :)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s