What the heck is a BTree and why is it important to us?

I was recently discussing indexes with  some developers of mine and in our discussions the subject of a BTree came up.  You don’t really hear too much about these things every day, but they are very important to understand so I thought it would be a great blog post subject to write about today.  So, what the heck is a BTree and why is it so important to us?  Let’s first just focus on the word itself.  The meaning of the “B” in “BTree” has been debated throughout the community. Some have suggested “Balanced”, “Bayer”, “Boeing”, “broad”,  and even “bushy”.  Could it be “branch”?   Hmm.  It really doesn’t matter but the “tree” part is a bit more meaningful.   You see SQL Server stores its data like a tree.  There are different kinds of trees, B-Trees and B+Trees.   A B-Tree has the root node that contains the keys at the top and branches down to the leaf nodes to find all the data.  A B+tree is the opposite. B+Trees are therefore much easier and higher performing when performing full scans as you are already down to the leaf nodes. To do a full scan with a B-Tree you need to do a full tree traversal to find all the data.  B-Trees on the other hand can be faster when you do a seek (looking for a specific piece of data by key) especially when the tree resides in RAM or other non-block storage. Since you can elevate commonly used nodes in the tree there are less comparisons required to get to the data.  An easy way to remember is to think about it in terms of time.   Generally. tables with no indexes take longer to traverse as the data is stored as heaps in B+Trees.  Tables with indexes are typically faster as the data is stored in ordered hierarchies as a B-Tree.

heap takes more time B+   

Indexed is faster so B-

The following illustration I found on the internet somwhere but I do not recall where.  It does make it a lot easier to understand though. 

  btree

Graphical Execution Plan

The execution plan is the method by which SQL Server accesses the data that resides in the underlying tables.  It examines your query then determines the most efficient way to retrieve the data.  Execution plans are compiled and stored with stored procedures, so SQL Server will know the best way to access data when a given stored procedure is called.  SQL Server provides a GUI that displays how the execution plan is used to retrieve the data.  When developing procedures, you should utilize the graphical execution plan to assist in writing the most efficient stored procedures.  Please keep the following tips in mind when examining the graphical execution plan:

  • If you would like to see the execution plan, but not execute the query, choose “Display Estimated Execution Plan”.  This will show you an estimate of the actual execution plan.  Note that this will not work if temp tables are used because the temp tables cannot be parsed until runtime.  If you use a table variable, however, the estimated execution plan will return successfully.
  • Graphical Execution plans are read from right to left, and the arrows on each step correspond to the number of rows returned for that step.  The fatter the arrow, the more rows returned.  The execution plan eventually merges into one last step on the left side of the plan.  Thick arrows should appear on the right of the execution plan and get smaller as the plan moves left.
  • Execution plans are broken down into smaller substeps.  Each substep is displayed as a percentage of runtime versus the query as a whole.  For a long running query, identify the substep with the highest percentage and begin tuning at that step.
  • Each action within the substep is displayed as a percentage of runtime versus the substep as a whole.  For long running queries, look to tune the long running actions.
  • If you move your cursor over any of the displayed actions, a pop up window appears with more detailed information on that particular step.
  • If you see the below in an execution plan, look to modify the query so that these actions are eliminated or reduced.  It’s not always possible to avoid them, but the more they can be prevented, the better performance will be:
    • Index or table scans: May indicate a need for better or additional indexes
    • Bookmark Lookups:  Consider changing the current clustered index, consider using a covering index, reduce the number of columns in the SELECT statement
    • Filter: Remove any functions in the WHERE clause , don’t include Views in your Transact-SQL code,  may need additional indexes
    • Sort: Does the data really need to be sorted?  Can sorting be done at the client more efficiently?
  •  While I/O and CPU cost don’t directly refer to distinct measurements (they are internal SQL counters), higher I/O and CPU cost values in the execution plan utilize more resources than lower values
  •  Another cause of Bookmark lookups are the use of “SELECT *”
  •  Sometimes SQL Server will need to create a temporary worktable in the tempdb database.  Anytime a worktable is used, performance is hurt because of the additional I/O for maintaining the worktable. 
  • The graphical execution plan will warn you that a worktable is being used when the following actions are displayed:
    • Index Spool
    • Row Count Spool
    • Table Spool

Auto-Magically visualize your Perfmon and Profiler results

Troubleshooting SQL Server issues usually involves setting up a perfmon and profiler trace and then trying to identify the issues based on particular times and high reads or writes etc.  These have been industry standard ways to diagnose issues, but did you know you can integrate the two to get the best of bost worlds in a pretty little interface?  In this post I would like to go over how you can integrate the two together and dynamically see what is causing the various spikes to occur.

 

Setting up Perfmon

Open up PerfMon by going to Control Panel -> Administrative tools -> Performance.  Next go to the Performance Logs and Alerts section. Right click Counter Logs and select New Log Settings. Type a name for the log. 

 

 Add all the counters you want to monitor with the Add Counters button.  I typically like to include the following:

MemoryAvailable MBytes
 MemoryCache Bytes
 MemoryPages/sec
 MemoryPool Paged Resident Bytes
 MemorySystem Cache Resident Bytes
 MemorySystem Code Resident Bytes
 MemorySystem Driver Resident Bytes
 PhysicalDisk(_Total) Disk Write Time
 PhysicalDisk(_Total)Avg. Disk Bytes/Read
 PhysicalDisk(_Total)Avg. Disk Bytes/Write
 PhysicalDisk(_Total)Avg. Disk Queue Length
 PhysicalDisk(_Total)Avg. Disk Read Queue Length
 PhysicalDisk(_Total)Avg. Disk Write Queue Length
 PhysicalDisk(_Total)Disk Read Bytes/sec
 PhysicalDisk(_Total)Disk Write Bytes/sec
 Processor(_Total)Access MethodsCount Lob Readahead
 SQLServer:Access MethodsFull Scans/sec
 SQLServer:Access MethodsPage Splits/sec
 SQLServer:Access MethodsWorktables Created/sec
 SQLServer:Buffer ManagerBuffer cache hit ratio
 SQLServer:Cursor Manager by Type(_Total)Cursor memory usage
 SQLServer:Cursor Manager by Type(_Total)Cursor worktable usage
 SQLServer:Databases(_Total)Active Transactions
 SQLServer:Databases(_Total)Log Flushes/sec
 SQLServer:Databases(_Total)Percent Log Used
 SQLServer:Databases(_Total)Transactions/sec
 SQLServer:General StatisticsActive Temp Tables
 SQLServer:General StatisticsProcesses blocked
 SQLServer:General StatisticsUser Connections
 SQLServer:LatchesLatch Waits/sec
 SQLServer:LatchesTotal Latch Wait Time (ms)
 SQLServer:Locks(_Total)Lock Timeouts/sec
 SQLServer:Locks(_Total)Lock Wait Time (ms)
 SQLServer:Locks(_Total)Lock Waits/sec
 SQLServer:Locks(_Total)Number of Deadlocks/sec
 SQLServer:Memory ManagerConnection Memory (KB)
 SQLServer:Memory ManagerLock Memory (KB)
 SQLServer:Memory ManagerSQL Cache Memory (KB)
 SQLServer:Memory ManagerTarget Server Memory (KB)
 SQLServer:Memory ManagerTotal Server Memory (KB)
 SQLServer:SQL StatisticsSQL Compilations/sec
 SQLServer:SQL StatisticsSQL Re-Compilations/sec

 

Go to the Log Files tab and you can choose the log file type I typically use the Binary File but you can use any means you would like.  Hit Configure button and specify the filename and location

 

On the third tab you can schedule the start time etc.

When you click OK your log starts gathering data immediately. The status is shown with the green icon in the list next to log name in the list.

Set up our SQL Profiler Trace

Next we start our profiler trace.  You can do that by clicking  Start–> Programs–>Microsoft SQL Server–> Performance Tools–>SQL Profiler.

When it opens up go to File -> New Trace. Choose the server you wish to monitor. A window to set the Trace properties will open. 

In the General tab check the save to file box and select the file and location.

For the events you can choose all the ones you are interested in.  There are so many things you can profile for but I typically use just 4 and I include all their columns (You must make sure to include both start and end times or else you will not be able to correlate):

 

Once you have all your data collected you need to stop both the profiler trace and the perfmon trace. 

 Now its time to merge the two together.

To do so make sure you have closed all your profiler windows and then reopen your profiler trace.  (If you do not close Profiler and reload the trace some options will be greyed out.)

Next click File–>Import Performance Data–> and then select your Perfmon trace file.

 

 

Once you do so the two will automagically be linked together and you can select the counters you want to include.

 

 

Conclusion

When you integrate the PerfMon stats with the SQL Profiler Trace data together you can easilly determine the exact SQL Statements that were running when you experiences certain bottlenecks.  Try it out yourself and see how helpful it is. 

Cool Trick to clean up Dynamic SQL readability issues

As a DBA, I usually do not recommend using dynamic sql as a best practice.  Some of the reasoning behind this is because of it’s SQL injection flaws as well as its inability to cache plans.  My biggest pet peeve with Dynamic SQL though is that it is just so hard to read and debug.  I have spend countless hours trying to figure out just where that missing ‘  was supposed to go.  I came across and interesting article the other day by Sean McCown  and he showed a few really powerful and easy ways to change all of that.  I tried his solutions myself and they worked so well that I decided to create an example and blog about it myself. 

In the first trick we can just declare a variable and set that variable to the ASCII value for single ‘  and append that on either side of a ‘ mark.  This works for just about everything.  Then there is another trick that you can use to interactively change the parameter while you are writing your code blocks.  This trick uses the quotename function.  Try out the examples below and see for yourself how cool these tricks are and how much of a difference they make in cleaning up the readability issues.   

--Trick 1
--This is a nice trick to make the dynamic strings more readable
declare @Tick char (1)
set @Tick = CHAR(39) --ascii code for single quote (same as '''')
print @Tick + 'today' + @Tick


--Trick 2
--This trick does the same thing but shows SQL is smart enough to know it can close the quote by itself
declare @Tick char (1),
@lname varchar(20)
set @lname= 'Martin'
set @Tick = '[]' --you can alternate using '[' with ']' or '}' etc getting same resultset
print 'Where Lastname = ' + quotename(@Lname, @Tick)
--quotename just says surround variable with whatever is declared for @tick
--and @tick is optional and could be left out entirely(the '[' is default and optional) see below as an example
print 'And Mydate = ' + quotename(Getdate())
--you can also mix and match the quotes in the same batch try to replace ')' with Char(39) etc below
print 'Where Lastname = ' + quotename(@Lname, ')')