What the heck is a BTree and why is it important to us?

I was recently discussing indexes with  some developers of mine and in our discussions the subject of a BTree came up.  You don’t really hear too much about these things every day, but they are very important to understand so I thought it would be a great blog post subject to write about today.  So, what the heck is a BTree and why is it so important to us?  Let’s first just focus on the word itself.  The meaning of the “B” in “BTree” has been debated throughout the community. Some have suggested “Balanced”, “Bayer”, “Boeing”, “broad”,  and even “bushy”.  Could it be “branch”?   Hmm.  It really doesn’t matter but the “tree” part is a bit more meaningful.   You see SQL Server stores its data like a tree.  There are different kinds of trees, B-Trees and B+Trees.   A B-Tree has the root node that contains the keys at the top and branches down to the leaf nodes to find all the data.  A B+tree is the opposite. B+Trees are therefore much easier and higher performing when performing full scans as you are already down to the leaf nodes. To do a full scan with a B-Tree you need to do a full tree traversal to find all the data.  B-Trees on the other hand can be faster when you do a seek (looking for a specific piece of data by key) especially when the tree resides in RAM or other non-block storage. Since you can elevate commonly used nodes in the tree there are less comparisons required to get to the data.  An easy way to remember is to think about it in terms of time.   Generally. tables with no indexes take longer to traverse as the data is stored as heaps in B+Trees.  Tables with indexes are typically faster as the data is stored in ordered hierarchies as a B-Tree.

heap takes more time B+   

Indexed is faster so B-

The following illustration I found on the internet somwhere but I do not recall where.  It does make it a lot easier to understand though. 

  btree