Index Structures in SQL Server

Author by Nick Adams

Why use indexes?

Just like a big 800 page text book, there is a lot of information, key words, and terms used in that book split up by chapters, pages, and lessons.  That is a lot to comb through if you are looking for a single term or group of terms.  An index in SQL Server is just like the index in the back of your textbook.  If you don’t have this index, then your text book is just an illogical pile of data that can only be discerned as information once read.  Your data is currently in a “Heap” of data.  But if you turn to the back of the book, you will find the index, your guide to finding key words in alphabetical order.  This is your “Non-Clustered Index.”  A “Clustered Index” on the other hand will organize the data based on the specific primary key which you identify, and thus a Clustered Index is AUTOMATICALLY CREATED when you create a primary key on your tables.  The data gets stored physically in a way that is unique and specific to that table.  You should always have primary keys on your tables for these reasons.

 

Index Structure

An Index is formed by a Root Level, Intermediate Level, and Leaf Level.

Index-Structure-General-(1).png

The Root and Intermediate levels are all index levels where lookups happen.  The Leaf level is a data level where individual pieces of data will be referenced once the lookup is complete.  This structure is called a "B-Tree."

 

Let’s look at how this works with a Clustered Index first.

Primary-key-index-clustered.png

Suppose we need to find the NVARCHAR or TEXT “MyDate” and we have a Primary Key column ‘ID’, a DATE column ‘Current Date’, and an NVARCHAR column ‘MyDateOrNot’.  The arrows posted will show how the Clustered Index will search for the data based on the index structure based off being organized by the primary key.  You can see the index will follow the data by row using the specific criteria of the lookup.

 

Now let’s find “MyDate” with a Nonclustered Index.

nonclustered-index.png

As you can see, the Nonclustered Index is using pointers based off an ordering of the data and the look up will perform a seek to find the correct data due to the ordering (Seek a seek will stop the lookup once the correct data is found; it does not continue to look at every row like a Scan does).

 

Now that you can see the structure of the two types of Indexes and how they work, you can appropriately use your indexes within your queries to optimize performance, particularly if your queries get heavy with JOIN operations.  You can use Clustered and Nonclustered index to help the query optimizer make decisions on what type of JOIN to use (Merge, Scan, or Hash) as well as cover your entire query within the WHERE and SELECT clauses based on what columns you are hitting.