16 May 2015

Assume I have a composite index on a table. While searching the data, will it use statistics of only first column or statistics of all the columns?

Lets understand this by an example(by the way I love to give such examples).

You have a travel book and it has description of all the beautiful places in the world. You do not want to read the entire book but straight away want to check out 2-3 destinations to make up your mind. Lets say you are looking for all the places to visit in the UK. So you will open the book and straightaway go to the country UK in the index and will quickly jump to that page. Right??

But then you have other criteria as well. You only want to visit the places in London, the zone should be between 1-3. 

Because you do not have any index on City i.e. London and zone i.e. between 1-3.You will reach the first page from where the details about UK starts(Index seek), but after that you will browse all the pages in that range(bookmark lookup). Painful isn't it! Imagine if in the same index you also have a sub index for cities and then within cities another sub index on zones. 

In a database when you create an index on 3 columns viz. Country, City and zone a collective statistics on the index will be created. In that you will see density of all the combinations i.e. (a)Country (b)Country and City (c) country, city and zone. But the histogram will be on the leftmost column i.e. Country because that is the first page in the book that you need to reach to.

That is the reason wise men say "Make sure that your left most column is as unique as possible"

No comments:

Post a Comment