Microsoft SSIS and Sql Server Notes: Clustered and Non Clustered Indexes

Monday, January 30, 2012

Clustered and Non Clustered Indexes

Clustered Index

A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages. A non-clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a non-clustered index does not consist of the data pages. Instead, the leaf nodes contain index rows

Only 1 allowed per table

Physically rearranges the data in the table to conform to the index constraints

For use on columns that are frequently searched for ranges of data

For use on columns with low selectivity

Non-Clustered Index

A nonclustered index is analogous to an index in a textbook.

Non-clustered indexes are created outside of the database table and contain a sorted list of references to the table itself.
However, it’s important to keep in mind that non-clustered indexes slow down the data modification and insertion process, so indexes should be kept to a minimum

Up to 249 allowed per table in SQL 2000,2005 and upto 999 allowed in SQL 2008

Creates a separate list of key values with pointers to the location of the data in the data pages

For use on columns that are searched for single values

For use on columns with high selectivity

In following example column FileName is varchar(400), which will increase the size of the index key bigger than it is allowed. If we still want to include in our cover index to gain performance we can do it by using the Keyword INCLUDE.

USE AdventureWorks
GO
CREATE INDEX IX_Document_Title
ON Production.Document (Title, Revision)
INCLUDE (FileName)

Non-key columns can be included only in non-clustered indexes. Columns can’t be defined in both the key column and the INCLUDE list. Column names can’t be repeated in the INCLUDE list. Non-key columns can be dropped from a table only after the non-key index is dropped first. For Included Column Index to exist there must be at least one key column defined with a maximum of 16 key columns and 1023 included columns.

Avoid adding unnecessary columns. Adding too many index columns, key or non-key as they will affect negatively on performance. Fewer index rows will fit on a page. This could create I/O increases and reduced cache efficiency. More disk space will be required to store the index. Index maintenance may increase the time that it takes to perform modifications, inserts, updates, or deletes, to the underlying table or indexed view.

Another example to test:
Create following Index on Database AdventureWorks in SQL SERVER 2005

USE AdventureWorks
GO
CREATE NONCLUSTERED INDEX IX_Address_PostalCode
ON Person.Address (PostalCode)
INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID)
GO

Test the performance of following query before and after creating Index. The performance improvement is significant.

SELECT AddressLine1, AddressLine2, City, StateProvinceID, PostalCode
FROM Person.Address
WHERE PostalCode BETWEEN '98000'
AND '99999';
GO

One of the hardest tasks facing database administrators is the selection of appropriate columns for non-clustered indexes. You should consider creating non-clustered indexes on any columns that are frequently referenced in the WHERE clauses of SQL statements. Other good candidates are columns referenced by JOIN and GROUP BY operations.

You may wish to also consider creating non-clustered indexes that cover all of the columns used by certain frequently issued queries. These queries are referred to as “covered queries” and experience excellent performance gains.
SQL Server provides a wonderful facility known as the Index Tuning Wizard which greatly enhances the index selection process. To use this tool, first use SQL Profiler to capture a trace of the activity for which you wish to optimize performance. You may wish to run the trace for an extended period of time to capture a wide range of activity. Then, using Enterprise Manager, start the Index Tuning Wizard and instruct it to recommend indexes based upon the captured trace. It will not only suggest appropriate columns for queries but also provide you with an estimate of the performance increase you’ll experience after making those changes!

When Non Clustered Indexes should be used:

http://www.sql-server-performance.com/2007/nonclustered-indexes/

Before you create nonclustered indexes, understand how your data will be accessed. Consider using nonclustered indexes for:

Columns that contain a large number of distinct values, such as a combination of last name and first name (if a clustered index is used for other columns). If there are very few distinct values, such as only 1 and 0, most queries will not use the index because a table scan is usually more efficient.

Queries that do not return large result sets.

Columns frequently involved in search conditions of a query (WHERE clause) that return exact matches.

Decision-support-system applications for which joins and grouping are frequently required. Create multiple nonclustered indexes on columns involved in join and grouping operations, and a clustered index on any foreign key columns.

Covering all columns from one table in a given query. This eliminates accessing the table or clustered index altogether.

References: http://blog.sqlauthority.com/2007/04/23/sql-server-understanding-new-index-type-of-sql-server-2005-included-column-index-along-with-clustered-index-and-non-clustered-index/

Index tuning: http://www.brentozar.com/archive/2009/07/tuning-tip-identify-overlapping-indexes/

Covered Index: http://blog.sqlauthority.com/2010/03/09/sql-server-improve-performance-by-reducing-io-creating-covered-index/

Monday, January 30, 2012

Clustered and Non Clustered Indexes

No comments: