In SQL Server 2005, you can either rebuild an index or reorganize it.
The two commands do different things, and it is important to understand
those differences. At a very simple level, here is how you would
reorganize or rebuild an index:
CREATE TABLE DECIPHER_DATA (COL1 INT, COL2 NVARCHAR(10));
CREATE UNIQUE INDEX DECIPHER_DATA_IND_1 ON DECIPHER_DATA (COL2);
ALTER INDEX DECIPHER_DATA_IND_1 ON DECIPHER_DATA REORGANIZE;
ALTER INDEX DECIPHER_DATA_IND_1 ON DECIPHER_DATA REBUILD;
Here are the differences between the two:
1) An index rebuild works by creating a new copy of the index internally and, once that copy is complete, dropping the existing index, whereas an index reorganize physically reorders the leaf-level pages of the existing index in place.
2) During an index rebuild, the statistics on the index are also re-computed, just as when a new index is created. Reorganize, on the other hand, does not update the statistics. Reorganize essentially just swaps one page with another and thus does not need the free space that a rebuild needs for the new copy of the index. In fact, reorganize can free up some pages because it works in two phases: compaction and defragmentation. A reorganize can remove almost all of the logical fragmentation, but it cannot necessarily fix extent fragmentation, in which the previous and the next extents are not physically contiguous.
3) Another point to note is that an index (clustered or non-clustered) cannot be built online if it contains LOB data (text, ntext, varchar(max), nvarchar(max), varbinary(max), image and xml data type columns). The ALTER INDEX…REORGANIZE command shown above does the same job as DBCC INDEXDEFRAG, with one difference: it has some additional features, such as large object compaction (LOB_COMPACTION). Reorganize is always an online operation.
4) Regarding partitions: if an index has multiple partitions, you cannot rebuild a single partition online, but you can reorganize a single partition online. If you want to rebuild a multi-partition index online, you need to rebuild the entire index, which means rebuilding all of its partitions.
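The options mentioned in points 3 and 4 can be sketched against the sample index from the start of this post. This is only an illustration: the online rebuild assumes an edition that supports online index operations (Enterprise), and the partition number 2 in the commented-out statements is hypothetical, since the sample index is not partitioned.

```sql
-- Reorganize and compact large-object pages as well.
-- LOB_COMPACTION = ON is the default; it is spelled out here for clarity.
ALTER INDEX DECIPHER_DATA_IND_1 ON DECIPHER_DATA
    REORGANIZE WITH (LOB_COMPACTION = ON);

-- Rebuild the whole index online. This works for the sample index because
-- it has no LOB columns; it assumes an edition that supports ONLINE = ON.
ALTER INDEX DECIPHER_DATA_IND_1 ON DECIPHER_DATA
    REBUILD WITH (ONLINE = ON);

-- The statements below assume a partitioned index, which the sample index
-- is not, so they are left commented out. Partition number 2 is hypothetical.

-- Reorganize a single partition (an online operation):
-- ALTER INDEX DECIPHER_DATA_IND_1 ON DECIPHER_DATA
--     REORGANIZE PARTITION = 2;

-- Rebuild a single partition (offline only, per point 4 above):
-- ALTER INDEX DECIPHER_DATA_IND_1 ON DECIPHER_DATA
--     REBUILD PARTITION = 2;
```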
So, how frequently should one rebuild or reorganize? Like many answers in the IT field, it depends. It depends on the fill factor, on the amount of data that changes between rebuild/reorganize operations, and on the logical fragmentation value that you consider to be the threshold for forcing these operations.
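One way to ground that decision is to measure logical fragmentation with the sys.dm_db_index_physical_stats function. The query below is a minimal sketch for the sample table; the 5% and 30% cut-offs are commonly cited starting points and are an assumption here, not values derived from this post, so adjust them to your own threshold.

```sql
-- Variables are used for the IDs so the call also works where function
-- calls are not accepted directly as arguments to the DMF.
DECLARE @db_id SMALLINT, @object_id INT;
SET @db_id = DB_ID();
SET @object_id = OBJECT_ID(N'DECIPHER_DATA');

SELECT  OBJECT_NAME(ps.object_id)        AS table_name,
        i.name                           AS index_name,
        ps.partition_number,
        ps.avg_fragmentation_in_percent, -- logical fragmentation for indexes
        ps.page_count,
        CASE
            WHEN ps.avg_fragmentation_in_percent > 30 THEN 'REBUILD'    -- assumed threshold
            WHEN ps.avg_fragmentation_in_percent > 5  THEN 'REORGANIZE' -- assumed threshold
            ELSE 'NONE'
        END                              AS suggested_action
FROM    sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL, 'LIMITED') AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE   ps.index_id > 0                  -- skip heaps
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

Very small indexes often report high fragmentation that is not worth acting on, so it can make sense to also filter on page_count before scheduling any work.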
An additional question, raised by one of our colleagues, was whether the statistics on non-indexed columns also get re-computed when a rebuild is done. He was talking about the auto-created statistics (the ones you would have seen with names like _WA_sys_xxxx) and the ones that are explicitly created with the CREATE STATISTICS command. If we are rebuilding an index, does it make sense to also update those at the same time, especially if there is any correlation between them? Does that happen automatically upon a rebuild? The answer is no. It cannot happen automatically, since the correlation is not stored anywhere and those statistics are stored separately from the index statistics. In SQL Server 2008, there is a DATE_CORRELATION_OPTIMIZATION database SET option which can help improve the performance of queries in which two tables are inner joined and have correlated date/datetime columns. For example, PO_HDR might have ORDER_DATE and PO_DTL might have PACK_DATE, SHIP_DATE, DUE_DATE, etc. We will check whether a rebuild in that case forces a rebuild of the correlated index as well and, if no index exists, whether the stats are re-computed on those correlated columns when this option is on. We will post our results here once we are done with our tests.
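Since a rebuild leaves column statistics untouched, they have to be inspected and refreshed separately. The following is a minimal sketch for the sample table; whether FULLSCAN is the right sampling choice is an assumption and depends on table size and the maintenance window.

```sql
-- List every statistics object on the table, showing which ones were
-- auto-created (_WA_Sys_...) or user-created, and when each was last updated.
SELECT  s.name                              AS stats_name,
        s.auto_created,
        s.user_created,
        STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM    sys.stats AS s
WHERE   s.object_id = OBJECT_ID(N'DECIPHER_DATA');

-- Refresh only the column statistics; index statistics are skipped
-- because an index rebuild has already re-computed them.
UPDATE STATISTICS DECIPHER_DATA WITH FULLSCAN, COLUMNS;

-- Check whether DATE_CORRELATION_OPTIMIZATION is on for the current database.
SELECT  name, is_date_correlation_on
FROM    sys.databases
WHERE   database_id = DB_ID();
```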