Are you nuts? Index creation order stipulates optimizer index choice???
Some time ago I have been involved in index optimization exercise: applications change, queries evolve, sometimes there is a need to change existing indices – or adjust them a bit – to embrace the changes. Sounds like a routine stuff to me. Sounds like little to be worried about. Drop the old index. Build the new index instead. Neat and easy.
So I thought – before getting alerted by someone that when you re-create indices it is important to preserve index creation order to ensure it does not influence index selection by the optimizer.
My initial reaction was (yes) – are you nuts? How in the world index creation order (traced by indid) may possibly influence optimizer? If it indeed does index maintenance becomes a complete nightmare. Consider you have 10 indices on a table and you have to change structure for index with indid = 2. This will mean rebuilding all indices with indid > 2 just because index with indid = 2 is outdated! Large table? That sucks!
To my knowledge, the only influence indid has on optimizer is in selection of the indices that qualify for a query.
If you remember, when you trace optimizer’s index selection (old good trace flag 302) you can see that for each table optimizer runs through all the qualifying indices in indid order. So you’ll see something like this:
Beginning selection of qualifying indexes for table ‘X’,
Estimating selectivity of index ‘X.I1’, indid N
…
Estimating selectivity of index ‘X.I2’, indid N+M
…
All the rest of optimizer choices are based on selectivity, cluster ratio, &c. Indid bearing stops in the initial order of inspecting qualifying indices.
And yet – I can’t stop thinking that if someone has faced situation when index creation order has influenced the way optimizer approached a query there should be something in it. I’ve tried playing with SARG quality (giving the optimizer a chance to blunder over values it has to guess for indices with better selectivity that have been created last). I’ve tried playing with optimization timeout. Nothing seems to have helped: index selection has been based on selectivity &c. Sounded like a pretty strange bug to me.
Did any of you have faced this in your experience? Everyone has experienced optimizer making wrong choice (either because stats are bad, or because – it just happened to work strange + case opened). Otherwise why should there be a trace flag to force optimizer ignore explicit index forcing. But indid??? I’d be (unpleasantly) surprised to hear of it. At least for one customer it turned index maintenance into a pure nightmare. Are there more around?
Thanks.
Andrew
There are instances when the index id determines index choice. It used to happen a lot more with versions earlier than 15.7 but improvements have been made since then.
In earlier 15.x it was almost entirely cost based so if you had access via two indices and they were costed identically then the index with the lower index id would be selected purely as it was evaluated first. It would take no notice of the relevant index structure and *potential* difference in performance should the runtime criteria for the plan change. The main area this comes into play is during optimization of queries that return no results, i.e. optimization when tables are empty or dummy predicate values.
If you create an empty table with 3 columns and two indices that have (c1,c2,c3) and (c2,c1,c3) and search on c3 you'll find which index it picks will be determined by index id.
Before SP130, it didn't even differentiate between a matching and non-matching index scan if the costs were the same. They finally fixed this by adding a heuristic, cr748569 (enabled under ase157sp130). Before this even if you had an index with c3 as lead column it would pick the plan with the full index scan above the matching scan if the estimate cost were the same (and its index id were lower).
The table order in the query also can effect plan choice (but it isn't really supposed to). I have stumbled upon maybe 10 or 15 instances when a plan will flip dependent on the declared table order. There are multiple reasons for this but the main ones tend to be:
Thanks Simon. I was reported that there has been a case when indid has overridden the cost completely. To me this sounds unthinkable. Wanted to know if anyone else faced this stuff.
Andrew,
Sorry about taking so long to reply, but I have been busy with infrastructure problems. Here goes my two cents. I hope they help.
1. The optimizer can make incorrect choices if the statistics are not up to date on the
table´s indexes.
2. Having said that, when dropping and re-creating indexes, I would ALWAYS create
the CLUSTERED index first, then all remaining indexes.
3. Finally the only times when re-creating indexes in a different order MIGHT create a
problem are WHEN:
A) Queries, Stored Procedures, Applications, executing in-line SQL influence the
optimizer by FORCING index choice, usually by number.
E.g.
-- This would use the FIRST non clustered index
SELECT * FROM pubs2..authors (indid 1)
GO
IF the indexes are not recreated in the same order they were created at the time
the query compiled, the correct index may not be picked.
B) Abstract plans have been captured and associated with the query.
As in A), all objects names will have been boiled down to their ids, and
the index id may have changed.
I hope this makes sense to ya´ll.
Regards,
Jean-Pierre
I wonder who is the brave coder to hint by indid. I'd be a bit cautious with this approach 🙂 ...