Making good on my ‘exercise for the reader’…
(originally published at www.sybase.com in May 2010; moved to SCN in December 2012)
A few months ago, I wrote an article for the Q4, 2009 edition of the ISUG Technical Journal, on the topic of “Decoding Captured ‘Missing Statistics’ In ASE 15.0.3 ESD#1” (if you cannot find back your copy of the ISUG Tech Journal, the article is also available here).
The article describes a neat application of a SQL User-Defined function to decode binary-packed column IDs in the sysstatistics table, as part of a new feature to identify ‘missing statistics’. See the article for full details.
One of the things I mentioned in the article (on page 8, to be precise) was that the SQL code could be enhanced to determine whether the columns for which statistics are reported ‘missing’, are indexed or not. This is relevant since missing statistics could be caused by the column not being indexed at all, and that might be something worth looking into in the context of performance analysis.
I’m sure most readers have figured it out already, but for those who didn’t have time, here is the solution. By some small modifications to both the SQL UDF and the stored procedure which calls the UDF, the output now looks like this (the parts indicated in red are new; the last two columns are left off for readability):
1> my_db..sp_decode_missing_stats
2> go
Dbname Tabname NrRows ColumnList [...]
------ ----------- -------- -------------------------------------- [...]
mydb my_table 7800 col_a(not indexed!) [...]
mydb your_table 5 col_b(ncix3:col#2), col_c(ncix3:col#1) [...]
mydb other_table 54998666 col_d(cix:col#4) [...]
This output indicates the following:
- Column my_table.col_a, for which no histogram was found, is not part of any index. That could be fine, but it would be interesting to check out query plans involving this column to see if adding an index on this column, or adding the column to an existing index, would lead to improvements.
- Column your_table.col_b is the second column in the index named ncix3 (which, judging by the name is likely a non-clustered index). Column your_table.col_c is the first column in that index. No density statistics were found for the combination of these columns. Given the reported order, it might be worth creating an index on (col_b, col_c) and see if that affects the query plans in a positive way.
- Column other_table.col_d, for which no histogram was found, is the fourth column of what might be a clustered index. Given that the table is large, it would be recommended it create histograms on this column.
Note that when a column is part of multiple indexes, the index is shown where the column is placed most towards the beginning of the column order.
As you can see, combining the missing statistics with information about indexing provides some interesting starting points for further analysis.
The full SQL source code is below. The lines in red were added since the ISUG Tech Journal article, and perform the function of figuring out whether a column is indexed or not. There are some interesting things to remark about this code — see below:
1 use sybsystemprocs
2 go
3 create table #ixcols
4 (id int,
5 ixname longsysname,
6 colname longsysname,
7 n int)
8 go
9 create function sp_decode_colidarray
10 @colidarray varbinary(100), @id int
11 returns varchar (1500)
12 as
13 begin
14 declare @s varchar (1500) -- assuming this is long enough for the result
15 declare @len int, @colid int, @colname longsysname
16 declare @ixname longsysname, @n int, @ixcol longsysname
17
18 if datalength(@colidarray)%2 = 1
19 begin
20 set @colidarray = @colidarray + 0x00
21 end
22
23 set @len = 1
24 while @len < datalength(@colidarray)
25 begin
26 set @colid = convert(smallint, substring(@colidarray, @len, 2))
27 set @colname = col_name(@id, @colid)
28
29 -- determine which index this column is part of
30 select top 1 @n = n, @ixname = ixname
31 from #ixcols
32 where id = @id
33 and colname = @colname
34 order by id, colname, n
35
36 if @@rowcount = 0
37 set @ixcol = "not indexed!"
38 else
39 set @ixcol = @ixname + ',col#' + convert(varchar,@n)
40
41 set @s = @s + case @s when NULL then NULL else ', ' end
42 + @colname + '(' + @ixcol + ')'
43 set @len = @len + 2
44 end
45 return @s
46 end
47 go
48 drop table #ixcols
49 go
50
51
52 create proc sp_decode_missing_stats
53 @tabname varchar (100) = '%'
54 as
55 begin
56 declare @n int
57
58 select top 31 n=identity(int) into #n from syscolumns
59
60 -- Get a list of all indexed columns for the qualifiying table(s);
61 -- this list will be used inside the SQL function 'sp_decode_colidarray'
62 -- Note how a Cartesian product is deliberately used for table #n
63 select
64 id=o.id,
65 ixname=i.name,
66 colname=index_col(o.name, i.indid, #n.n, o.uid),
67 #n.n
68 into #ixcols
69 from sysindexes i, #n, sysstatistics s, sysobjects o
70 where s.id = o.id
71 and o.id = i.id
72 and o.type = "U"
73 and i.indid > 0 and i.indid < 255 /* this line updated June 23rd */
74
75 select
76 DBname = db_name(),
77 Tabname = object_name(s.id),
78 NrRows = row_count(db_id(), s.id),
79 ColumnList = dbo.sp_decode_colidarray(colidarray, s.id),
80 Captured = moddate,
81 Occurs = convert (smallint,c0)
82 into #missing
83 from sysstatistics s, sysobjects o
84 where s.id = o.id
85 and o.name like @tabname
86 and formatid = 110
87 and datalength(colidarray) > 0
88
89 exec sp_autoformat #missing,@orderby='order by 2'
90 end
91 go
First, let’s look at lines 60-73. To determine whether a column is part of an index, you must use the built-in function index_col() (line 66). Lines 60-73 essentially make a list of all indexed column for the qualifying tables by calling the function for the maximum of 31 indexed columns in every existing index (and immediately filtering out those which are NULL, meaning the index doesn’t contain that many columns). This list is placed in a #temporary table named #ixcols, generated by the stored procedure sp_decode_missing_stats.
Note how we deliberately use a Cartesian product on table #n to loop over the maximum of 31 indexed columns. Also note how this is based on putting sequence numbers 1..31 in a table on line 58 (we’re using syscolumns here since this table is guaranteed to always contain at least 31 rows).
Next, in the SQL UDF sp_decode_colidarray, we search through table #ixcols for every column found (lines 29-39), and we append the index name and column position (line 42). That’s pretty much it.
Note, though, how the combination of ‘select top 1’ on line 30 and the order-by clause on line 34 will get us the index where this column is placed most towards the beginning of the column order, in case the column is present in multiple indexes.
One last thing is worth noting. The table #ixcols must exist before the SQL UDF can be created (lines 3-7; otherwise you’ll get an error saying the table #ixcols is not found); and it must be dropped again directly afterwards (line 16). For stored procedures, this requirement has been relaxed by the introduction of the new feature ‘deferred name resolution’ in ASE 15.0.3 ESD#1. But unfortunately, that feature does not apply to SQL UDFs so you’ll still have to make sure the table is temporarily created and dropped in order to create the UDF.