recently we faced an issue in one of our production boxes.
user complained that, the runtime of the job suddenly increased post a certain date.
we checked the execution plan, data volume processed, spool and many others but everything looked normal except for a change in execution plan.
we investigated the cause of change in execution plan.
we noticed that, statistics for table VTTK got updated and the job started slowing since then.
we then restored stats to the date when the performance is optimal, and issue is fixed.
below are the investigation steps we followed.
1. STAD shows high time on VTTP.
2. we traced the job using ST12.
trace pointed to the long running query on VTTP.
as checked in execution plan screenshot, stats got updated for index VTTK~A at 00:08 on 24.04.2015. and as seen in plan history, the plan got refreshed at 00:16:03 on 24.04.2015.
as the job is of 15 minute frequency, the stats update at 00:08 triggered an execution plan change from 00:16:53 ( for the next job).
in our landscape, we maintain job runtime history of all boxes in a table in a single box for analysis purpose.
below is the screenshot of that, as seen below , runtime of the job increased after the stats update.
as seen, the runtime increased since the stats update.
for reference, we checked a different box with same SQL ID and got a different execution plan, especially the join involving VTTP is a nested loop not hash join.
So ,we concluded that the execution plan got switched after stats update.
we have then tried to restore the old statistics when the performance was good and then job runtime is back to normal range 🙂
Oracle offers two methods to do this – Statistics restore and Statistics export. we used statistics restore method.
we restored the stats using the procedure dbms_stats.restore_table_stats.
before restoring we checked the available stats timestamps in DBA view DBA_TAB_STATS_HISTORY.
and chose the timestamp at which performance was good.
below is the job runtime history after the stats restore.