Despite our efforts at making relational database systems such as SQL Anywhereself-managing, self-tuning, and self-healing, there remains the need to be able to diagnose and repair performance problems. In part, this requirement is due to the overall complexity of the optimization task. Query optimization is – still – an NP-hard problem and the input to the optimization process includes various heuristics, particularly predicate selectivity estimation. However, performance problems are also due to the increasingly complex hardware environment that characterize computing today. In this article, I want to highlight two recent papers that attempt to diagnose problems of this kind.
The first two articles [1,2] are work jointly authored by researchers at Duke University and IBM Almaden. The papers describe DIADS, a prototype performance diagnostic tool designed to assist in the diagnosis of performance problems when a database server – the authors use Postgres as a test system – utilizes a SAN (Storage Area Network) for its disk resources. The problem with a SAN is that it is a complex, independent system consisting of logical units of disk storage ( pools or volumes) which typically services multiple application/database servers simultaneously. All too often SAN administration is done independently, forcing DBAs to treat a SAN as a “black box”.
Borisov et al. developed a prototype diagnostic system called DIADS which uses annotated plan graphs to illustrate the use of SAN resources with specific query plan operators. Moreover, DIADS uses configuration dependency analysis and symptom signatures to drill down in the detail of specific annotated plans. DIADS has the ability to compare access plans for the same query, and computes metrics based on the deviation from the mean times for each operator. SAN monitoring data, including physical and logical configuration details, component connectivity, configuration changes over time, and DBA-defined events are included in annotated query plans, enabling DBAs to analyze the actual performance characteristics of specific plan operators together with detailed SAN statistics in a single display. DIADS contains a knowledge base as well, enabling expert-system analysis of plan operator performance degradation through the tracking of correlated and dependent plan operators, correlated operator cardinalities, and the inclusion of a symptoms database to help in the analysis of cause versus effect.
The second work is by Goetz Graefe, Harumi Kuno, and Janet Wiener of HP Labs. In their work, the authors study the problem of determining the level of robustness in a query execution engine – that is, the ability of a server to deliver consistent performance across a variety of unexpected run-time conditions: for example, errors in cardinality estimation or resource contention. The authors argue that query execution robustness is as important as the underlying fundamentals of the query operators themselves – and I couldn’t agree more.
The authors’ approach is to describe the robustness of a plan operator visually using plan robustness maps. These two- or three-dimensional graphs can then be used to reason about how particular execution strategies degrade as the amount of work increases or as system resources become constrained:
Reflecting on the visualization techniques employed here, these diagrams enable rapid verification of expected performance, testing of hypotheses, and insight into the absolute and relative performance of alternative query execution plans. Moreover, even for this very simple query, there is a plethora of query execution plans. Investigating many plans over a parameter space with multiple dimensions is possible only with efficient visualizations.
This work provides an interesting perspective of query execution performance that compliments other work that addresses optimization quality, or dynamic re-optimization of SQL requests on-the-fly: that is, the ability of the system’s optimizer to find the optimal plan for a specific set of system parameters.
 Nedyalko Borisov, Shivnath Babu, Sandeep Uttamchandani, Ramani Routray, and Aameek Singh (January 2009). Why did my query slow down?In Proceedings, 4th Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, California.
 Nedyalko Borisov, Shivnath Babu, Sandeep Uttamchandani, Ramani Routray, and Aameek Singh (February 2009). DIADS: Addressing the â€œMy-Problem-or-Yoursâ€ Syndrome with Integrated SAN and Database Diagnosis. In Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST’09), San Francisco, California.
 Goetz Graefe, Harumi Kuno, and Janet L. Wiener (January 2009). Visualizing the robustness of query execution. In Proceedings, 4th Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, California.