Plan Bouquets: Query Processing without Selectivity Estimation
Selectivity estimates for optimizing OLAP queries often differ significantly from those actually encountered during query execution, leading to poor plan choices and inflated response times. We propose here a conceptually new approach to address this problem, wherein the compile-time estimation process is completely eschewed for error-prone selectivities. Instead, a small “bouquet” of plans is identified from the set of optimal plans in the query’s selectivity error space, such that at least one among this subset is near-optimal at each location in the space. Then, at run time, the actual selectivities of the query are incrementally “discovered” through a sequence of partial executions of bouquet plans, eventually identifying the appropriate bouquet plan to execute. The duration and switching of the partial executions is controlled by a graded progression of isocost surfaces projected onto the optimal performance profile. We prove that this construction results in bounded overheads for the selectivity discovery process and consequently, guaranteed worst-case performance. In addition, it provides repeatable execution strategies across different invocations of a query. The plan bouquet approach has been empirically evaluated on both PostgreSQL and a commercial DBMS, over the TPC-H and TPC-DS benchmark environments. Our experimental results indicate that, even with conservative assumptions, it delivers substantial improvements in the worst-case behavior, without impairing the average-case performance, as compared to the native optimizers of these systems. Moreover, the bouquet technique can be largely implemented using existing optimizer infrastructure, making it relatively easy to incorporate in current database engines. Overall, the bouquet approach provides novel guarantees that open up new possibilities for robust query processing.