ShapeSearch: A Flexible and Efficient System for Shape-based Exploration of Trendlines

ACM SIGMOD

Identifying trendline visualizations with desired patterns is a common task during data exploration. Existing visual analytics tools offer limited flexibility, expressiveness, and scalability for such tasks, especially when the pattern of interest is under-specified and approximate. We propose ShapeSearch, an efficient and flexible pattern-searching tool that enables the search for desired patterns via multiple mechanisms: sketch, natural language, and visual regular expressions. We develop a novel shape querying algebra, with a minimal set of primitives and operators that can express a wide variety of shape search queries, and design a natural-language and regex-based parser to translate user queries to the algebraic representation. To execute these queries within interactive response times, ShapeSearch uses a fast shape algebra execution engine with query-aware optimizations and perceptually-aware scoring methodologies. We present a thorough evaluation of the system, including a user study, a case study involving genomics data analysis, and performance experiments comparing against state-of-the-art trendline shape-matching approaches, which together demonstrate the usability and scalability of ShapeSearch.