The dynamic optimality conjecture is perhaps the most fundamental open question about binary search trees (BST). It postulates the existence of an asymptotically optimal online BST, i.e. One that is constant factor competitive with any BST on any input access sequence. The two main candidates for dynamic optimality in the literature are splay trees [Sleator and Tarjan, 1985], and Greedy [Lucas, 1988, Munro, 2000, Demaine et al. 2009]. Despite BSTs being among the simplest data structures in computer science, and despite extensive effort over the past three decades, the conjecture remains elusive. Dynamic optimality is trivial for almost all sequences: the optimum access cost of most length-n sequences is Theta(n log n), achievable by any balanced BST. Thus, the obvious missing step towards the conjecture is an understanding of the 'easy' access sequences, and indeed the most fruitful research direction so far has been the study of specific sequences, whose 'easiness' is captured by a parameter of interest. For instance, splay provably achieves the bound of O(nd) when d roughly measures the distances between consecutive accesses (dynamic finger), the average entropy (static optimality), or the delays between multiple accesses of an element(working set). The difficulty of proving dynamic optimality is witnessed by other highly restricted special cases that remain unresolved, one prominent example is the traversal conjecture [Sleator and Tarjan, 1985], which states that preorder sequences (whose optimum is linear) are linear-time accessed by splay trees, no online BST is known to satisfy this conjecture. In this paper, we prove two different relaxations of the traversal conjecture for Greedy: (i) Greedy is almost linear for preorder traversal, (ii) if a linear-time preprocessing is allowed, Greedy is in fact linear. These statements are corollaries of our more general results that express the complexity of access sequences in terms of a pattern avoidance parameter k. Pattern avoidance is a well-established concept in combinatorics, and the classes of input sequences thus defined are rich, e.g. The k = 3 case includes preorder sequences. For any sequence X with parameter k, our most general result shows that Greedy achieves the cost n2α(n)O(k) where α is the inverse Ackermann function. Furthermore, a broad subclass of parameter-k sequences has a natural combinatorial interpretation as k-decomposable sequences. For this class of inputs, we obtain an n∗2O(k) bound for Greedy when preprocessing is allowed. For k = 3, these results imply (i) and (ii). To our knowledge, these are the first upper bounds for Greedy that are not known to hold for any other online BST. To obtain these results we identify an input-revealing property of Greedy. Informally, this means that the execution log partially reveals the structure of the access sequence. This property facilitates the use of rich technical tools from forbidden sub matrix theory. Further studying the intrinsic complexity of k-decomposable sequences, we make several observations. First, in order to obtain an offline optimal BST, it is enough to bound Greedy on non-decomposable access sequences. Furthermore, we show that the optimal cost for k-decomposable sequences is Theta(n log k), which is well below the proven performance of all known BST algorithms. Hence, sequences in this class can be seen as a 'candidate counterexample' to dynamic optimality. © 2015 IEEE.
Institute of Electrical and Electronics Engineers (IEEE), 2015. 410-423 p.
56th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2015, 17 October 2015 through 20 October 2015