AFContainer

Efficiently inferring pairwise subtree prune-and-regraft adjacencies between phylogenetic trees. We develop a time-optimal (O(mn^2))-time algorithm to construct the subtree prune-regraft (SPR) graph on a collection of (m) phylogenetic trees with (n) leaves. This improves on the previous bound of (O(mn^3)). Such graphs are used to better understand the behaviour of phylogenetic methods and recommend parameter choices and diagnostic criteria. The limiting factor in these analyses has been the difficulty in constructing such graphs for large numbers of trees. We also develop the first efficient algorithms for constructing the nearest-neighbor interchange (NNI) and tree bisection-and-reconnection (TBR) graphs. These new algorithms are enabled by a change of perspective: rather than focusing on the trees and checking for pairs of adjacencies, we enumerate the potential adjacencies themselves in the form of structures called “agreement forests.” Indeed, two trees are adjacent in the graph if, and only if, they share an appropriately defined two-component agreement forest. We prove that this holds even in the case of unrooted trees, the first such result for unrooted SPR. To turn this observation into an efficient algorithm, we develop two tools: SDLNewick, the first unique string representation for agreement forests, and a new AFContainer data structure which efficiently stores tree adjacencies using such strings.