KABOOM! A new suffix array based algorithm for clustering expression data.

Results: We introduce a new filter for string similarity which has the potential to eliminate the need for all-versus-all comparison in clustering of expression data and other similar tasks. Our filter is based on multiple long exact matches between the two strings, with the additional constraint that these matches must be sufficiently far apart. We give details of its efficient implementation using modified suffix arrays. We demonstrate its efficiency by presenting our new expression clustering tool, wcd-express, which uses this heuristic. We compare it to other current tools and show that it is very competitive both with respect to quality and run time.

Availability: Source code and binaries available under GPL at http://code.google.com/p/wcdest. Runs on Linux and MacOS X.
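
The filter idea in the abstract (several long exact matches that lie sufficiently far apart) can be illustrated with a small sketch. This is not the wcd-express implementation: it swaps the modified suffix arrays for a plain hash set of substrings, and the function names and the parameters `k`, `min_matches`, and `min_gap` are hypothetical, chosen only for illustration.

```python
def shared_kmer_positions(s, t, k):
    """Start positions in s of length-k substrings that also occur in t."""
    # Assumption: a hash set stands in for the suffix-array lookup.
    kmers_t = {t[i:i + k] for i in range(len(t) - k + 1)}
    return [i for i in range(len(s) - k + 1) if s[i:i + k] in kmers_t]


def passes_filter(s, t, k=8, min_matches=2, min_gap=16):
    """True if s and t share at least min_matches exact length-k matches
    whose start positions in s are at least min_gap apart.

    All thresholds are illustrative, not values from the paper.
    """
    count = 0
    last = None
    # Positions come back in increasing order, so greedily taking the
    # earliest admissible match maximises the number of spaced matches.
    for pos in shared_kmer_positions(s, t, k):
        if last is None or pos - last >= min_gap:
            count += 1
            last = pos
            if count >= min_matches:
                return True
    return count >= min_matches
```

Only pairs that pass such a filter would go on to a full similarity comparison. The hash set keeps the sketch short but builds an index per pair; a suffix array over all sequences, as the abstract describes, avoids that repeated work.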

References in zbMATH (referenced in 1 article)

  1. Sunita; Garg, Deepak: Extended suffix array construction using Lyndon factors (2018)