Squeakr: an exact and approximate k-mer counting system. Squeakr is a k-mer-counting and multiset-representation system built on the recently introduced counting quotient filter (CQF) (Pandey et al., 2017), a feature-rich approximate membership query (AMQ) data structure. Squeakr is memory-efficient, consuming 1.5X–4.3X less memory than the state of the art. It offers competitive counting performance (it is in fact faster for larger k-mers) and answers queries about a particular k-mer over an order of magnitude faster than other systems. The Squeakr representation of the k-mer multiset is immediately useful for downstream processing (e.g., de Bruijn graph traversal) because it supports fast queries and dynamic k-mer insertion, deletion, and modification. k-mer counts can be validated by hooking into the C++-level query API. An example query program is also available in "kmer_query.cc".
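To make the notion of a k-mer multiset with point queries and dynamic updates concrete, the following is a minimal sketch in Python. It is not Squeakr's API and does not use a counting quotient filter: an ordinary hash map (collections.Counter) stands in for the CQF, so the counts are exact and the memory savings described above do not apply. The function names (canonical, count_kmers, query, remove_kmer) are hypothetical, and counting each k-mer together with its reverse complement (canonical form) is an assumption made here because it is a common convention in k-mer counting.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = set("ACGT")


def canonical(kmer: str) -> str:
    """Return the smaller of a k-mer and its reverse complement (assumed convention)."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(seq: str, k: int) -> Counter:
    """Build an exact multiset of canonical k-mers from one sequence."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= _BASES:
            counts[canonical(kmer)] += 1
    return counts


def query(counts: Counter, kmer: str) -> int:
    """Point query: how many times was this k-mer (or its reverse complement) seen?"""
    return counts[canonical(kmer)]


def remove_kmer(counts: Counter, kmer: str) -> None:
    """Dynamic deletion: drop one occurrence, if any is present."""
    key = canonical(kmer)
    if counts[key] > 0:
        counts[key] -= 1


if __name__ == "__main__":
    counts = count_kmers("ACGTACGT", 3)
    print(query(counts, "CGT"))  # 4: ACG and CGT are reverse complements
    remove_kmer(counts, "ACG")
    print(query(counts, "CGT"))  # 3 after deleting one occurrence
```

A structure like the CQF answers the same three kinds of request (insert, delete, count query) in less space than a hash map, and in its approximate mode it may overcount an item with small probability.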
References in zbMATH (referenced in 1 article)
- Pellegrina, Leonardo; Pizzi, Cinzia; Vandin, Fabio: Fast approximation of frequent k-mers and applications to metagenomics (2019)