HyperLogLog
HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm This extended abstract describes and analyses a near-optimal probabilistic algorithm, HyperLogLog, dedicated to estimating the number of distinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of m units (typically, “short bytes”), HyperLogLog performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about 1·04/m. This improves on the best previously known cardinality estimator, LogLog, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond 109 with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
Keywords for this software
References in zbMATH (referenced in 13 articles , 1 standard article )
Showing results 1 to 13 of 13.
Sorted by year (- Rachkovskij, D.A.: Binary vectors for fast distance and similarity estimation (2017)
- Rachkovskij, D.A.: Index structures for fast similarity search for binary vectors (2017)
- Woodruff, David P.; Zhang, Qin: When distributed computation is communication expensive (2017)
- Brandes, Philipp; Kardas, Marcin; Klonowski, Marek; Pająk, Dominik; Wattenhofer, Roger: Approximating the size of a radio network in beeping model (2016)
- Cohen, Reuven; Katzir, Liran; Yehezkel, Aviv: A unified scheme for generalizing cardinality estimators to sum aggregation (2015)
- Fuchs, Michael; Hwang, Hsien-Kuei; Zacharovas, Vytas: An analytic approach to the asymptotic variance of trie statistics and related structures (2014)
- Li, Rong-Hua; Yu, Jeffrey Xu; Huang, Xin; Cheng, Hong; Shang, Zechao: Measuring the impact of MVC attack in large complex networks (2014)
- Csűrös, Miklós: Approximate counting with a floating-point counter (2010)
- Lumbroso, Jérémie: An optimal cardinality estimation algorithm based on order statistics and its full analysis (2010)
- Giroire, Frédéric: Order statistics and estimating cardinalities of massive data sets (2009)
- Nedjar, Sébastien; Casali, Alain; Cicchetti, Rosine; Lakhal, Lotfi: Emerging cubes: borders, size estimations and lossless reductions (2009) ioport
- Chabchoub, Yousra; Fricker, Christine; Meunier, Frédéric; Tibi, Danielle: Analysis of an algorithm catching elephants on the Internet (2008)
- Flajolet, P.; Fusy, É.; Gandouet, O.; Meunier, F.: HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm (2007)