HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm This extended abstract describes and analyses a near-optimal probabilistic algorithm, HyperLogLog, dedicated to estimating the number of distinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of m units (typically, “short bytes”), HyperLogLog performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about 1·04/m. This improves on the best previously known cardinality estimator, LogLog, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond 109 with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.

References in zbMATH (referenced in 11 articles , 1 standard article )

Showing results 1 to 11 of 11.
Sorted by year (citations)

  1. Rachkovskij, D.A.: Binary vectors for fast distance and similarity estimation (2017)
  2. Brandes, Philipp; Kardas, Marcin; Klonowski, Marek; Pająk, Dominik; Wattenhofer, Roger: Approximating the size of a radio network in beeping model (2016)
  3. Cohen, Reuven; Katzir, Liran; Yehezkel, Aviv: A unified scheme for generalizing cardinality estimators to sum aggregation (2015)
  4. Fuchs, Michael; Hwang, Hsien-Kuei; Zacharovas, Vytas: An analytic approach to the asymptotic variance of trie statistics and related structures (2014)
  5. Li, Rong-Hua; Yu, Jeffrey Xu; Huang, Xin; Cheng, Hong; Shang, Zechao: Measuring the impact of MVC attack in large complex networks (2014)
  6. Csűrös, Miklós: Approximate counting with a floating-point counter (2010)
  7. Lumbroso, Jérémie: An optimal cardinality estimation algorithm based on order statistics and its full analysis (2010)
  8. Giroire, Frédéric: Order statistics and estimating cardinalities of massive data sets (2009)
  9. Nedjar, Sébastien; Casali, Alain; Cicchetti, Rosine; Lakhal, Lotfi: Emerging cubes: borders, size estimations and lossless reductions (2009) ioport
  10. Chabchoub, Yousra; Fricker, Christine; Meunier, Frédéric; Tibi, Danielle: Analysis of an algorithm catching elephants on the Internet (2008)
  11. Flajolet, P.; Fusy, É.; Gandouet, O.; Meunier, F.: HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm (2007)