WebGraph

WebGraph is a framework for graph compression aimed at studying web graphs. It provides simple ways to manage very large graphs, exploiting modern compression techniques. More precisely, it is currently made of: A set of flat codes, called ζ codes, which are particularly suitable for storing web graphs (or, in general, integers with power-law distribution in a certain exponent range). The fact that these codes work well can be easily tested empirically, but we also try to provide a detailed mathematical analysis. Algorithms for compressing web graphs that exploit gap compression and referentiation (à la LINK), intervalisation and ζ codes to provide a high compression ratio (see our datasets). The algorithms are controlled by several parameters, which provide different tradeoffs between access speed and compression ratio. Algorithms for accessing a compressed graph without actually decompressing it, using lazy techniques that delay the decompression until it is actually necessary. Algorithms for analysing very large graphs, such as HyperBall, which has been used to show that Facebook has just four degrees of separation. A complete, documented implementation of the algorithms above in Java distributed under the GNU General Public License. Besides a clearly defined API, we also provide several classes tha modify (e.g., transpose) or recompress a graph, so to experiment with various settings. Datasets for very large graph (e.g., a billion of links). These are either gathered from public sources (such as WebBase), or produced by UbiCrawler and BUbiNG. In the end, with WebGraph you can access and analyse very large web graphs. Using WebGraph is as easy as installing a few jar files and downloading a dataset. This makes studying phenomena such as PageRank, distribution of graph properties of the web graph, etc. very easy.


References in zbMATH (referenced in 42 articles )

Showing results 1 to 20 of 42.
Sorted by year (citations)

1 2 3 next

  1. Bringmann, Karl; Keusch, Ralph; Lengler, Johannes: Geometric inhomogeneous random graphs (2019)
  2. Joana M. F. da Trindade, Konstantinos Karanasos, Carlo Curino, Samuel Madden, Julian Shun: Kaskade: Graph Views for Efficient Graph Analytics (2019) arXiv
  3. Pothen, Alex; Ferdous, S. M.; Manne, Fredrik: Approximation algorithms in combinatorial scientific computing (2019)
  4. Shen, Zhao-Li; Huang, Ting-Zhu; Carpentieri, Bruno; Wen, Chun; Gu, Xian-Ming; Tan, Xue-Yuan: Off-diagonal low-rank preconditioner for difficult PageRank problems (2019)
  5. van der Hoorn, Pim; Olvera-Cravioto, Mariana: Typical distances in the directed configuration model (2018)
  6. Bringmann, Karl; Keusch, Ralph; Lengler, Johannes: Sampling geometric inhomogeneous random graphs in linear time (2017)
  7. Khan, Kifayat Ullah; Dolgorsuren, Batjargal; Anh, Tu Nguyen; Nawaz, Waqas; Lee, Young-Koo: Faster compression methods for a weighted graph using locality sensitive hashing (2017)
  8. Lamm, Sebastian; Sanders, Peter; Schulz, Christian; Strash, Darren; Werneck, Renato F.: Finding near-optimal independent sets at scale (2017)
  9. Mania, Horia; Pan, Xinghao; Papailiopoulos, Dimitris; Recht, Benjamin; Ramchandran, Kannan; Jordan, Michael I.: Perturbed iterate analysis for asynchronous stochastic optimization (2017)
  10. Navlakha, Saket: Learning the structural vocabulary of a network (2017)
  11. Riondato, Matteo; García-Soriano, David; Bonchi, Francesco: Graph summarization with quality guarantees (2017)
  12. Shen, Zhao-Li; Huang, Ting-Zhu; Carpentieri, Bruno; Gu, Xian-Ming; Wen, Chun: An efficient elimination strategy for solving PageRank problems (2017)
  13. Zhao, Junzhou; Wang, Pinghui; Lui, John C. S.; Towsley, Don; Guan, Xiaohong: I/O-efficient calculation of (H)-group closeness centrality over disk-resident graphs (2017)
  14. Akiba, Takuya; Iwata, Yoichi: Branch-and-reduce exponential/FPT algorithms in practice: a case study of vertex cover (2016)
  15. Crespelle, Christophe; Le, Tien-Nam; Perrot, Kevin; Phan, Thi Ha Duong: Linearity is strictly more powerful than contiguity for encoding graphs (2016)
  16. Firmani, Donatella; Georgiadis, Loukas; Italiano, Giuseppe F.; Laura, Luigi; Santaroni, Federico: Strong articulation points and strong bridges in large scale graphs (2016)
  17. Fischer, Johannes; Peters, Daniel: GLOUDS: representing tree-like graphs (2016)
  18. Hager, William W.; Phan, Dzung T.; Zhu, Jiajie: Projection algorithms for nonconvex minimization with application to sparse principal component analysis (2016)
  19. Lagraa, Sofiane; Seba, Hamida: An efficient exact algorithm for triangle listing in large graphs (2016)
  20. Slota, George M.; Madduri, Kamesh; Rajamanickam, Sivasankaran: Complex network partitioning using label propagation (2016)

1 2 3 next