PyTorch-BigGraph: A Large-scale Graph Embedding System.
Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to traditional multi-relation embedding systems, allowing it to scale to graphs with billions of nodes and trillions of edges. PBG uses graph partitioning to train arbitrarily large embeddings either on a single machine or in a distributed environment. We demonstrate performance comparable to existing embedding systems on common benchmarks, while allowing for scaling to arbitrarily large graphs and parallelization on multiple machines. We train and evaluate embeddings on several large social network graphs as well as the full Freebase dataset, which contains over 100 million nodes and 2 billion edges.
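The graph partitioning mentioned in the abstract can be illustrated with a small sketch. It assumes that nodes are split into P partitions and that edges are grouped into P x P buckets by the partitions of their source and destination nodes, so that training on one bucket only needs the embeddings of two partitions in memory. The function name `partition_edges` and its parameters are hypothetical and are not part of the PBG API; the sketch uses only the Python standard library.

```python
import random
from collections import defaultdict


def partition_edges(edges, num_nodes, num_partitions, seed=0):
    """Hypothetical helper: assign nodes to partitions and group edges into buckets.

    Returns (part_of, buckets), where part_of maps node id -> partition id and
    buckets maps (source partition, destination partition) -> list of edges.
    """
    rng = random.Random(seed)
    node_ids = list(range(num_nodes))
    rng.shuffle(node_ids)  # random assignment, assumed here for simplicity

    # Round-robin over the shuffled ids gives partitions of near-equal size.
    part_of = {node: i % num_partitions for i, node in enumerate(node_ids)}

    buckets = defaultdict(list)
    for src, dst in edges:
        buckets[(part_of[src], part_of[dst])].append((src, dst))
    return part_of, dict(buckets)


# Toy graph with 6 nodes and 7 directed edges (made-up data for illustration).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (4, 5), (5, 0)]
part_of, buckets = partition_edges(edges, num_nodes=6, num_partitions=2)

for (p_src, p_dst), bucket_edges in sorted(buckets.items()):
    # While processing this bucket, only the embeddings of partitions
    # p_src and p_dst would need to be held in memory.
    print(f"bucket ({p_src}, {p_dst}): {len(bucket_edges)} edges")
```

With P partitions, the peak number of node embeddings held in memory is roughly 2/P of the total, which is what would let the embedding table exceed the memory of a single machine under these assumptions.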
References in zbMATH (referenced in 4 articles)
- Ali, Mehdi; Berrendorf, Max; Hoyt, Charles Tapley; Vermue, Laurent; Sharifzadeh, Sahand; Tresp, Volker; Lehmann, Jens: PyKEEN 1.0: a Python library for training and evaluating knowledge graph embeddings (2021)
- Guan, Chaoyu; Zhang, Ziwei; Li, Haoyang; Chang, Heng; Zhang, Zeyang; Qin, Yijian; Jiang, Jiyan; Wang, Xin; Zhu, Wenwu: AutoGL: A Library for Automated Graph Learning (2021) arXiv
- Kazemi, Seyed Mehran; Goel, Rishab; Jain, Kshitij; Kobyzev, Ivan; Sethi, Akshay; Forsyth, Peter; Poupart, Pascal: Representation learning for dynamic graphs: a survey (2020)
- Ali, Mehdi; Berrendorf, Max; Hoyt, Charles Tapley; Vermue, Laurent; Sharifzadeh, Sahand; Tresp, Volker; Lehmann, Jens: PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings (2020) arXiv