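Abstract
Statistical analysis of large and sparse graphs is a challenging problem in data science due to the high dimensionality and nonlinearity of the problem. This paper presents a fast and scalable algorithm for partitioning such graphs into disjoint groups based on observed graph distances from a set of reference nodes. The resulting partition provides a low-dimensional approximation of the full distance matrix which helps to reveal global structural properties of the graph using only small samples of the distance matrix. The presented algorithm is inspired by the information-theoretic minimum description principle. We investigate the performance of this algorithm for selected real data sets and for synthetic graph data sets generated using stochastic block models and power-law random graphs, together with analytical considerations for sparse stochastic block models with bounded average degrees.
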
Original language | English |
---|---|
Title of host publication | Proceedings 2018 IEEE International Conference on Big Data, Big Data 2018 |
Publisher | IEEE Institute of Electrical and Electronics Engineers |
Pages | 3784-3792 |
Number of pages | 9 |
ISBN (Electronic) | 978-1-5386-5035-6 |
ISBN (Print) | 978-1-5386-5036-3, 978-1-5386-5034-9 |
DOIs | 10.1109/BigData.2018.8622118 |
Publication status | Published - 22 Jan 2019 |
MoE publication type | A4 Article in a conference publication |
Event | Advances in High Dimensional Big Data: Workshop in conjunction with the 2018 IEEE International Conference on Big Data (IEEE Big Data 2018), Seattle, United States. Duration: 10 Dec 2018 → 13 Dec 2018 |
Publication series
Series | Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018 |
---|---|
Workshop
Workshop | Advances in High Dimensional Big Data |
---|---|
Country | United States |
City | Seattle |
Period | 10/12/18 → 13/12/18 |
Keywords
- graph theory
- statistical analysis
- Big Data
- peer-to-peer computing
- partitioning
- approximation algorithms
- stochastic processes
Cite this
Analysis of large sparse graphs using regular decomposition of graph distance matrices. / Reittu, Hannu; Leskelä, Lasse; Fiorucci, Marco; Räty, Tomi.
Proceedings 2018 IEEE International Conference on Big Data, Big Data 2018. IEEE Institute of Electrical and Electronics Engineers, 2019. p. 3784-3792 8622118 (Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018).
Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review
TY - GEN
T1 - Analysis of large sparse graphs using regular decomposition of graph distance matrices
AU - Reittu, Hannu
AU - Leskelä, Lasse
AU - Fiorucci, Marco
AU - Räty, Tomi
PY - 2019/1/22
Y1 - 2019/1/22
AB - Statistical analysis of large and sparse graphs is a challenging problem in data science due to the high dimensionality and nonlinearity of the problem. This paper presents a fast and scalable algorithm for partitioning such graphs into disjoint groups based on observed graph distances from a set of reference nodes. The resulting partition provides a low-dimensional approximation of the full distance matrix which helps to reveal global structural properties of the graph using only small samples of the distance matrix. The presented algorithm is inspired by the information-theoretic minimum description principle. We investigate the performance of this algorithm for selected real data sets and for synthetic graph data sets generated using stochastic block models and power-law random graphs, together with analytical considerations for sparse stochastic block models with bounded average degrees.
KW - graph theory
KW - statistical analysis
KW - Big Data
KW - peer-to-peer computing
KW - partitioning
KW - approximation algorithms
KW - stochastic processes
UR - http://www.scopus.com/inward/record.url?scp=85062642513&partnerID=8YFLogxK
U2 - 10.1109/BigData.2018.8622118
DO - 10.1109/BigData.2018.8622118
M3 - Conference article in proceedings
SN - 978-1-5386-5036-3
SN - 978-1-5386-5034-9
T3 - Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
SP - 3784
EP - 3792
BT - Proceedings 2018 IEEE International Conference on Big Data, Big Data 2018
PB - IEEE Institute of Electrical and Electronics Engineers
ER -
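The partitioning idea summarized in the abstract (grouping nodes by their graph distances from a small set of reference nodes) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names are hypothetical, and two things are assumptions made only for illustration: the k-means-style alternation, and the Poisson-type cost used as a stand-in for a description-length criterion. The sketch computes hop distances by breadth-first search, then alternates between averaging distances per group and reassigning each node to the cheapest group.

```python
from collections import deque
import math


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_profiles(adj, refs):
    """Map each node to its vector of distances from the reference nodes."""
    per_ref = [bfs_distances(adj, r) for r in refs]
    far = len(adj)  # assumption: finite placeholder for unreachable nodes
    return {v: [d.get(v, far) for d in per_ref] for v in adj}


def poisson_cost(row, means):
    """Negative Poisson log-likelihood of a row, up to mean-independent terms."""
    return sum(m - d * math.log(m) for d, m in zip(row, means))


def partition(profiles, k, iters=20):
    """Alternate between group means and node reassignment (illustrative only)."""
    nodes = sorted(profiles)
    # Deterministic start: order nodes by total distance, cut into k chunks.
    order = sorted(nodes, key=lambda v: (sum(profiles[v]), v))
    labels = {v: i * k // len(order) for i, v in enumerate(order)}
    n_refs = len(profiles[nodes[0]])
    for _ in range(iters):
        # Average distance from each reference node to each non-empty group.
        means = {}
        for g in set(labels.values()):
            members = [v for v in nodes if labels[v] == g]
            means[g] = [
                max(sum(profiles[v][i] for v in members) / len(members), 1e-9)
                for i in range(n_refs)
            ]
        # Reassign every node to the group with the lowest cost.
        new_labels = {
            v: min(means, key=lambda g: (poisson_cost(profiles[v], means[g]), g))
            for v in nodes
        }
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Usage: two triangles joined by one edge, with reference nodes 0 and 5.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(partition(distance_profiles(adj, refs=[0, 5]), k=2))
```

Only `len(refs)` breadth-first searches are needed, so the sketch works from a small sample of rows of the distance matrix and never builds the full matrix, in line with the abstract.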