Analysis of large sparse graphs using regular decomposition of graph distance matrices

Hannu Reittu, Lasse Leskelä, Marco Fiorucci, Tomi Räty

    Research output: Chapter in Book/Report/Conference proceeding > Conference article in proceedings > Scientific > peer-review

    Abstract

    Statistical analysis of large, sparse graphs is challenging in data science because the problem is high-dimensional and nonlinear. This paper presents a fast and scalable algorithm for partitioning such graphs into disjoint groups based on observed graph distances from a set of reference nodes. The resulting partition provides a low-dimensional approximation of the full distance matrix, which helps reveal global structural properties of the graph using only small samples of the distance matrix. The algorithm is inspired by the information-theoretic minimum description length principle. We investigate its performance on selected real data sets and on synthetic graph data sets generated using stochastic block models and power-law random graphs, together with analytical considerations for sparse stochastic block models with bounded average degrees.
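
The idea of partitioning nodes by their distances from a few reference nodes can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes an unweighted graph stored as an adjacency dictionary, and it simply groups nodes whose distance profiles are identical, which is a much cruder criterion than the information-theoretic one the abstract mentions. The function names, the `unreachable` marker, and the toy "double star" graph are all hypothetical.

```python
from collections import deque, defaultdict


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_profiles(adj, reference_nodes, unreachable=-1):
    """Map each node to its tuple of distances from the reference nodes.

    This uses only len(reference_nodes) rows of the full distance matrix.
    """
    per_ref = [bfs_distances(adj, r) for r in reference_nodes]
    return {v: tuple(d.get(v, unreachable) for d in per_ref) for v in adj}


def group_by_profile(profiles):
    """Crude stand-in for a partition: nodes with equal profiles share a group."""
    groups = defaultdict(list)
    for node, profile in profiles.items():
        groups[profile].append(node)
    return dict(groups)


# Toy graph (assumption): two hubs, 0 and 4, joined by an edge, three leaves each.
adj = {
    0: [1, 2, 3, 4],
    1: [0], 2: [0], 3: [0],
    4: [0, 5, 6, 7],
    5: [4], 6: [4], 7: [4],
}
groups = group_by_profile(distance_profiles(adj, reference_nodes=[0, 4]))
for profile, nodes in groups.items():
    print(profile, nodes)
```

In this toy case the eight nodes collapse into four groups (each hub alone, and each hub's leaves together), so four distance profiles summarise the sampled rows of the distance matrix. A realistic method would have to tolerate noisy, non-identical profiles, which exact matching does not.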
    Original language: English
    Title of host publication: 2018 IEEE International Conference on Big Data (Big Data)
    Publisher: IEEE Institute of Electrical and Electronic Engineers
    Pages: 3784-3792
    ISBN (Electronic): 978-1-5386-5035-6
    ISBN (Print): 978-1-5386-5036-3, 978-1-5386-5034-9
    DOIs: https://doi.org/10.1109/BigData.2018.8622118
    Publication status: Published - 22 Jan 2019
    MoE publication type: A4 Article in a conference publication
    Event: Advances in High Dimensional Big Data: Workshop in conjunction with the 2018 IEEE International Conference on Big Data (IEEE Big Data 2018) - Seattle, United States
    Duration: 10 Dec 2018 to 13 Dec 2018

    Workshop

    Workshop: Advances in High Dimensional Big Data
    Country: United States
    City: Seattle
    Period: 10/12/18 to 13/12/18

    Fingerprint

    Decomposition
    Structural properties
    Statistical methods

    Keywords

    • graph theory
    • statistical analysis
    • Big Data
    • peer-to-peer computing
    • partitioning
    • approximation algorithms
    • stochastic processes

    Cite this

    Reittu, H., Leskelä, L., Fiorucci, M., & Räty, T. (2019). Analysis of large sparse graphs using regular decomposition of graph distance matrices. In 2018 IEEE International Conference on Big Data (Big Data) (pp. 3784-3792). [8622118] IEEE Institute of Electrical and Electronic Engineers. https://doi.org/10.1109/BigData.2018.8622118
    Reittu, Hannu ; Leskelä, Lasse ; Fiorucci, Marco ; Räty, Tomi. / Analysis of large sparse graphs using regular decomposition of graph distance matrices. 2018 IEEE International Conference on Big Data (Big Data). IEEE Institute of Electrical and Electronic Engineers, 2019. pp. 3784-3792
    @inproceedings{cf0df62db8a04fc4a7447dea6468cbfd,
    title = "Analysis of large sparse graphs using regular decomposition of graph distance matrices",
    abstract = "Statistical analysis of large and sparse graphs is a challenging problem in data science due to the high dimensionality and nonlinearity of the problem. This paper presents a fast and scalable algorithm for partitioning such graphs into disjoint groups based on observed graph distances from a set of reference nodes. The resulting partition provides a low-dimensional approximation of the full distance matrix which helps to reveal global structural properties of the graph using only small samples of the distance matrix. The presented algorithm is inspired by the information-theoretic minimum description principle. We investigate the performance of this algorithm for selected real data sets and for synthetic graph data sets generated using stochastic block models and power-law random graphs, together with analytical considerations for sparse stochastic block models with bounded average degrees.",
    keywords = "graph theory, statistical analysis, Big Data, peer-to-peer computing, partitioning, approximation algorithms, stochastic processes",
    author = "Hannu Reittu and Lasse Leskel{\"a} and Marco Fiorucci and Tomi R{\"a}ty",
    year = "2019",
    month = "1",
    day = "22",
    doi = "10.1109/BigData.2018.8622118",
    language = "English",
    isbn = "978-1-5386-5036-3",
    pages = "3784--3792",
    booktitle = "2018 IEEE International Conference on Big Data (Big Data)",
    publisher = "IEEE Institute of Electrical and Electronic Engineers",
    address = "United States",
    }

    Reittu, H, Leskelä, L, Fiorucci, M & Räty, T 2019, Analysis of large sparse graphs using regular decomposition of graph distance matrices. in 2018 IEEE International Conference on Big Data (Big Data), 8622118, IEEE Institute of Electrical and Electronic Engineers, pp. 3784-3792, Advances in High Dimensional Big Data, Seattle, United States, 10/12/18. https://doi.org/10.1109/BigData.2018.8622118

    Analysis of large sparse graphs using regular decomposition of graph distance matrices. / Reittu, Hannu; Leskelä, Lasse; Fiorucci, Marco; Räty, Tomi.

    2018 IEEE International Conference on Big Data (Big Data). IEEE Institute of Electrical and Electronic Engineers, 2019. p. 3784-3792 8622118.

    TY - GEN

    T1 - Analysis of large sparse graphs using regular decomposition of graph distance matrices

    AU - Reittu, Hannu

    AU - Leskelä, Lasse

    AU - Fiorucci, Marco

    AU - Räty, Tomi

    PY - 2019/1/22

    Y1 - 2019/1/22

    N2 - Statistical analysis of large and sparse graphs is a challenging problem in data science due to the high dimensionality and nonlinearity of the problem. This paper presents a fast and scalable algorithm for partitioning such graphs into disjoint groups based on observed graph distances from a set of reference nodes. The resulting partition provides a low-dimensional approximation of the full distance matrix which helps to reveal global structural properties of the graph using only small samples of the distance matrix. The presented algorithm is inspired by the information-theoretic minimum description principle. We investigate the performance of this algorithm for selected real data sets and for synthetic graph data sets generated using stochastic block models and power-law random graphs, together with analytical considerations for sparse stochastic block models with bounded average degrees.

    KW - graph theory

    KW - statistical analysis

    KW - Big Data

    KW - peer-to-peer computing

    KW - partitioning

    KW - approximation algorithms

    KW - stochastic processes

    UR - http://www.scopus.com/inward/record.url?scp=85062642513&partnerID=8YFLogxK

    U2 - 10.1109/BigData.2018.8622118

    DO - 10.1109/BigData.2018.8622118

    M3 - Conference article in proceedings

    SN - 978-1-5386-5036-3

    SN - 978-1-5386-5034-9

    SP - 3784

    EP - 3792

    BT - 2018 IEEE International Conference on Big Data (Big Data)

    PB - IEEE Institute of Electrical and Electronic Engineers

    ER -

    Reittu H, Leskelä L, Fiorucci M, Räty T. Analysis of large sparse graphs using regular decomposition of graph distance matrices. In 2018 IEEE International Conference on Big Data (Big Data). IEEE Institute of Electrical and Electronic Engineers. 2019. p. 3784-3792. 8622118 https://doi.org/10.1109/BigData.2018.8622118