Random Graph Model for Structural Analysis of Online Communications

Exactpro
November 08, 2019

Ivan Sukharev and Maria Ivanova

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/ACbKae2kugI

TMPA Conference website https://tmpaconf.org/
TMPA Conference on Facebook https://www.facebook.com/groups/tmpaconf/

Transcript

  1. Random Graph Model for Structural Analysis
    of Online Communications
    TMPA-2019
    Maria Ivanova
    Ivan Sukharev
    National Research University
    Higher School of Economics (Moscow)
    November 8, 2019

  2. Overview
    1. Problem statement
    2. Graph Definition
    3. Related works
    4. Adaptation of the Barabasi–Albert Growth Model
    5. New Random Graph Model
    6. Model Fitting
    Maria Ivanova Higher School of Economics November 8, 2019 2 / 14

  3. Problem statement
    Some problems:
    1. Processing costs
    2. Disclosure of personal
    information
    3. Statistical reliability

  4. Graph Definition
    Let V_n be a set of vertices:
    V_n = {1, ..., n}
    Then the set of all possible edges E_n on
    V_n is:
    E_n = {{i, j} | i, j ∈ V_n, i ≠ j}
    A graph is an ordered pair
    G := (V_n, E)
    where E ⊂ E_n.
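
As a minimal sketch of this definition, the Python snippet below builds E_n for a small n and checks whether a given edge set forms a valid graph on V_n. The function names `all_edges` and `is_graph` are illustrative and do not come from the talk.

```python
from itertools import combinations


def all_edges(n):
    """Return E_n: every unordered pair {i, j} of distinct vertices in {1, ..., n}."""
    vertices = range(1, n + 1)
    return {frozenset(pair) for pair in combinations(vertices, 2)}


def is_graph(n, edges):
    """Return True if `edges` is a subset of E_n, so that (V_n, edges) is a graph."""
    return set(edges) <= all_edges(n)


if __name__ == "__main__":
    # For n = 4 there are 4 * 3 / 2 = 6 possible edges.
    print(len(all_edges(4)))
    print(is_graph(4, {frozenset({1, 2}), frozenset({3, 4})}))
```

Edges are stored as `frozenset` objects so that {i, j} and {j, i} are the same edge, and a self-loop {i, i} collapses to a one-element set that is never in E_n.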

  5. Related works
    Erdős–Rényi Model
    The graph generation process consists of constructing a set of
    edges E for a given set of vertices V_n. Each edge e_ij ∈ E_n is
    included in the edge set E of the random graph independently
    with probability p ∈ [0, 1].
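
The generation process described above can be sketched in a few lines of Python. This is an illustration under the usual assumption that edges are sampled independently; the function name `erdos_renyi` and the `seed` parameter are introduced here and are not from the talk.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Sample a random graph on vertices 1..n, keeping each possible edge with probability p."""
    rng = random.Random(seed)
    edges = set()
    for i, j in combinations(range(1, n + 1), 2):
        # rng.random() is uniform on [0, 1), so the edge is kept with probability p.
        if rng.random() < p:
            edges.add((i, j))
    return edges


if __name__ == "__main__":
    sample = erdos_renyi(10, 0.3, seed=42)
    print(len(sample), "edges out of", 10 * 9 // 2, "possible")
```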

  6. Related works
    Scale-free network
    A scale-free network is a graph where the degree distribution of
    the vertices is described by a power-law, at least asymptotically.
    Therefore, for large k, the probability that a vertex has k edges
    is proportional to $k^{-\gamma}$:
    $$P(k) \sim k^{-\gamma} \qquad (1)$$
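
A standard consequence of equation (1), added here as a reminder rather than taken from the slides: taking logarithms turns the power law into a straight line with slope −γ, which is why degree distributions are usually inspected on a log-log plot. The constant C is an assumed normalization factor.

```latex
P(k) \approx C\,k^{-\gamma}
\quad\Longrightarrow\quad
\log P(k) \approx \log C - \gamma \log k
```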

  7. Related works
    Barabasi–Albert Growth Model
    A new vertex $v_{n+1}$ is added. Then, with probability $p_i$, there is
    an edge between the new vertex and the $i$-th vertex, where $p_i$ is
    calculated by the following formula:
    $$p_i = \frac{\deg(v_i)}{\sum_{j=1}^{n} \deg(v_j)} \qquad (2)$$
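
To make formula (2) concrete, here is a small Python sketch that turns a list of vertex degrees into attachment probabilities and samples an attachment target. The helper names are hypothetical, and the sketch assumes that at least one vertex has a positive degree.

```python
import random


def attachment_probabilities(degrees):
    """Return p_i = deg(v_i) / sum_j deg(v_j) for every existing vertex."""
    total = sum(degrees)
    return [d / total for d in degrees]


def choose_target(degrees, rng):
    """Pick the index of an existing vertex with probability proportional to its degree."""
    return rng.choices(range(len(degrees)), weights=degrees, k=1)[0]


if __name__ == "__main__":
    degrees = [1, 1, 2]
    print(attachment_probabilities(degrees))
    print(choose_target(degrees, random.Random(0)))
```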

  8. Data
    We obtained 56003 articles
    with comments and
    constructed the comment
    graph for each of them.
    The share of first-level
    comments is 24%.
    C_i denotes a comment of the i-th level.

  9. Adaptation
    of the Barabasi–Albert Growth Model
    The probability of choosing a node is directly proportional to
    the number of edges attached to it. We add a parameter k
    to the root node's degree, thereby increasing the likelihood that
    a new comment attaches to the root (the article) rather than to
    another comment.
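
A rough sketch of this adaptation, assuming each new comment attaches to exactly one existing node and that k > 0 (otherwise the first comment would have nothing to attach to). The function name and the child-to-parent dictionary representation are choices made for this illustration, not details from the talk.

```python
import random


def grow_tree_with_root_bias(n_comments, k, seed=None):
    """Grow a comment tree by preferential attachment with a bonus k on the root's degree.

    Node 0 is the root (the article). Returns a dict mapping each node to its parent.
    """
    rng = random.Random(seed)
    parent = {0: None}
    degree = [0]  # degree[v] is the number of edges attached to node v
    for new in range(1, n_comments + 1):
        # The root is weighted by deg + k, every comment by its plain degree.
        weights = [degree[0] + k] + degree[1:]
        target = rng.choices(range(new), weights=weights, k=1)[0]
        parent[new] = target
        degree[target] += 1
        degree.append(1)  # the new node has one edge, to its parent
    return parent


if __name__ == "__main__":
    tree = grow_tree_with_root_bias(1000, k=50, seed=1)
    first_level = sum(1 for v, par in tree.items() if par == 0)
    print("share of first-level comments:", first_level / 1000)
```

Larger values of k push more comments to the first level, so k can be tuned against an observed share of first-level comments.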

  10. New Random Graph Model
    Growth algorithm:
    1. With probability p, a new vertex joins the root of the tree,
    that is, the article itself. Its weight is given by the
    function φ, which indicates how interesting this message is
    to other users.
    2. With probability 1 − p, a new vertex joins a randomly chosen
    vertex other than the root of the tree. The probability of
    joining each vertex is proportional to its weight. The new
    vertex takes a share λ of the weight of the vertex to which it is
    attached.
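
The two steps above can be sketched as the following simulation. Several details are assumptions made only for this illustration: λ is treated as a fraction of the parent's weight, φ is replaced by a placeholder exponential sampler because the slides do not state its distribution, and a new vertex falls back to the root whenever no comment with positive weight exists yet.

```python
import random


def grow_comment_tree(n_comments, p, lam, phi=None, seed=None):
    """Simulate the growth algorithm; node 0 is the root (the article).

    Returns (parent, weight): parent maps node -> parent, weight maps comment -> weight.
    """
    rng = random.Random(seed)
    if phi is None:
        # Placeholder distribution for phi (an assumption, not from the talk).
        phi = lambda r: r.expovariate(1.0)
    parent = {0: None}
    weight = {}
    for new in range(1, n_comments + 1):
        candidates = [v for v, w in weight.items() if w > 0]
        if not candidates or rng.random() < p:
            # Step 1: attach to the root and draw a fresh weight from phi.
            parent[new] = 0
            weight[new] = phi(rng)
        else:
            # Step 2: attach to a comment chosen proportionally to its weight.
            target = rng.choices(
                candidates, weights=[weight[v] for v in candidates], k=1
            )[0]
            parent[new] = target
            weight[new] = lam * weight[target]  # the new vertex takes a share lam
            weight[target] -= weight[new]       # the parent keeps the rest
    return parent, weight


def depth(parent, v):
    """Number of edges between node v and the root."""
    d = 0
    while parent[v] is not None:
        v = parent[v]
        d += 1
    return d


if __name__ == "__main__":
    tree, _ = grow_comment_tree(500, p=0.24, lam=0.7629, seed=7)
    print("maximum depth:", max(depth(tree, v) for v in tree))
```

Under these assumptions the total weight inside each first-level subtree stays equal to the value originally drawn from φ, since every attachment only moves weight from a parent to its new child.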

  11. Model fitting
    Finding the parameter p
    The first parameter, p,
    was estimated from the data:
    24% of all vertices are
    neighbors of the article,
    which gives p ≈ 0.24.
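
An estimate of this kind could be computed as in the sketch below, assuming each comment graph is stored as a child-to-parent dictionary whose root (the article) has parent `None`. The representation, the function name `estimate_p`, and the choice to count only comments (not the article itself) in the denominator are assumptions of this example.

```python
def estimate_p(trees):
    """Estimate p as the share of comments attached directly to the article.

    Assumes the collection contains at least one comment in total.
    """
    first_level = 0
    total = 0
    for parent in trees:
        root = next(v for v, par in parent.items() if par is None)
        for v, par in parent.items():
            if v == root:
                continue  # the article itself is not a comment
            total += 1
            if par == root:
                first_level += 1
    return first_level / total


if __name__ == "__main__":
    toy_tree = {0: None, 1: 0, 2: 1, 3: 1, 4: 0}
    print(estimate_p([toy_tree]))
```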

  12. Model fitting
    Explanation of parameter λ
    When λ = 1, the entire weight
    of a vertex passes to the next
    level when a new vertex joins
    it. This leads to
    long chains
    without branching.
    When λ = 0, no branch goes
    deeper than the second level,
    and the degrees of first-level
    nodes grow quickly.
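
One way to write the weight transfer that is consistent with both limiting cases, assuming λ acts as a fraction of the parent's weight. The symbols w_parent and w_new are introduced here for illustration and do not appear in the slides.

```latex
w_{\text{new}} = \lambda\, w_{\text{parent}}, \qquad
w_{\text{parent}} \leftarrow (1 - \lambda)\, w_{\text{parent}}
% \lambda = 1: the parent's weight drops to 0, so it attracts no further replies
%              and the thread continues as a single chain.
% \lambda = 0: the new vertex gets weight 0 and never attracts replies,
%              so the tree stops at the second level.
```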

  13. Model fitting
    Finding the parameter λ
    The first vertex to join a
    first-level comment takes a share λ
    of its weight.
    The value of this parameter is
    λ = 0.7629.
    Since a subtree's weight is
    proportional to the number of
    comments in it at the end of the
    discussion, the distribution of
    the random variable φ can
    be found from the number of
    comments in each subtree.
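
The subtree comment counts mentioned above could be collected as in this sketch, again assuming a child-to-parent dictionary with the article as the only node whose parent is `None`. The function name is hypothetical, and each count includes the first-level comment itself.

```python
from collections import Counter


def first_level_subtree_sizes(parent):
    """Map each first-level comment to the number of comments in its subtree."""
    root = next(v for v, par in parent.items() if par is None)

    def top_ancestor(v):
        # Walk up until reaching the comment attached directly to the article.
        while parent[v] != root:
            v = parent[v]
        return v

    sizes = Counter()
    for v in parent:
        if v != root:
            sizes[top_ancestor(v)] += 1
    return dict(sizes)


if __name__ == "__main__":
    toy_tree = {0: None, 1: 0, 2: 1, 3: 1, 4: 0, 5: 3}
    print(first_level_subtree_sizes(toy_tree))
```

Pooling these counts over all articles gives an empirical sample from which a distribution for φ could be fitted, up to the proportionality constant.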

  14. Thank you for your attention!