

Mining Online Social Data for Detecting Social Network Mental Disorders

Slides for a reading of "Mining Online Social Data for Detecting Social Network Mental Disorders" http://www2016.net/proceedings/proceedings/p275.pdf

Lee Wei

November 18, 2016

Transcript

  1. Mining Online Social Data for Detecting Social Network Mental Disorders

    Advisor: Kun-Ta Chuang • Presenter: Wei Lee
    Paper: Mining Online Social Data for Detecting Social Network Mental Disorders. Hong-Han Shuai (Academia Sinica), Chih-Ya Shen (Academia Sinica), De-Nian Yang (Academia Sinica), Yi-Feng Lan (Tamkang University), Wang-Chien Lee (The Pennsylvania State University), Philip S. Yu (University of Illinois at Chicago; Tsinghua University), Ming-Syan Chen (National Taiwan University)
  2. Current Research on OSNs • Mostly related to • Improving

    people's lives • Less on • Remedying the problems incurred by OSNs
  3. Proposed Model • SNMDD
 (Social Network Mental Disorder Detection) •

    Exploits features extracted from social networks • STM
 (SNMD-based Tensor Model) • Multi-source Learning • Improves SNMDD performance
  4. Why these models? • Does not rely on self-revealing •

    Multi-source Learning • Users may behave differently on different OSNs
  5. More about SNMDD • Semi-Supervised classification • SVM based •

    Transductive SVM • Ground Truth • Via Current Diagnostic Practice in Psychology
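A minimal sketch of the semi-supervised classification step, assuming a feature matrix X (one row per user) and labels y where unlabeled users are marked with -1. scikit-learn ships no transductive SVM, so a self-training wrapper around an SVC is used here purely as a stand-in for the TSVM; the data is a toy placeholder.

```python
# Self-training SVM as a rough stand-in for the transductive SVM in the paper.
import numpy as np
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))            # 200 users, 12 toy SNMDD features
y = np.full(200, -1)                      # -1 marks unlabeled users
y[:40] = rng.integers(0, 2, size=40)      # small labeled subset (ground truth)

base = SVC(kernel="rbf", probability=True)    # self-training needs predict_proba
model = SelfTrainingClassifier(base, threshold=0.8)
model.fit(X, y)
print(model.predict(X[:5]))
```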
  6. Feature Extraction • Challenges: 1. Lack of mental features 2. Heavy users

    vs. addictive users 3. Multi-source learning (addressed by SNMDD and STM)
  7. Social Interaction Features • Parasocial Relationship (PR) • Online and

    Offline Interaction Ratio (ONOFF) • Social Capital (SC) • Social Searching vs. Browsing (SSB)
  8. Parasocial Relationship
 (PR) |a_out| / |a_in| • The

    ratio between the number of actions you perform toward others and the number of actions directed to you • Actions • e.g. likes, comments, posts • Loneliness is one of the primary reasons for SNMDs
  9. Online and Offline Interaction Ratio (ONOFF) |a_on| / |a_off|

    • The ratio between online actions and offline actions • Offline actions • e.g. check-in logs, "Going" to events • Users with SNMDs tend to snub their friends in real life
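A minimal sketch of the two ratio features above, assuming per-user action counts have already been extracted from the crawled logs; the function names and the eps guard are illustrative, not from the paper.

```python
# Sketch of the PR and ONOFF ratio features from pre-extracted action counts.
def parasocial_ratio(n_actions_out, n_actions_in, eps=1e-9):
    """PR = |a_out| / |a_in|: actions the user performs vs. actions received."""
    return n_actions_out / (n_actions_in + eps)

def onoff_ratio(n_online, n_offline, eps=1e-9):
    """ONOFF = |a_on| / |a_off|: online actions (likes, comments, posts) vs.
    offline-related actions (check-ins, 'Going' to events)."""
    return n_online / (n_offline + eps)

print(parasocial_ratio(120, 30))   # 4.0  -> largely one-sided interaction
print(onoff_ratio(500, 10))        # ~50  -> heavily skewed toward online activity
```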
  10. Social Capital (SC) 1. Bond Strengthening • Strengthen relationships •

    Related to CR 2. Information Seeking • Find valuable information • Related to IO • (n_strong, n_weak: numbers of strong and weak ties)
  11. Social Capital (SC) • Proxy Feature • Number of friends the

    user interacts with
 (e.g. likes, posts, comments) • Number of all friends • n_strong, n_weak
  12. Social Searching vs. Browsing (SSB) • SS: Actively reading news

    from friends' walls • SS creates more pleasure than SB • SS is more likely to act as a drug-like reward
 (possible SNMD) • SB: Passively reading the personal news feed
  13. Social Searching vs. Browsing (SSB) • Feature: n_1 / Σ_{i=2} n_i

    • n_i: the total number of the i-th action among friends' posts • Contrasts Social Searching (related to CR) with Social Browsing (related to IO)
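A small sketch of the SSB ratio as reconstructed above. Treating n_1 as the social-searching count and the remaining n_i as social browsing is an assumption made here for illustration; the exact indexing in the paper may differ.

```python
# SSB ratio: first action count over the sum of the remaining action counts.
def ssb_ratio(action_counts, eps=1e-9):
    n1, rest = action_counts[0], action_counts[1:]
    return n1 / (sum(rest) + eps)

print(ssb_ratio([40, 10, 5, 5]))   # 2.0 -> searching-style actions dominate
```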
  14. Personal Features • Self-disclosure Based Features (SD) • Temporal Behavior

    Features (TEMP) • Usage Time (UT) • Disinhibition Based Features (DIS) • Profile Features (PROF)
  15. Self-disclosure Based Features (SD) • Self-disclosure stimulates the brain's pleasure

    center • Numbers of emoticons, stickers, and selfies in each post
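A toy sketch of the self-disclosure counts, assuming posts are available as plain text. The emoticon pattern is a rough approximation; sticker and selfie counts would normally come from post metadata in a real crawl.

```python
# Toy emoticon counting per post as one ingredient of the SD features.
import re

EMOTICON = re.compile(r"[:;]-?[)(DP]")   # matches :) ;) :D :P :-( etc.

def emoticon_counts(posts):
    return [len(EMOTICON.findall(p)) for p in posts]

print(emoticon_counts(["great day :) :D", "meh"]))   # [2, 0]
```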
  16. Temporal Behavior Features (TEMP) • Kleinberg’s burst detection algorithm
 •

    Relapse (Burst Intensity) • Tolerance (Burst Length) ⟨avg, median, std, max, min⟩ of burst intensity and burst length
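A sketch of assembling the TEMP feature vector, assuming the per-user burst intensities and burst lengths were already produced by a Kleinberg-style burst detector (the detector itself is not implemented here).

```python
# TEMP features: <avg, median, std, max, min> of burst intensity and length.
import numpy as np

def temp_features(burst_intensity, burst_length):
    def stats(x):
        x = np.asarray(x, dtype=float)
        return [x.mean(), np.median(x), x.std(), x.max(), x.min()]
    # intensity relates to relapse, length to tolerance
    return stats(burst_intensity) + stats(burst_length)

print(temp_features([1.2, 3.4, 2.2, 5.0], [2, 7, 3, 4]))
```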
  17. Usage Time (UT) 1. Duration 2. Number of Online States

    • Use relapse and tolerance to distinguish users with SNMDs from heavy users • e.g. the standard deviation is smaller for heavy users
  18. Disinhibition Based Features (DIS) • Disinhibition is also one of

    the primary reasons for SNMDs • Related to CR, NC • Average Clustering Coefficient
  19. Disinhibition Based Features (DIS) • Anonymous OSNs have a stronger disinhibition

    effect • We are using non-anonymous OSNs • The disinhibition effect on non-anonymous OSNs is hard to detect directly
  20. Disinhibition Based Features (DIS) • The average clustering coefficient (CC) on anonymous

    OSNs is smaller than on non-anonymous ones • Smaller average CC → more anonymous-like behavior → stronger disinhibition effect
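A minimal sketch of the clustering-coefficient proxy using networkx on a toy friendship graph; the edge list is illustrative.

```python
# DIS proxy: average clustering coefficient of a user's friendship graph.
import networkx as nx

edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]   # toy friendship graph
G = nx.Graph(edges)
print(nx.average_clustering(G))   # smaller values -> behavior closer to anonymous OSNs
```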
  21. Profile Features (PROF) • User Profile • e.g. age, gender

    • e.g. • Female: online communication • Male: following news and playing games • The number of game posts is extracted as a feature
  22. STM

  23. Multi-Source Learning • Naive Way • Concatenate the features from

    different OSNs • Why is the naive way bad? • It misses correlations across different OSNs • It might introduce interference (see the sketch below)
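A small sketch contrasting the naive concatenation with the tensor view used by STM, assuming two per-OSN feature matrices of shape (N, D); the arrays are random placeholders (e.g., Facebook and Instagram features for the same N users).

```python
# Naive concatenation vs. stacking features into an N x D x M tensor.
import numpy as np

N, D = 100, 12
fb, ig = np.random.rand(N, D), np.random.rand(N, D)

concat = np.concatenate([fb, ig], axis=1)   # naive way: (N, 2D), sources mixed into one flat vector
tensor = np.stack([fb, ig], axis=2)         # STM input: (N, D, 2), keeps the OSN-source mode explicit
print(concat.shape, tensor.shape)           # (100, 24) (100, 12, 2)
```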
  24. Tucker Decomposition • Tensor T • D SNMD features •

    N users • M OSN sources • Extract a latent user-feature matrix U • Used to estimate deficit (missing) features • From corresponding features of other OSNs • From other users with similar behavior
    Tucker decomposition of a tensor T ∈ R^(N×D×M) is defined as T = C ×₁ U ×₂ V ×₃ W (Eq. 1), where U ∈ R^(N×R), V ∈ R^(D×S), and W ∈ R^(M×T) are latent matrices, C is the core tensor, and R, S, T are parameters set according to different criteria. The 1-mode product of C ∈ R^(R×S×T) and U ∈ R^(N×R), denoted C ×₁ U, is the N×S×T tensor with elements (C ×₁ U)_nst = Σ_{r=1..R} c_rst u_nr. Given the input tensor T of all users' features from every OSN, Tucker decomposition derives C, U, V, and W to meet this equality; regarding the row u_i: of U as the latent features of user i integrates the information from the different networks.
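A sketch of plain Tucker decomposition on the N×D×M feature tensor using tensorly; the ranks and the random tensor are illustrative, and STM itself adds the regularization terms shown on the later slides rather than using this off-the-shelf call.

```python
# Plain Tucker decomposition to obtain the latent user matrix U.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

N, D, M = 100, 12, 2                          # users, SNMD features, OSN sources
T = tl.tensor(np.random.rand(N, D, M))
core, (U, V, W) = tucker(T, rank=[10, 6, 2])  # core tensor C and factors U, V, W
print(U.shape)                                # (100, 10): latent user features for SNMDD
```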
  25. Tucker Decomposition • C needs to be diagonal • U,

    V, W need to be orthogonal • Under these constraints, Tucker decomposition derives C, U, V, and W so that T = C ×₁ U ×₂ V ×₃ W holds for every entry T_ndm; the row u_i: of U then gives the latent features of user i.
  26. STM • Objective function (Eq. 2)

    L(U, V, W, C) = ½ ∥T − C ×₁ U ×₂ V ×₃ W∥² + (λ₁/2) tr(Uᵀ L_a U) + (λ₂/2) ∥U∥², where tr(·) denotes the matrix trace, the Frobenius norm of a tensor is ∥T∥ = √⟨T, T⟩, and λ₁ and λ₂ control the contribution of each part of the collaborative factorization. Note that Eq. (1) does not always need to hold exactly, since other goals are also incorporated in the model.
  27. STM • Decomposition Error

    The first term of Eq. (2), ½ ∥T − C ×₁ U ×₂ V ×₃ W∥², is the decomposition error for T, which L minimizes first.
  28. STM • Regularization

    The term (λ₂/2) ∥U∥² regularizes the factorization to derive a more concise latent feature matrix U.
  29. STM • Graph Regularization

    The term (λ₁/2) tr(Uᵀ L_a U) uses the Laplacian L_a of the weighted friendship network to encourage friends to have similar latent features.
  30. STM • Take the output U as the feature vectors of users

    Each row u_i: of U serves as the multi-source latent feature vector of user i and is fed into SNMDD.
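A sketch of evaluating the STM objective of Eq. (2) for given factors, using tensorly's multi_mode_dot for C ×₁ U ×₂ V ×₃ W. The Laplacian, λ values, and shapes are placeholders, and the gradient updates the paper uses to minimize this objective are not reproduced here.

```python
# Evaluate L(U, V, W, C) = fit + graph regularization + norm regularization.
import numpy as np
import tensorly as tl
from tensorly.tenalg import multi_mode_dot

def stm_objective(T, C, U, V, W, L_a, lam1=0.1, lam2=0.1):
    approx = multi_mode_dot(C, [U, V, W], modes=[0, 1, 2])
    fit = 0.5 * tl.norm(T - approx) ** 2            # decomposition error
    graph = 0.5 * lam1 * np.trace(U.T @ L_a @ U)    # friends get similar latent features
    reg = 0.5 * lam2 * tl.norm(U) ** 2              # keeps U concise
    return fit + graph + reg

# Random placeholders with consistent shapes:
N, D, M, R, S, Tm = 50, 12, 2, 8, 6, 2
T = tl.tensor(np.random.rand(N, D, M))
C = tl.tensor(np.random.rand(R, S, Tm))
U, V, W = np.random.rand(N, R), np.random.rand(D, S), np.random.rand(M, Tm)
L_a = np.eye(N)                                     # placeholder friendship Laplacian
print(stm_objective(T, C, U, V, W, L_a))
```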
  31. Data • Data from Amazon Mechanical Turk • 3126 people

    • 1790 Male, 1336 Female • 389 SNMD, 2737 non-SNMD • 246 CR • 267 IO • 73 NC • 57% Male / 43% Female • 12% SNMD / 88% non-SNMD
  32. Data from 3126 Users 1. Fill out SNMD questionnaires 2.

    Psychiatrists label each user
 (potential SNMD or not) 3. Crawl FB and IG data via the APIs for these users
  33. Effectiveness of Features • All • Social • Personal •

    Duration (using only duration to detect; see the ablation sketch below)
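A sketch of the leave-one-group-out ablation behind the feature-effectiveness analysis, assuming a labeled dataset (X, y) and a dict mapping feature-group names (PR, ONOFF, ...) to column indices; the classifier and scoring setup are illustrative, not the paper's exact protocol.

```python
# Ablation: accuracy with each single feature group and with "All minus group".
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def ablation_scores(X, y, feature_groups):
    scores = {}
    all_cols = sorted({c for cols in feature_groups.values() for c in cols})
    scores["All"] = cross_val_score(SVC(), X[:, all_cols], y, cv=5).mean()
    for name, cols in feature_groups.items():
        scores[name] = cross_val_score(SVC(), X[:, cols], y, cv=5).mean()
        keep = [c for c in all_cols if c not in cols]
        scores[f"All-{name}"] = cross_val_score(SVC(), X[:, keep], y, cv=5).mean()
    return scores

# e.g. ablation_scores(X, y, {"PR": [0], "ONOFF": [1], "TEMP": [2, 3, 4]})
```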
  34. Large Scale Experiment • Whether friends of SNMD users tend to

    be potential SNMD users • Without ground truth
  35. Result

    [Top features per SNMD type, with per-type accuracies of 80.2%, 76.8%, and 82.7%: parasociality, game posts, median of BI, online/offline ratio, sticker number, SD of BL, number of selfies, CC.]
    Table 5: Feature effectiveness analysis: SNMDD accuracy on the FB US dataset.
    Used Features   Accuracy    Used Features   Accuracy
    PR              56.9%       All–PR          78.2%
    ONOFF           60.3%       All–ONOFF       75.1%
    SC              40.1%       All–SC          78.8%
    SSB             44.4%       All–SSB         79.3%
    SD              58.9%       All–SD          73.2%
    TEMP            67.5%       All–TEMP        68.1%
    UT              36.4%       All–UT          82.6%
    DIS             54.0%       All–DIS         75.9%
    PROF            18.2%       All–PROF        81.5%
    All             83.1%
  36. Inference • Number of selfies • Useful for detecting NC,

    not useful for IO, CR • Since NC users are less socially active • Parasociality • Effective on all SNMD types (especially for CR) • Burst intensity and length • Useful for detecting IO • PROF is the least important feature group
  37. Result

    Figure 2: Relative accuracy change with respect to the number of features. (a) Relative improvement w.r.t. the number of features. (b) Relative improvement w.r.t. the number of latent features (STM).
    [Charts: proportion of friends with SNMDs in FB_L and IG_L, broken down by SNMD type (CR, NC, IO, NA), and proportion of SNMD users in FB_L and IG_L by type.]
  38. Q & A