KU Presentation

Isa Dutse
March 09, 2020

My work: then, now and later

Transcript

  1. My work: then, now and later

    Dr Isa Inuwa-Dutse, FHEA, MBCS, MRSS
    PhD (Edge Hill) · MSc (Manchester) · BSc (Bayero University)
    School of Engineering and Computer Science, University of Hertfordshire, UK
    March
  2. Introduction Network Analysis Community Detection Pipeline Current and Future Work

    Outline
    - Introduction
    - Teaching Experience
    - Research Experience
    - Network Analysis
    - Community Detection Pipeline
    - Current and Future Work
  7. Teaching Experience

    University of Hertfordshire ( – date)
    - Visiting Lecturer, teaching Computational Algorithms and Paradigms to master's students
    Edge Hill University ( – )
    - Graduate Teaching Assistant
    Federal University Dutse ( – )
    - Lecturer, Department of Computer Science
    School of Technology Kano ( – )
    - Lecturer, Department of Statistics and Computer Science
    Future Teaching:
    - interested in NLP/ML-related modules ... and other related modules
  14. Research Interests

    artificial intelligence · text mining · social network analysis · computational sociometry
    Motivated by the revolutionary effects of the contemporary social media ecosystem, which:
    - supports various forms of interaction at different layers of granularity and intensity ... promotes datafication
    - participatory web: while empowering, it poses many challenges to mining-related tasks
    My research aims to contribute to a deeper understanding of such interactions and their consequences from the perspective of computational social science
    - focused on the problems of online content veracity and community detection
  15. Overview

    Free raw materials generated by users ... data
    - structured, e.g. in a database
    - unstructured, e.g. text, video files, social media data, etc.
    - semi-structured, e.g. web logs, click-through data, XML
    Numerous strategies to ease data generation
    Many challenges: spam, fake news, social bots, privacy, availability
  17. The eccentricity of Twitter connections challenges mining tasks

    Figure: Connection types on Twitter. Panels: (a) dyads, (b) RT and hashtag, (c) @mention, (d) relative proportion
  19. SPD strategy

    Designed to filter out irrelevant content
    - some spam and social bot accounts exhibit certain similarities, e.g. profile information
    - ... instrumental in the community detection approach
  20. Difficult to keep track of socially cohesive groups on Twitter

    ... inverse relationship between social cohesion and network size [ ]
    ... the large network size affects a user's ability to maintain a cohesive social relationship
    - Twitter: ≈ M daily users & ≈ M content items (www.omnicoreagency.com/twitter-statistics)
  25. Methodological viewpoints

    Social network theorists hold two methodological positions for investigating social ties: realist and nominalist
    - Social link formation: event-type tie or state-type tie
    Moreover, communities are formed around two primary modalities: network structure and attributes
    - work on social media networks, especially Twitter, mostly focuses on a single modality
    Level of trust is stronger among cliques [ ] ... and social homophily [ ]
    - cognitive balance theory: strong ties in a small group prevent unethical behaviour [ ]
    Groups of nodes with reciprocal ties will be more helpful in mining tasks:
    - Simmelian tie [ ], a strong social relationship within groups of three or more
    - My recent work reported how Simmelian ties and social homophily can be harnessed to improve mining-related tasks
  27. Basic units of structural relationships – dyadic and transitive ties

    Given the sets of nodes $a, b, c, \ldots, n \in V$ and edges $e_1, e_2, \ldots, e_n \in E$, with $V, E \in D$
    - the target is to find the likelihood of reciprocity $p(R_{a,b})$ between any pair of nodes
    - collected and analysed user profiles from m accounts (Table )
    - the data is freely available for further analysis
    Table: Dataset Summary. C: Category; S: Seed Size; V: Visited users; P: Pairwise ties; T: Transitive ties; D: Search duration (min.)
    Categories: verified (three rows), unverified (three rows) and ego-Twitter
    https://github.com/ijdutse/simmelian_ties_on_Twitter
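As a rough illustration of the two basic units on this slide, the sketch below computes the share of reciprocated (dyadic) ties and lists fully reciprocal triads in a small directed follower graph. The edge list and the function names are hypothetical, and a Simmelian triad is taken here to mean three nodes whose three pairs are all tied reciprocally; this is one common reading and not necessarily the exact definition used in the linked repository.

```python
from itertools import combinations


def reciprocal_graph(edges):
    """Return an undirected adjacency map that keeps only reciprocated ties."""
    directed = set(edges)
    adj = {}
    for a, b in directed:
        if a != b and (b, a) in directed:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def reciprocity(edges):
    """Fraction of directed ties (self-loops ignored) that are reciprocated."""
    directed = {(a, b) for a, b in edges if a != b}
    if not directed:
        return 0.0
    mutual = sum(1 for a, b in directed if (b, a) in directed)
    return mutual / len(directed)


def simmelian_triads(edges):
    """Node triples in which all three pairs are tied reciprocally."""
    adj = reciprocal_graph(edges)
    triads = set()
    for a, b in combinations(sorted(adj), 2):
        if b in adj[a]:
            # any common reciprocal neighbour closes a fully reciprocal triad
            for c in adj[a] & adj[b]:
                triads.add(tuple(sorted((a, b, c))))
    return triads


# Hypothetical toy follower graph: (x, y) means "x follows y".
follows = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"),
           ("a", "c"), ("c", "a"), ("c", "d")]
print(reciprocity(follows))
print(simmelian_triads(follows))
```

Here the unreciprocated tie from c to d lowers the reciprocity score and keeps d out of every triad.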
  28. Meta-analysis: exploratory analyses for better insights

    Distribution of ties and relevant metrics in the data using the ECDF, which gives the fraction of observations at or below any chosen value
    - x axis: the measured quantity
    - y axis: the fraction of the data (%)
    Useful in answering questions, e.g. how a certain quantity varies vis-à-vis another
    - user category: a higher proportion of reciprocal ties exists in the unverified users category
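The ECDF described above can be sketched in a few lines. The function names and the two small samples are made up purely for illustration and are not taken from the dataset.

```python
def ecdf(values):
    """Return the sorted values and the cumulative fraction at or below each one."""
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


def ecdf_at(values, x):
    """Fraction of observations less than or equal to x."""
    if not values:
        return 0.0
    return sum(1 for v in values if v <= x) / len(values)


# Invented toy samples: per-user proportion of reciprocated ties.
verified = [0.05, 0.10, 0.10, 0.30]
unverified = [0.20, 0.45, 0.60, 0.80]

# Compare the two categories at the same point on the x axis.
print(ecdf_at(verified, 0.10), ecdf_at(unverified, 0.10))
```

Reading both curves at the same x value is what allows one user category to be compared against another.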
  34. Observations on the likelihood of reciprocity

    Likelihood of reciprocity and profile attributes
    A user's network size is likely to grow if the user:
    - is verified ... and has many followers
    The likelihood of reciprocity is high if:
    - the user is unverified ... has a relatively large network
    Unverified users are more likely to reciprocate a followership
    Users with large networks (> K) have a low proportion of reciprocated ties
  37. Bayesian process of simulating the real data and making inference

    Why? Due to the rarity of transitive ties
    Given a set of nodes X, Bayesian analysis is useful to:
    - estimate mean reciprocity and mean values for features such as indegree and outdegree in the ground-truth data
    - quantify the uncertainty associated with the data and its scaled version
    Figure: Bayesian analysis workflow
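The slide does not say which Bayesian model was used, so the following is only a minimal sketch under an assumed Beta-Binomial model: each observed tie is treated as reciprocated or not, a uniform Beta(1, 1) prior is placed on the mean reciprocity, and the posterior is summarised by its mean and a sampled 95% credible interval. The counts and the function names are hypothetical.

```python
import random


def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior Beta parameters after observing `successes` out of `trials`."""
    return prior_a + successes, prior_b + (trials - successes)


def posterior_summary(a, b, draws=20000, seed=0):
    """Posterior mean and a sampled 95% credible interval for a Beta(a, b)."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return a / (a + b), (lower, upper)


# Hypothetical counts: 30 reciprocated ties among 100 observed ties.
a, b = beta_posterior(30, 100)
mean, (lower, upper) = posterior_summary(a, b)
print(mean, lower, upper)
```

The width of the interval is one way to express the uncertainty that the slide refers to; with rarer events such as transitive ties, fewer successes would widen it.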
  41. Using user-centric attributes to estimate likelihood of reciprocity

    Figure: (a) possible triads and (b) relevant features
    Nodes can be grouped according to structural similarity, which is based on link information
    ... such information is often not available, hence the need to infer/predict
    Key challenges:
    - obtaining complete information about the network is an NP-hard problem
    Propose a model to predict ties between users
    - modelled as functions of easily accessible features $f \in X$
  43. Reciprocity Prediction: does similarity between users' attributes induce reciprocity?

    Using attributes of nodes to predict reciprocal ties
    ... focus on easily accessible attributes $X_f$: network size, indegree (ind), outdegree (out), category (cat), thus $\{ind, out, cat\} \subset X_f$
    For a pair of nodes $v_i, v_j$, the corresponding features are given by:
    $X_f^{v_i} = \{ind_{v_i}, out_{v_i}, cat_{v_i}\}$, $X_f^{v_j} = \{ind_{v_j}, out_{v_j}, cat_{v_j}\}$
    - ... ratio of corresponding attributes, e.g. $\frac{ind_{v_i}}{ind_{v_j}} \in \mathbb{R}$, $\forall f \in X_f^{v_i, v_j}$
    - ... if the ratio lies in [ . , . ], return similarity (1), else dissimilarity (0)
    - ... the resulting binaries are used to compute the overall similarity between pairs using the Jaccard Similarity Coefficient, J:
    $$J(X_f^{v_i}, X_f^{v_j}) = \frac{|X_f^{v_i} \cap X_f^{v_j}|}{|X_f^{v_i} \cup X_f^{v_j}|}$$
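The three steps above can be sketched as follows. The interval bounds are not recoverable from the slide, so `low=0.5` and `high=2.0` are placeholder assumptions, as are the function names, the equality rule for the categorical attribute, and the reading of the Jaccard coefficient as "features flagged similar, divided by all compared features".

```python
def feature_matches(u, v, low=0.5, high=2.0):
    """Binary similarity flag (1 or 0) for each feature shared by two users."""
    flags = {}
    for name in u.keys() & v.keys():
        a, b = u[name], v[name]
        if isinstance(a, str) or isinstance(b, str):
            # categorical attribute (e.g. category): similar only if equal
            flags[name] = int(a == b)
        elif b == 0:
            # avoid division by zero: similar only if both values are zero
            flags[name] = int(a == 0)
        else:
            flags[name] = int(low <= a / b <= high)
    return flags


def jaccard(u, v, low=0.5, high=2.0):
    """Share of compared features flagged as similar."""
    flags = feature_matches(u, v, low, high)
    if not flags:
        return 0.0
    return sum(flags.values()) / len(flags)


# Hypothetical profiles for a pair of users.
vi = {"ind": 120, "out": 100, "cat": "unverified"}
vj = {"ind": 100, "out": 400, "cat": "unverified"}
print(feature_matches(vi, vj))
print(jaccard(vi, vj))
```

Choosing bounds that are reciprocals of each other keeps the flag the same whichever user is placed in the numerator.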
  44. Each prediction is associated with a decision error

    ... modelled using a probabilistic preference model or response probability
    - response probability captures various scenarios in which an actor is offered a set of features
    - ... and the decision process is associated with a constant probability of making an error in the choice, the trembling hand error [ ]
    Thus, the error term $\epsilon_{v_i,v_j}$ is given as a function of the similarity index $J(v_i, v_j)$ between pairs:
    $$\epsilon_{v_i,v_j} = \zeta \times \left(1 + \log\left(J(v_i, v_j) + \zeta\right)\right)$$
    where $\zeta$ corresponds to the constant error term. The final relation is given by:
    $$p(R_{v_i,v_j}) = \frac{1}{1 + \exp(\phi)}, \qquad \phi = -\log\left(\epsilon_{v_i,v_j} + J(v_i, v_j)\right) \times \left(\epsilon_{v_i,v_j} + J(v_i, v_j)\right)$$
    We use $\zeta \geq$ . and each item in the predicted ties, $\kappa$, satisfies (Eq. )
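A small numerical sketch of the error term and the final relation may help in reading the formulas. It assumes the constants lost from the slide are 1 (a logistic form with $1 + \log(\cdot)$ in the error term), and the default `zeta=0.5` is an illustrative value only, since the threshold on $\zeta$ is not recoverable here. The function names are hypothetical.

```python
import math


def error_term(j, zeta):
    """Error term: zeta * (1 + log(J + zeta)), with the constant 1 assumed."""
    return zeta * (1.0 + math.log(j + zeta))


def reciprocity_probability(j, zeta=0.5):
    """p(R) = 1 / (1 + exp(phi)) with phi = -log(eps + J) * (eps + J)."""
    x = error_term(j, zeta) + j
    if x <= 0.0:
        raise ValueError("eps + J must be positive for the logarithm to exist")
    phi = -math.log(x) * x
    return 1.0 / (1.0 + math.exp(phi))


for j in (0.0, 0.5, 1.0):
    print(j, round(reciprocity_probability(j), 4))
```

Under these assumptions a pair with full attribute similarity (J = 1) receives a probability above one half, while J = 0.5 sits exactly at one half.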
  46. The relation for $p(R_{v_i,v_j})$ enables the computation of the probability of reciprocity

    - we can identify as many nodes with a high likelihood of establishing reciprocal ties as required
    - adding a layer of social cohesion related to community detection
    The likelihood of a reciprocal tie between any pair of users:
    $$L(R_{v_i,v_j}) = 1 - \prod_{f \in \chi_s} \left(1 - p(R_{v_i,v_j})\right)$$
    This defines a generative process where $p(R_{v_i,v_j})$ is the marginal reciprocity effect of each feature $f \in X_f$
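Reading the likelihood as one minus a product over features, where each feature contributes its own marginal probability, the computation can be sketched as below. The per-feature probabilities in the example are invented, and the function name is hypothetical.

```python
def reciprocal_tie_likelihood(per_feature_probs):
    """L = 1 - prod(1 - p_f): at least one feature induces the reciprocal tie."""
    miss = 1.0
    for p in per_feature_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each marginal probability must lie in [0, 1]")
        miss *= 1.0 - p
    return 1.0 - miss


# Invented marginal effects for, say, indegree, outdegree and category.
print(reciprocal_tie_likelihood([0.2, 0.5, 0.1]))
```

Each additional feature with a non-zero marginal effect can only raise the likelihood, which matches the generative reading on the slide.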
  47. Utility of Simmelian ties

    Hop-skippers are helpful in community detection
    Content veracity:
    - they represent strong relationships, vital in preventing unethical behaviours [ , ]
    Information diffusion:
    - ... some nodes act as hop-skippers, i.e. users with many reciprocal ties
    - instrumental in connecting disparate community parts
    We use the term hop-skippers à la [ ]
  48. Enhancing community detection

    Community detection on three different datasets
    - clustering using a collection of nodes with Simmelian ties – ground-truth vs. predicted
    Table columns: Dataset; for each of G–N and LP, Metric (Q, NMI) and #DC
    Table rows: Ground-truth, ego-Twitter, Predicted
    G–N and LP: Girvan–Newman [ ] and Label Propagation [ ] algorithms
    #DC: Number of Detected Communities
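Assuming Q in the table denotes Newman's modularity (the slide does not expand the abbreviation), the score of a detected partition can be computed as in this sketch for an undirected, unweighted graph. The toy graph and the function name are hypothetical; the partitions themselves would come from an algorithm such as Girvan–Newman or Label Propagation.

```python
def modularity(edges, communities):
    """Modularity Q = sum over communities of (L_c / m) - (d_c / 2m)^2."""
    m = len(edges)
    if m == 0:
        return 0.0
    label = {node: idx for idx, group in enumerate(communities) for node in group}
    degree_sum = [0] * len(communities)  # d_c: total degree inside community c
    internal = [0] * len(communities)    # L_c: edges with both ends in c
    for a, b in edges:
        degree_sum[label[a]] += 1
        degree_sum[label[b]] += 1
        if label[a] == label[b]:
            internal[label[a]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in range(len(communities)))


# Toy graph: two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
print(modularity(edges, [{1, 2, 3}, {4, 5, 6}]))
print(modularity(edges, [{1, 2, 3, 4, 5, 6}]))
```

Splitting the toy graph at the bridge scores higher than keeping all nodes in one community, which is the behaviour a modularity-based comparison relies on.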
  49. Multilevel Clustering Technique (MCT)

    The MCT framework incorporates those insights ...
    Based on the bimodal approach
    - recognises structural and node attributes for community detection
    The MCT framework consists of interdependent units for the detection of cohesive communities, termed microcosms, on Twitter
  50. Working on: Graph embedding for community detection

    Motivated by the success of word embedding techniques
    The goal is to develop an effective technique for embedding graph data (augmenting both textual and structural information)
    Embedding: maps high-dimensional data to a low-dimensional space with minimal loss of integrity
    Prevailing challenges:
    - graph data is usually huge, making a direct application of embedding techniques computationally expensive
    - ... studies often resort to heuristics, at the cost of sub-optimal performance
    - augmenting both structural and textual attributes is still a challenge
  51. Future Work

    The success of deep learning models and the current data deluge excite researchers
    Plan to harness the power of social media to empirically validate relevant theories in social network analysis
    - e.g., questions about the role of hop-skippers in optimum information diffusion and content validation
    - study sociometry from the perspectives of homophily, centrality metrics, casual acquaintances and structural holes
    - manifestation of multiplexity of relationships and constructionism
    Investigate new developments in community detection algorithms, especially concerning the evaluation of the proposed microcosm detection framework
    - of interest: how active or visible are the communities built on dyadic or Simmelian ties?
    These are some of the open research agendas to explore
    - ... and develop relevant proposals for funding and seek collaborations