Automatic Persona Generation

Joni
October 26, 2021

Dr. Joni Salminen
Invited talk at IT University Copenhagen
September 30, 2021

https://persona.qcri.org

#personas #data #analytics

Transcript

  1. Automatic Persona Generation: Introduction & Current Challenges

    Dr. Joni Salminen
    IT University Copenhagen, September 30, 2021
  2. Meet the APG Team!

    Professor Jim Jansen, The Leader (Principal Scientist)
    • Inventor of APG
    • Leads the project
    • Customer relationships & management
    MSc. Soon-gyo Jung, The Genius (Software Engineer)
    • Creator of APG
    • Front-End / Back-End
    • Implements like a genius, hence the nickname
    Dr. Joni Salminen, The Handyman (Scientist)
    • Helps with user studies, system development, etc.
    • Strategic guy, likes to think about the big picture
  3. Giving faces to user data?

    Personas…
    • Summarize relevant user information for the decision makers who need it
    • Are an alternative (or complement) to numbers
    • Provide a different way of doing user/customer analytics (more approachable & memorable)
    …are not just about visualization, but empathetic representations of users!
    Nielsen, L. (2019). Personas—User Focused Design (2nd ed.). Springer.
  4. Literally, faces!

    Personification = nameless, faceless segments are turned into personas that describe a behavioral and demographic pattern in the data
    Enrichment = enriching the persona profiles with additional information such as sentiment, loyalty, quotes, most viewed content, and topics of interest
  5. The process relies on data dimensionality reduction (non-negative matrix factorization)

    Jung, S., Salminen, J., Kwak, H., An, J., & Jansen, B. J. (2018). Automatic Persona Generation (APG): A Rationale and Demonstration. CHIIR ’18: Proceedings of the 2018 Conference on Human Information Interaction & Retrieval, 321–324. https://doi.org/10.1145/3176349.3176893
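A minimal sketch of how non-negative matrix factorization can surface behavioral patterns in aggregated data, assuming a small hypothetical matrix of view counts (rows are demographic groups, columns are content items). The group labels, content names, counts, and the `nmf` helper are illustrative assumptions, not APG's implementation. The factorization uses the standard Lee-Seung multiplicative updates in plain Python.

```python
import random

def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Factorize a non-negative matrix V (n x m) into W (n x k) and H (k x m)
    using multiplicative updates that reduce the squared reconstruction error."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]

    def matmul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

    def transpose(A):
        return [list(r) for r in zip(*A)]

    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return W, H

# Hypothetical aggregated data: view counts per demographic group and content item.
groups = ["F 18-24 US", "F 25-34 US", "M 25-34 IN", "M 35-44 IN"]
content = ["cooking", "baking", "cricket", "football"]
V = [
    [90, 80, 5, 0],
    [70, 85, 0, 10],
    [5, 0, 95, 60],
    [0, 10, 80, 90],
]

k = 2  # number of latent behavioral patterns (one candidate persona each)
W, H = nmf(V, k)

# Assign each group to the pattern that explains most of its views.
# Multiplying by sum(H[p]) makes the comparison independent of how
# the scale is split between W and H.
group_pattern = [
    max(range(k), key=lambda p: W[g][p] * sum(H[p])) for g in range(len(groups))
]
# The most characteristic content item of each pattern.
top_content = [
    content[max(range(len(content)), key=lambda c: H[p][c])] for p in range(k)
]

for p in range(k):
    members = [groups[g] for g in range(len(groups)) if group_pattern[g] == p]
    print(f"pattern {p}: top content = {top_content[p]}, groups = {members}")
```

In this sketch each latent pattern would become a candidate persona skeleton: the pattern supplies the behavior (which content it favors), and the groups loading on it supply candidate demographics.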
  6. Three ways in which “Personified Big Data” drives the automation of personas

    1. Access to online analytics and social media platforms via application programming interfaces (APIs) for end-user data
    2. Standardized format of aggregated end-user data (engagement metrics, demographic groups)
    3. Data analysis algorithms, libraries, and software tools that enable automation of the whole pipeline, from data collection to persona generation to serving via interactive persona systems (end-to-end)
    Salminen, J., Guan, K., Jung, S.-G., & Jansen, B. J. (2021). A Survey of 15 Years of Data-Driven Persona Development. International Journal of Human–Computer Interaction, 0(0), 1–24. https://doi.org/10.1080/10447318.2021.1908670
  7. Why automate persona generation?

    Manual methods: Personas are usually created with manual methods (e.g., interviews and ethnography) that are expensive and slow to implement, and the resulting personas can quickly become outdated. Because of these limitations, personas risk being inaccurate representations of the true user base.
    Automation: In contrast, APG provides personas that are fast to create and updated automatically. This means the cost of persona creation is dramatically reduced, making personas available to organizations with limited means (e.g., startups, small businesses). Depending on the underlying dataset, APG can cover a wide range of behaviors and demographics.
    Better personas → Better decisions → Better results.
    An, J., Kwak, H., Salminen, J., Jung, S., & Jansen, B. J. (2018). Imaginary People Representing Real Numbers: Generating Personas from Online Social Media Data. ACM Transactions on the Web (TWEB), 12(4), 27. https://doi.org/10.1145/3265986
  8. The brief history of data-driven personas (1999-2021)

    1999: Cooper
    • Establishes the need for personas in software development, design, and HCI
    2006: Mulder & Yaar
    • Define “Quantitative Personas” and different method types (Grudin and Pruitt had also done so in 2002 and 2003)
    2008: McGinn & Kotamraju, “Data-Driven Persona Development”
    • Provides statistical validation
    • Drawback: survey data
    2015: Zhang et al., “Clickstream Personas”
    • Used click data (online analytics)
    • Drawback: superficial personas (no demographics)
    2016: An et al., “Automatic Persona Generation”
    • Introduces social media data for persona generation (both text and numbers)
    • Introduces plans and vision for a system
    • Drawbacks: many observed challenges
    2017: Jung et al., “Automatic Persona Generation”
    • Introduces an interactive persona system using an ML pipeline and Web technologies
    • Drawbacks: many observed challenges
    2021: Salminen et al., “Persona Analytics”
    • Introduces eye- and mouse-tracking of persona users as a method for producing knowledge for persona science
    2021: Jansen et al., “Data-Driven Personas: The Book”
    • Summarizes five years of academic research and system development
    • Defines a roadmap for the future
  9. Macquarie University: “Holistic Personas”

    IT University Copenhagen: “Design Personas”
    Heilbronn Hochschule: “Critical Personas”
    QCRI: “Data-Driven Personas”
  10. Research Roadmap for Automatic Persona Generation (APG)

    APG is about finding better ways to process and choose useful user information from vast amounts of online data. “Personas are about giving faces to data.”
    • Information architecture: How to choose relevant persona information content and presentation for a given user, use case, and industry?
    • Quotes: How to find demographically matching, non-toxic comments that describe the persona’s attitudes and are relevant for end users?
    • Temporal analysis: How to analyze change of personas over time?
    • Applicability: How to create personas for specific industries (e.g., e-health, e-commerce, politics, gaming…)?
    • Image: How to automatically generate, tag, and choose appropriate persona profile pictures?
    • Evaluation: (1) How to ensure personas are of high quality (complete, clear, consistent, and credible)? (2) How to measure usefulness of personas for individuals and organizations?
    • Attributes & Topics of Interest: How to automatically infer user attributes, such as interests, needs, wants, goals, political orientation, and brand affinity from social media?
    Salminen, J., Jansen, B. J., An, J., Kwak, H., & Jung, S. (2019). Automatic Persona Generation for Online Content Creators: Conceptual Rationale and a Research Agenda. In L. Nielsen (Ed.), Personas—User Focused Design (2nd ed., pp. 135–160). Springer London. https://doi.org/10.1007/978-1-4471-7427-1_8
  11. APG’s links to Computer Science

    Challenge                | Potential solutions
    Image                    | Generative Adversarial Networks (GANs)
    Persona Attributes       | Text Classification, Topic Modeling (LDA)
    Quotes                   | Hate Speech Detection, Natural Language Processing (NLP)
    Persona Change           | Anomaly Detection, Concept Drift, Similarity Metrics, Tensor Factorization (TF)…
    Information Architecture | User Studies, Crowd Experiments, Human-Computer Interaction (HCI), Adaptive / Intelligent Systems, User Modeling, Information Science (IS)
    Persona Evaluation       | Factor Analysis, Structural Equation Modeling (SEM), Experiments, User Experience (UX), Usability and User Interface (UI) Design
  12. Issues about Pictures

    • Need for manual supervision / validation
    • Demographically imbalanced datasets
    • Conditional generation is currently not supported
  13. Salminen, J., Jung, S., Kamel, A. M. S., Santos, J. M., & Jansen, B. J. (2020). Using artificially generated pictures in customer-facing systems: An evaluation study with data-driven personas. Behaviour & Information Technology, 0(0), 1–17. https://doi.org/10.1080/0144929X.2020.1838610
  14. Issues about Algorithm

    • Is clustering or dimensionality reduction meaningful for user segmentation in the first place?
    • From a diversity standpoint, it seems not
    • Diversity maximization, or using diversity as a goal, has been largely ignored in user segmentation and persona creation
    • …how many personas should be created? (Depends on the goal: what is the goal?)
    • What algorithm performs the best? And what METRIC is the most appropriate (e.g., statistical distance vs. diversity)?
  15. Issues about Algorithm

    • Concept drift / topic drift / model drift…
    • All refer to CHANGE in the underlying user behavior (data)
    • How often should personas be changed? How should the change be measured / detected?
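One possible way to measure such change, sketched under stated assumptions: compare the content-engagement distribution behind a persona in two time windows with the Jensen-Shannon divergence and flag the persona for regeneration when the divergence exceeds a threshold. The counts, window names, and the `DRIFT_THRESHOLD` value are hypothetical; the slides do not prescribe a specific drift measure.

```python
import math

def normalize(counts):
    """Turn raw engagement counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical view counts for one persona over three topics, per time window.
january = normalize([120, 30, 10])
february = normalize([115, 35, 12])
june = normalize([20, 40, 150])

DRIFT_THRESHOLD = 0.1  # assumed cut-off; would need tuning on real data

for name, window in [("february", february), ("june", june)]:
    score = js_divergence(january, window)
    status = "regenerate persona" if score > DRIFT_THRESHOLD else "keep persona"
    print(f"january vs {name}: JSD = {score:.3f} -> {status}")
```

A symmetric, bounded measure is convenient here because the same threshold can then be applied to every persona, regardless of how much traffic it represents.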
  16. Issues about Quotes

    • Bødker’s “Frankenstein problem”: inconsistency of persona information
    • How to match the quotes with the personas’ demographics?
    • Inconvenient cases: man vs. woman, Indian vs. Pakistani, etc. (cultural sensibilities (Häkkilä et al.))
  17. Data is available, but what about information?

    • Attitudes, fears, doubts, hopes, needs, wants… can these be inferred from numbers?
    • Tweets contain a lot… The Rosetta Stone for data-driven personas: user modeling / soft attribute inference from smartly sampled tweets
    • …even more important because persona users’ information needs are unique: they need flexible tools to query persona attitudes in real time (static data-driven personas won’t do)
  18. Towards persona science?

    • Persona analytics = how decision-makers (i.e., persona users) in organizations use personas as analytical tools to better understand their users or customers.
    • Persona analytics = how persona creators or researchers investigate the behaviors of persona users.
    We define ‘persona analytics’ (PA) as the systematic measurement of behaviors and interactions of persona users engaged with interactive persona systems. When personas are provided through a web browser, PA takes place via mouse- (and eye-)tracking that records the persona users’ mouse (or gaze) movements and clicks (eye fixations) on the provided persona profiles and their information elements.
  19. Empirical Persona User Research

    (1) How do users interact with personas?
    (2) What persona information do users pay attention to?
    (3) What persona information causes users to change/reinforce their attitudes?
    (4) What persona information influences users’ decision making and how?
    (5) How and why do users choose a persona for their task?
    → Unified theory of personas?
    Jung, S., Salminen, J., & Jansen, B. J. (2021). Persona Analytics: Implementing Mouse-tracking for an Interactive Persona System. Extended Abstracts of ACM Human Factors in Computing Systems - CHI EA ’21.
  20. Jung, S., Salminen, J., & Jansen, B. J. (2021). Persona Analytics: Implementing Mouse-tracking for an Interactive Persona System. Extended Abstracts of ACM Human Factors in Computing Systems - CHI EA ’21.
  21. Next steps

    • Metrics: What measures and metrics do we want to analyze?
    • Hypotheses: Intervention → expected change in persona users’ behavior
    Table 1: Persona Analytics metrics (persona-based and user-based)
    • Time spent per persona
    • Number of visits per persona
    • Persona revisit frequency
    • Number of personas visited
    • Persona coverage
    • Persona visit distribution
    • Rank correlation
    Behavioral matters such as order effects, revisit frequency, persona comparisons, satisficing behavior, and choice can be investigated by deploying the persona state-transition matrix and Markov chain techniques. Persona information design can be informed by dwell time analyses, and typical persona viewing patterns and information viewing patterns can be deduced in interactive persona user studies using a live system.
    Jung, S., Salminen, J., & Jansen, B. J. (2021). Persona Analytics: Implementing Mouse-tracking for an Interactive Persona System. Extended Abstracts of ACM Human Factors in Computing Systems - CHI EA ’21.
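A rough sketch of how some of these metrics and a persona state-transition matrix could be computed from a tracking log. The log format, the user and persona identifiers, and the exact metric definitions (for example, coverage as the share of available personas a user viewed) are assumptions made for illustration, since the transcript does not preserve the formulas from Table 1.

```python
from collections import Counter, defaultdict

# Hypothetical tracking log: (user_id, persona_id, dwell_seconds),
# listed in viewing order for each user.
events = [
    ("u1", "A", 30), ("u1", "B", 10), ("u1", "A", 20),
    ("u2", "B", 15), ("u2", "C", 5),
]
personas = ["A", "B", "C", "D"]  # all personas offered by the system

time_per_persona = Counter()      # total dwell time per persona
visits_per_persona = Counter()    # number of visits per persona
seen_by_user = defaultdict(set)   # distinct personas each user viewed
transitions = defaultdict(Counter)  # counts of moves from one persona to the next
last_persona = {}

for user, persona, dwell in events:
    time_per_persona[persona] += dwell
    visits_per_persona[persona] += 1
    seen_by_user[user].add(persona)
    if user in last_persona:
        transitions[last_persona[user]][persona] += 1
    last_persona[user] = persona

# Assumed definition: share of available personas that a user looked at.
coverage = {user: len(seen) / len(personas) for user, seen in seen_by_user.items()}

# Row-normalized transition counts: an empirical first-order Markov chain
# over persona views.
transition_matrix = {
    src: {dst: n / sum(row.values()) for dst, n in row.items()}
    for src, row in transitions.items()
}

print("time per persona:", dict(time_per_persona))
print("visits per persona:", dict(visits_per_persona))
print("coverage per user:", coverage)
print("transition matrix:", transition_matrix)
```

Each row of `transition_matrix` estimates where a persona user goes next after viewing a given persona, which is the kind of input that order-effect and revisit analyses would start from.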
  22. Data-driven personas have room for all lines of research

    • Algorithmically oriented people can solve algorithmic problems in generation, validation, updating, etc.
    • Qualitatively oriented researchers can carry out user studies (e.g., observation, interviews)
    • Empirically oriented researchers can conduct experiments using real systems and controlled conditions
    • Theoretically oriented scholars can attempt to formulate theories of persona use and persona-user interaction
    …join the family ☺
  23. Thank you!

    Dr. Joni Salminen
    The APG family (2019)
    Get the book from Amazon! (or your library)