
Data-Driven Personas: A Tutorial

Joni
June 28, 2022


Tutorial at HCI International 2022. Organizers: Dr. Joni Salminen, Dr. Jim Jansen, and MSc. Soon-gyo Jung. More information about the systems mentioned: Automatic Persona Generation (https://persona.qcri.org) and Survey2Persona (https://s2p.qcri.org). Research: https://persona.qcri.org/persona-research





  1. Tutorial on Data-Driven Personas (HCII ‘22) Joni Salminen University of

    Vaasa, Vaasa, Finland (jonisalm@uwasa.fi) Jim Jansen Qatar Computing Research Institute, Doha, Qatar (bjansen@hbku.edu.qa) Soon-gyo Jung Qatar Computing Research Institute, Doha, Qatar (sjung@hbku.edu.qa)
  2. The persona team (Qatar & Finland) Professor Jim Jansen The

    Leader (Principal Scientist) • Inventor of Automatic Persona Generation • Leads the project • Customer relationships & management MSc. Soon-gyo Jung The Wizard (Software Engineer) • Creator of multiple interactive persona systems • Front-End / Back-End • Implements like a genius, hence the nickname Dr. Joni Salminen The Handyman (Researcher) • Helps with user studies, system development, etc. • Strategic guy, likes to think about the big picture ? ? YOU? If you feel like having an adventure, join us :) Reach out: Dr. Jim Jansen bjansen@hbku.edu.qa
  3. None
  4. None
  5. Tutorial content 1. Introduction to data-driven personas (lecture-type) (~45 mins)

    2. Intro to hands-on assignment (~15 mins) 3. (Carrying out the task!) (i.e., creating personas) (2 hours) 4. Presenting the personas and discussing them in group (1 hour)
  6. What is a persona? • A ‘persona’ is a fictive

    person describing an important user or customer group. [1] • Personas simplify numerical data into an easily understandable format: another human being • Personas help communicate numbers in the organization, so that decisions can be made keeping the users or customers in mind. • Personas are usually presented in profiles, although implicit use of personas is becoming more common. (An automatically generated persona profile) [1] Cooper, A. (1999). The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity (1st ed.). Sams - Pearson Education.
  7. Why personas? Because they… • Summarize information about end-users /

    customers for decision makers • Are an alternative (or complement!) to raw numbers, statistics, and figures/tables • Provide a different way of doing user/customer analytics (in some ways more approachable & memorable) • Give faces to user data …are not just about visualization, but empathetic representations of users! [1] [1] Nielsen, L. (2019). Personas—User Focused Design (2nd ed.). Springer.
  8. …but why data-driven personas?

  9. Why data-driven personas? [1] Personas are usually created with manual

    methods (i.e., interviews & ethnography), methods that are expensive and slow to implement, and they can quickly become outdated. Because of the limitations, personas risk being inaccurate representations of the true user base. Better personas Better decisions Better results. In contrast, APG provides personas that are fast to create and updated automatically. This means the cost of persona creation is dramatically reduced, making them available for organizations with limited means (e.g., startups, small businesses). Depending on the underlying dataset, APG can cover a wide range of behaviors and demographics. Manual methods Automation [1] An, J., Kwak, H., Salminen, J., Jung, S., & Jansen, B. J. (2018). Imaginary People Representing Real Numbers: Generating Personas from Online Social Media Data. ACM Transactions on the Web (TWEB), 12(4), 27. https://doi.org/10.1145/3265986
  10. Three drivers for the automation of personas [1] 1. Access

    to online analytics and social media platforms via application programming interfaces (APIs) for end-user data 2. Standardized format of aggregated end-user data (engagement metrics, demographic groups) 3. Data analysis algorithms, libraries and software tools that enable automation of the whole pipeline from data collection to persona generation to serving via interactive persona systems (end-to-end). [1] Salminen, J., Guan, K., Jung, S.-G., & Jansen, B. J. (2021). A Survey of 15 Years of Data-Driven Persona Development. International Journal of Human–Computer Interaction, 0(0), 1–24. https://doi.org/10.1080/10447318.2021.1908670
  11. APG – Automatic Persona Generation Keeping the focus on a

    person! “Personas give faces to data.” A lot of numbers … James, a 22-year-old single, sales, frequent traveler. …and they are a great way to communicate within a team or organization. vs.
  12. Fully deployed: https://persona.qcri.org • A system for automatically creating personas

    from online analytics data • Proven capability to process hundreds of millions of user interactions from YouTube, Facebook, and Google Analytics. Plus, individual customer data! • Stable, robust stack using Flask framework, PostgreSQL database, Pandas/scikit-learn libraries.
  13. The idea of ’full-stack personas’ Strategic Common focus across business

    units representing heterogeneous customer populations. Operational Use to create better content, products or other customer outputs. Tactical Data and numbers available for the ‘last mile’. Abstract with personas to Concrete with numbers all in one system! Few numbers Aggregate numbers All the numbers
  14. Survey2Persona • a tool for analysis and visualization of survey

    data • requires no knowledge of statistics from the user – all processing via point-and-click interfaces • transforms numerical survey responses and demographic data into ‘personas’ for actionable insights • provides intuitive filtering that allows focusing on one customer segment or one portion of the survey https://s2p.qcri.org
  15. The End to End Survey2Persona Process! Survey Development Consulting Survey

    Data Collection Persona Generation Human in the Loop All major survey tools The magic! (really, it’s algorithms!) Actionable Insights! Feedback Cycle: Surveys Tailored to Personas optimize processes & increase efficiency via learning, self-correction, and segmentation targeting
  16. Survey2Persona • Local Collaborator: Strategic Marketing, Qatar Foundation Communications •

    Conducted a Global Branding Survey of 1000+ participants across 10 countries (China, France, Germany, Kuwait, Morocco, Oman, Qatar, Turkey, UK, US) • Generated personas for enhanced analysis of survey analytics for demographic and response views
  17. Data-driven persona systems afford (a) multiple analytical views and (b)

  18. Learn about a persona How to use? Understand the people

    you’re creating content for. Learn about your customer base: Who are they? Loyalty How big a market segment do they represent? Sentiment Interests
  19. Predict customer reaction to content Can test out product prior

    to publishing See how product resonates with personas, from positively to negatively
  20. Customer growth potential How to use? Identify gaps in current

    and potential audience. Where is the growth potential? Discover growth potential by comparing your current reach to the total market potential of the segment. (e.g., Sachin has growth potential!)
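
The comparison on this slide can be sketched in a few lines. This is a minimal illustration and not how APG computes the figure: the `growth_potential` helper, the second persona, and all reach and market-size numbers are hypothetical and exist only for the example.

```python
def growth_potential(current_reach: int, market_size: int) -> float:
    """Share of a segment's total market that is not yet reached (0.0 to 1.0)."""
    if market_size <= 0:
        raise ValueError("market_size must be positive")
    # Cap reach at the market size so the result never goes negative.
    reached = min(current_reach, market_size)
    return 1 - reached / market_size


# Hypothetical numbers for illustration only.
segments = {
    "Sachin": {"current_reach": 2_000, "market_size": 10_000},
    "Persona B": {"current_reach": 4_500, "market_size": 5_000},
}

# Personas with the largest unreached share come first.
ranked = sorted(
    segments,
    key=lambda name: growth_potential(**segments[name]),
    reverse=True,
)
```

With these made-up figures, Sachin ranks first because most of that segment's market is still unreached.
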
  21. View how personas interact with content How to use? Analyze

    how well your content interests different customer segments. Learn what content is engaged with by different personas, to understand market preferences • Can filter by period • Variety of sorting options
  22. How are data-driven personas created?

  23. 1. First, retrieve data from online channels (e.g., YouTube

  24. 2. Then, generate personas from the data sources. Can

    have data-driven personas in a day! It’s that easy!
  25. Personification = nameless, faceless segments are turned into personas that

    describe a behavioral and demographic pattern in the data Literally, faces! Enrichment = enriching the persona profiles with additional information such as sentiment, loyalty, quotes, most viewed content, and topics of interest
  26. APG’s process relies on data dimensionality reduction (Non-negative matrix

    factorization [1]) [1] Jung, S., Salminen, J., Kwak, H., An, J., & Jansen, B. J. (2018). Automatic Persona Generation (APG): A Rationale and Demonstration. CHIIR ’18: Proceedings of the 2018 Conference on Human Information Interaction & Retrieval, 321–324. https://doi.org/10.1145/3176349.3176893
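
To make the idea of non-negative matrix factorization concrete, here is a toy sketch, not APG's implementation. It assumes a small matrix of view counts (rows are demographic groups, columns are content items) and factors it into two latent behavioral patterns using the standard multiplicative update rules. The group labels, content labels, counts, and helper names are all invented for illustration. In practice one would more likely use a library implementation such as the one in scikit-learn, which the deck lists as part of the APG stack.

```python
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def nmf(v, k, iterations=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (n x m) into w (n x k) and h (k x m)
    with multiplicative updates that reduce the squared reconstruction error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[c][j] * num[c][j] / (den[c][j] + eps) for j in range(m)]
             for c in range(k)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][c] * num[i][c] / (den[i][c] + eps) for c in range(k)]
             for i in range(n)]
    return w, h


# Hypothetical view counts: rows are demographic groups, columns are content items.
groups = ["Women 18-24", "Men 18-24", "Women 45-54", "Men 45-54"]
content = ["music", "gaming", "news", "documentary"]
views = [
    [9, 8, 0, 0],
    [7, 9, 1, 0],
    [0, 1, 8, 9],
    [0, 0, 9, 7],
]

w, h = nmf(views, k=2)


def dominant_pattern(i):
    # Views of group i explained by pattern c; this measure does not depend
    # on how the factorization happens to scale w versus h.
    return max(range(len(h)), key=lambda c: w[i][c] * sum(h[c]))


patterns = [dominant_pattern(i) for i in range(len(groups))]
```

Each row of `h` is a latent pattern of content consumption and each row of `w` says how strongly a group follows each pattern. Groups that share a dominant pattern are candidates for the same persona.
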
  27. None
  28. APG’s connections to Computer Science Challenge Potential solutions Image Generative

    Adversarial Networks (GANs) Persona Attributes Text Classification, Topic Modeling (LDA) Quotes Hate Speech Detection, Natural Language Processing (NLP) Persona Change Anomaly Detection, Concept Drift, Similarity Metrics, Tensor Factorization (TF)… Information Architecture User Studies, Crowd Experiments, Human-Computer Interaction (HCI), Adaptive / Intelligent Systems, User Modeling, Information Science (IS) Persona Evaluation Factor Analysis, Structural Equation Modeling (SEM), Experiments, User Experience (UX), Usability and User Interface (UI) Design
  29. Data-driven personas have room for all lines of research •

    Algorithmically oriented people can solve algorithmic problems in generation, validation, updating, etc. • Qualitatively oriented people can contribute via user studies (e.g., observation, interviews, case studies) • Empirically oriented people can conduct experiments using real systems and controlled conditions • Theoretically oriented people can attempt to formulate theories of personas and persona-user interaction …join the family? ☺
  30. Dispelling the myth of algorithms for personas • ”APG is

    not ‘all powerful,’ meaning it does not apply to all cases. If the data is not in the right format, better to use some other technique for persona generation. If the data is too little, again better to use something else. If the stakeholders’ information needs don’t correspond with what APG can output, the same – another method is needed. • Lately, I’ve formulated a perspective that (a) ANY data can be used for persona creation (just the technique for doing so needs to vary), and (b) ANYONE can generate personas. • To explore this idea further, in my recent class, I had 27 students create personas from an Excel dataset. Despite most of them not even having heard of personas before the class, they did surprisingly well. • This supports my idea that persona generation is a very inclusive practice in which everyone can participate – it’s not exclusive for algorithms and data scientists. Even simple Excel analysis of tabular data can produce insights towards the creation of actionable personas. • ...so, to summarize: there is no one optimal way for data-driven personas. And data-driven personas do not necessarily require a high degree of technical sophistication. Genuine interest and common sense suffice.”
  31. The fundamental issue about the Algorithm and persona… • Is

    clustering or dimensionality reduction meaningful for user segmentation in the first place? • From a conceptual perspective, NO. • These algorithms were developed for statistical data compression, not for user segmentation. • Algorithms can easily miss important information, or focus on the most frequent user groups (actually, people can do this too; problem formulation & persona definition).
  32. Specific issues with algorithms 1. The objective is not explicitly

    user segmentation but data compression 2. (user segmentation = find groups that correspond to organizational goals; data compression = show groups that have high within-group and low inter-group similarity) 3. Cannot pinpoint exactly how specific information was chosen for the persona profile (explainability / transparency) 4. Require a high level of skill and sophistication to use (e.g., hyperparameter optimization) 5. Tend to emphasize data properties and ”fit” over subject-matter expertise
  33. Anyone can create data-driven personas! • …it is not only

    the job of algorithms • In fact, algorithms may do a *worse* job than humans • Humans have common sense, and we understand social meanings • Algorithms do not have common sense, and they have no clue about social meanings (Common sense can be both good and bad at times; our common sense can also bias us; but so can algorithms become biased)
  34. Let’s try it, then!

  35. Now, the workshop assignment… • Joni briefs the business scenario

    • Joni shares the data • Joni shows useful Excel functions (also work with Google Sheets) • …after that, there’s 2 hours of independent working, after which we go through the created personas (*free format*).
  36. Scenario You work as a marketing assistant for an imaginary

    startup company called “Fit4EveryDay”. The company offers personalized fitness routines for people of all ages. The mobile app offered by the company is free, and the company earns money by selling wellness products such as yoga mats, nutrients, and free weights via its app. Based on their spending in the app, each customer is categorized into “Low”, “Average”, or “High” spending class. The company has two goals: • Maximize the overall sales by focusing on the highest spending customers • Seek ways to activate the lowest spending customers Your task is to create personas that describe (a) highest spending and (b) lowest spending customers. Personas are fictitious people that describe distinct user or customer segments. These personas will then be used by stakeholders in your company to develop new features, services, and product offerings that address customer needs.
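
One simple way to start on this task, outside a spreadsheet, is to filter customers by spending class and take the most common value of each attribute as a persona skeleton. The sketch below assumes hypothetical column names (`age_group`, `gender`, `activity`, `spending`) and made-up rows; the actual workshop dataset may be structured differently. Only the “Low”, “Average”, and “High” class labels come from the scenario.

```python
from collections import Counter

# Hypothetical customer rows; the real workshop dataset may use other columns.
customers = [
    {"age_group": "25-34", "gender": "Female", "activity": "Yoga", "spending": "High"},
    {"age_group": "25-34", "gender": "Female", "activity": "Weights", "spending": "High"},
    {"age_group": "35-44", "gender": "Female", "activity": "Yoga", "spending": "High"},
    {"age_group": "18-24", "gender": "Male", "activity": "Running", "spending": "Low"},
    {"age_group": "18-24", "gender": "Male", "activity": "Running", "spending": "Low"},
    {"age_group": "45-54", "gender": "Female", "activity": "Running", "spending": "Low"},
    {"age_group": "35-44", "gender": "Male", "activity": "Yoga", "spending": "Average"},
]


def persona_skeleton(rows, spending_class, attributes):
    """Most common value of each attribute within one spending class."""
    segment = [row for row in rows if row["spending"] == spending_class]
    if not segment:
        raise ValueError(f"no customers in spending class {spending_class!r}")
    skeleton = {"spending": spending_class, "customers": len(segment)}
    for attribute in attributes:
        counts = Counter(row[attribute] for row in segment)
        skeleton[attribute] = counts.most_common(1)[0][0]
    return skeleton


attributes = ["age_group", "gender", "activity"]
high_spender = persona_skeleton(customers, "High", attributes)
low_spender = persona_skeleton(customers, "Low", attributes)
```

Taking the most common value of each attribute independently can produce a combination that no single customer has, so it is worth checking the skeleton against the rows before adding a name, picture, and quote.
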
  37. Tools • Data analysis (Google Sheets or Excel): https://docs.google.com/spreadsheets/u/0/ •

    Artificial pictures (This Person Does Not Exist): https://this-person-does-not-exist.com/en • Names (GAN2Name): https://acua.qcri.org/tool/GAN2Name • Artificial quotes (GPT-J): https://huggingface.co/EleutherAI/gpt-j-6B • Template and data: https://bit.ly/persona-template-hcii
  38. Persona information design https://bit.ly/persona-data-hcii (Joni shows the Excel)

  39. Thank you! Get the book from Amazon! (or your library)