

Using AI for User Representation: An Analysis of 83 Persona Prompts

We analyzed 83 persona prompts from 27 research articles that used large language models (LLMs) to generate user personas. Findings show that the prompts predominantly generate single personas. Several prompts explicitly request short or concise persona descriptions, which deviates from the tradition of creating rich, informative, and rounded persona profiles. Text is the most common format for generated persona attributes, followed by numbers. Text and numbers are often generated together, and demographic attributes are included in nearly all generated personas. Researchers use up to 12 prompts in a single study, though most studies use only a few. Comparing and testing multiple LLMs is rare. More than half of the prompts require the persona output in a structured format, such as JSON, and 74% of the prompts insert data or dynamic variables. We discuss the implications of the increased use of computational personas for user representation.


Danial Amin

October 18, 2025


Transcript

  1. Using AI for User Representation: An Analysis of 83 Persona Prompts

    Joni Salminen (a), Danial Amin (a), and Bernard Jansen (b). a: School of Marketing and Communication, University of Vaasa, Finland; b: Qatar Computing Research Institute, Hamad Bin Khalifa University, Qatar
  2. Table of contents

    01 Research Context, 02 Research Gap & RQs, 03 Methodology, 04 Key Findings, 05 Insights and Recommendations, 06 Limitations and Future Directions
  3. Persona in User Representation

    Fictitious user representations based on real data. Central to user-centered design (UCD) and HCI. Traditionally created through human analysis of user data. Persona Evolution: Manual Persona Development → Data-Driven Persona → LLM-Generated Personas
  4. LLM-Generated Personas

    LLM-generated personas (personas created by an LLM) are becoming more prominent, and LLMs enable near-infinite prompting strategies. Yet there is no systematic analysis of how researchers use LLMs for personas, and a lack of evidence-based guidelines for safe and productive use.
  5. Research Questions

    RQ1: Why do researchers use persona prompts? RQ2: How do researchers use persona prompts? RQ3: What kind of personas do researchers generate with persona prompts?
  6. Research Methodology

    Article Search: Searched major academic databases (ACM, IEEE, Web of Science, Scopus, and arXiv) following systematic literature review (SLR) guidelines. Selection: Identified 52 relevant articles focused on generative AI's role in persona development. Extraction: Extracted 83 usable persona prompts from 27 articles (52%), including prompt text and detailed usage context. Data Analysis: Collaborative coding by two researchers using an established coding framework.
  7. RQ1: Why Use Persona Prompts?

    Primary Uses: Persona Generation 66.7%, Prediction 21.2%, Evaluation 12.1%. Researchers primarily use persona prompts for generation, with emerging applications in prediction tasks and evaluation across diverse domains, from education to climate communication. Primary Domains: Education, Design, Marketing, Storytelling, Informatics, Health, Sustainability, Communication
  8. RQ2: How do researchers use persona prompts? (1/2)

    LLM Usage: GPT 76%, Others 18%, DALL-E 6%; Single LLM 78%, Multiple LLMs 22%. [Histogram: prompt length in words vs. count.] Researchers predominantly use GPT models (76.1%) with multi-prompt strategies (avg. 3.1 prompts per study, range 1-12), while cross-model testing remains rare (only 22% of studies) and prompt complexity varies widely (22-309 words). Prompt Complexity (words): Min: 22, Mdn: 67, Mean: 107, Max: 309
  9. RQ2: How do researchers use persona prompts? (2/2)

    Most researchers integrate dynamic data or variables into prompts (74.1%), require structured outputs like JSON (51.85%), and use complex prompt-orchestration techniques. 74% insert dynamic data or variables into prompts; 52% require a structured output format; 19% assign facilitator roles to the LLM; 27% disclosed hyperparameter values. A sketch combining these patterns follows below.
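As an illustration (not taken from any surveyed paper), here is a minimal Python sketch of a persona prompt that combines the most common patterns this slide reports: a facilitator role, dynamic data insertion, a required JSON output format, and an explicit length constraint. The variable names, schema keys, and prompt wording are all assumptions.

```python
import json

# Hypothetical user data inserted into the prompt at runtime; in the surveyed
# studies, such variables typically come from surveys, logs, or analytics.
user_data = {
    "segment": "first-year university students",
    "top_behaviors": ["late-night studying", "mobile-first browsing"],
}

# The prompt combines the reported patterns: facilitator role (19% of prompts),
# dynamic data insertion (74%), structured JSON output (52%), and an explicit
# length constraint (41%). All wording here is illustrative, not prescriptive.
prompt = f"""You are a UX researcher.
Based on the following user data, generate ONE persona.

User data: {json.dumps(user_data)}

Return only valid JSON with exactly these keys:
{{"name": "...", "age": 0, "occupation": "...", "goals": ["..."], "frustrations": ["..."]}}
Keep each field under 20 words."""

print(prompt)  # pass to the LLM API of your choice
```

Note that only 27% of the surveyed studies disclosed hyperparameter values; whichever client is used, temperature and related settings should be reported alongside the prompt.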
  10. RQ3: What kind of personas do researchers generate with persona prompts? (1/2)

    Researchers generate predominantly single, text-based personas with a strong demographic emphasis, combining text and numbers while prioritizing brevity through explicit length constraints and rarely including images. [Bar chart of content formats: Text+Numbers+Image, Text+Numbers, Text only, Numbers only.] Content: 64% specified the number of personas to generate. Constraints: 71% generated a single persona only; 41% included length constraints.
  11. RQ3: What kind of personas do researchers generate with persona prompts? (2/2)

    LLM-generated personas average 5.48 information attributes (SD=3.51), which is 38% fewer than previous-generation data-driven personas, with most falling in the "Simple" category (4-7 attributes) rather than information-rich profiles. [Bar charts: Information Richness, with bands Very Simple (0-3 subcategories), Simple (4-7 subcategories), Moderate (8-10 subcategories), High (11+ subcategories); Information Categories: Demographics, Contextual Information, Behaviors, Summary, Attitudes.]
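As a reading aid (not from the paper), a toy Python mapping from a persona's attribute count to the richness bands defined on this slide; the function name and code framing are our own:

```python
def richness_category(n_attributes: int) -> str:
    """Map a persona's attribute count to the slide's information-richness bands."""
    if n_attributes <= 3:
        return "Very Simple"  # 0-3 subcategories
    if n_attributes <= 7:
        return "Simple"       # 4-7 subcategories
    if n_attributes <= 10:
        return "Moderate"     # 8-10 subcategories
    return "High"             # 11+ subcategories

# The reported average of 5.48 attributes falls in the "Simple" band:
print(richness_category(round(5.48)))  # -> Simple
```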
  12. Key Implications and Insights

    Continuation of Traditional Persona: Demographics remain the dominant information category, appearing in the majority of prompt entries. Traditional persona attributes such as behaviors, attitudes, and contextual information continue to be preserved in LLM-generated personas. Deviations from Traditional Persona: LLM-generated personas emphasize brevity over rich narratives. Images are rarely generated for personas. The majority of prompts integrate data or dynamic variables directly within the prompt structure. Researchers show a preference for structured outputs like JSON rather than narrative formats. Emerging Concerns for LLM-Generated Personas: Single-persona generation dominates, which limits the representation of user population diversity. Complex prompt chains reduce transparency and make systematic evaluation increasingly difficult. Cross-model validation (i.e., testing/using multiple models) remains limited. Personas are increasingly treated as data objects rather than tools for building empathy with users.
  13. Recommendations

    Include primary user data in prompts (maintain the data-driven principle). Ground prompting strategies in persona theory. Evaluate both individual prompts and system-level effects. Generate diverse persona sets, not just single personas. Test multiple LLM models for comparison (a sketch of the last two points follows below).
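To make the last two recommendations concrete, a hypothetical Python sketch of generating a persona set across several models. MODELS, N_PERSONAS, and call_llm are placeholders, not names from the paper or any specific SDK.

```python
# Placeholder model identifiers; swap in the models you actually compare.
MODELS = ["model-a", "model-b", "model-c"]
N_PERSONAS = 5  # request a diverse set rather than a single persona

def call_llm(model: str, prompt: str) -> str:
    # Stub standing in for a real provider client; replace before use.
    return f"[JSON persona list from {model}]"

prompt = (
    f"Generate {N_PERSONAS} distinct personas as a JSON list, varying "
    "demographics, behaviors, and goals to cover diverse user groups."
)

# Run the identical prompt against every model so that outputs can be
# compared at both the individual-prompt and system level.
results = {model: call_llm(model, prompt) for model in MODELS}
for model, output in results.items():
    print(model, output)
```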
  14. Future Research

    1. Empirical studies on prompt-design effects on persona outputs; 2. Evaluation frameworks for multi-prompt systems; 3. Algorithmic fairness and bias guidelines; 4. Cross-model performance comparisons; 5. Human-AI collaboration best practices
  15. Main Takeaways

    LLMs are predominantly used for generation (81.5%), with emerging prediction use (25.9%). GPT models dominate (76.1%), but cross-model testing is rare. Multi-prompt strategies are common (avg. 3.1 prompts per study). Generated personas emphasize brevity and structure over rich narratives. Demographics remain central (77.8% of entries) in LLM-generated personas.