Subjective Operator Experience, and Task Performance in Teleoperation of a Social Robot (CHI ’24 Full Paper)
Nami Ogawa, Jun Baba, Junya Nakanishi
Paper: https://dl.acm.org/doi/full/10.1145/3613904.3642561
Press release: https://www.cyberagent.co.jp/news/detail/id=29980
Overview video: https://www.youtube.com/watch?v=IEAr3WpYNIU
Presentation video: https://www.youtube.com/watch?v=5so3PTDnWsk
Improving Teleoperation Experience of Social Robots
• Aim: to support teleoperators to ‘speak as the robot’
[Figure: Teleoperator and Social Robot, each saying "Hello. How can I help you?"]
AAF
• Aim: to support teleoperators to ‘speak as the robot’
• Idea: to use Altered Auditory Feedback (AAF) to transform acoustic traits of speech and feed it back to the speaker
• Hypothesis: AAF can transform self-representation towards ‘becoming the robot.’
[Figure: "Hello. How can I help you?" passed through "transform" and "transform & feedback"]
Overview | Background | Approach | Related Work | Experiment | Results | Discussion | Conclusion
Using a taller avatar leads to more assertive negotiation [2]
Remote robot customer service
• A job of behaving as a character
• VTuber, theme parks, …
• → Can self-representation be changed?
Interdisciplinarity and originality of the approach
Applying psychological findings to the practical scenario of "remote robot customer service"
[1]: Rosenberg, R. S., Baughman, S. L., & Bailenson, J. N. (2013). Virtual superheroes: Using superpowers in virtual reality to encourage prosocial behavior. PloS one, 8(1), e55003.
[2]: Yee, N., & Bailenson, J. (2007). The Proteus effect: The effect of transformed self-representation on behavior. Human communication research, 33(3), 271-290.
et al. “Voice in Human–Agent Interaction: A Survey,” ACM Computing Surveys (2021)
incl. acoustic features, style of speech, linguistic content
A mismatch between the robot’s voice and appearance reduces user acceptance [1].
Altered Auditory Feedback (AAF)
[Diagram labels: Operator, Robot, Customers, Voice Transformer, VT-AAF, Mic input]
• feeding the ‘robot-like’ transformed voice back to the operator
• to elicit the ability to speak as if they were the robot’s character
• requires skill and effort
• simple voice transformation often used in practice
• fully automatic, natural, and real-time speech conversion not yet perfect
the emotional tone (e.g., happiness, sadness, or fear) elicits the congruent emotional state during speech [3] and conversation [4]
• Manipulated change in voice attributed as one’s own
[Figure adapted from [3]]
[3] Aucouturier et al. “Covert digital manipulation of vocal emotion alter speakers’ emotional states in a congruent direction,” PNAS (2016)
[4] Costa et al. “Regulating Feelings During Interpersonal Conflicts by Changing Voice Self-perception,” (CHI ’18)
of the voice affects the speaker’s self-representation [5].
• Virtual reality studies:
  • Visual (adult or child avatar) and auditory (real or child-like transformed voice) congruence is important [6].
  • Change in self-representation affects one’s behavior, abilities, and thinking [7].
[Figures from [5] and [6]]
[5] Arakawa et al. “Digital Speech Makeup: Voice Conversion Based Altered Auditory Feedback for Transforming Self-Representation,” (ICMI ’21)
[6] Tajadura-Jiménez et al. “Embodiment in a Child-Like Talking Virtual Body Influences Object Size Perception, Self-Identification, and Subsequent Real Speaking,” Sci. Rep. (2017)
[7] Yee et al. “The Proteus Effect: The Effect of Transformed Self-Representation on Behavior,” Hum. Commun. Res. (2007)
transform the operator’s self-representation toward ‘becoming the robot’?
• [RQ2: Subjective Task Evaluation]
  • Does AAF make it subjectively easier for the operator to perform the service?
• [RQ3: Objective Task Performance]
  • Does AAF improve service performance objectively?
• in social robot teleoperation in a service context
• with AAF that transforms the operator’s voice to match the robot’s representation
entrance of a bakery
• two aspects: to speak a lot (Service) and to speak as the robot (Roleplay)
• Participants: N=30
  • Gender: 15 Female, 15 Male
  • Age: 38.00 ± 13.19 (SD), from 21 to 58 years old
[Figures: Operator’s equipment; Teleoperating interface on a web browser; Robot placed at a bakery]
(ROLAND VT-4) that can shift pitch and formants in real-time (end-to-end latency ~5 ms)
**: perceived gender and age from appearance are based on the ABOT Database [7]
[Diagram labels: No-VT, VT-only, VT-AAF; Real-time Altered Auditory Feedback (AAF); Operator, Robot, Customers, Voice Transformer*, Mic input]
• transform acoustic traits (i.e., pitch and formants) of the voice to match the appearance (i.e., gender and age) of the robot**
• transformed voice fed back to the participant in real-time
• participants’ speech is output from the robot as is
[7] Phillips et al. “What is Human-like?: Decomposing Robots' Human-like Appearance Using the Anthropomorphic roBOT (ABOT) Database,” (HRI ’18)
Change in Self-Representation: RQ1
• Task Evaluation: RQ2
• Voice Ownership & Agency: RQ1
• NASA-TLX (mental workload): RQ2
• General Preference: RQ2
• Objective
  • Implicit Association Test (IAT): RQ1
  • Audio & Video Recordings (Speech Analysis)
    • Vocal Production: RQ1
    • Amount of Conversation and Speech: RQ3

Questionnaire Items
Category | Scale | Item
Robot Embodiment | Ownership | I felt as if the robot's body was my body.
Robot Embodiment | Agency | The movements of the robot's body were caused by my speaking.
Change in Self-Representation | FeltChild | I felt like a child.
Change in Self-Representation | FeltRobot | I felt like a robot.
Change in Self-Representation | FeltExtraverted | I felt more extroverted than usual.
Change in Self-Representation | RobotExtraversion1 | I see the robot I played as extraverted, enthusiastic.
Change in Self-Representation | RobotExtraversion2 | I see the robot I played as reserved, quiet.
Task Evaluation | Enjoyment | I enjoyed serving the customers.
Task Evaluation | Motivation | I was motivated to serve customers.
Task Evaluation | RoleplayEase | To speak as Sota was easy.
Task Evaluation | RoleplayConfidence | I could speak as Sota with confidence.
Task Evaluation | RoleplaySatisfaction | I could speak as Sota satisfactorily.
Task Evaluation | ServiceEase | To speak a lot of variety and quantity was easy.
Task Evaluation | ServiceConfidence | I could speak a lot of variety and quantity with confidence.
Task Evaluation | ServiceSatisfaction | I could speak a lot of variety and quantity satisfactorily.
Voice Ownership and Agency | OwnVoice | I felt as if the voice I heard when I spoke was mine.
Voice Ownership and Agency | VoiceFeatures | I felt as if the voice I heard when I spoke resembled my (real) voice in terms of tone, pitch, or other acoustical features.
Voice Ownership and Agency | VoiceAgency | I felt as if I caused the voice I heard.
Measures used in [5]:
• Implicit association test (IAT)
• Questionnaire
  • voice ownership
Measures used in [6]:
• IAT
• Vocal Production
  • F0 (≒ pitch) of participants' speech
• Questionnaire
  • body ownership and agency
  • experience of being a child
  • voice ownership and agency
Arakawa et al. “Digital Speech Makeup: Voice Conversion Based Altered Auditory Feedback for Transforming Self-Representation,” (ICMI ’21)
[6] Tajadura-Jiménez et al. “Embodiment in a Child-Like Talking Virtual Body Influences Object Size Perception, Self-Identification, and Subsequent Real Speaking,” Sci. Rep. (2017)
analysis but not IAT
Measurements | Hypotheses (in VT-AAF compared to the others) | Results
Subj. Questionnaire: Robot Embodiment | strong embodiment over a robot | ✅ (partly) Ownership: No-VT, VT-only < VT-AAF; Agency: n.s.
Subj. Questionnaire: Change in Self-representation | change in self-representation towards a robot | ✅ No-VT, VT-only < VT-AAF
Subj. Questionnaire: Voice Ownership and Agency* | ownership and agency over AAF | ✅ (partly) Ownership: Low; Agency: Moderate
Obj.: Implicit Association Test (IAT) | strong self-association with a childlike robot | n.s.
Obj.: F0 (≒ pitch) of the participants' speech | shift toward F0 of the feedback | ✅ VT-only < No-VT, VT-AAF
*: asked only in the VT-AAF condition
influenced by voice conditions
Measurements | Hypotheses (in VT-AAF compared to the others) | Results
Amount of Conversation and Speech: Conversational Duration | duration increases | n.s.
Amount of Conversation and Speech: Amount of Speech | amount increases | n.s.
• 1. Total duration of conversation (w/ local users)
  • how much operator's speech attracted local users
• 2. Total word count of operator’s speech
  • how much an operator was motivated to speak
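The two objective measures above (total conversation duration and total operator word count) could be computed from an annotated transcript roughly as follows. This is only a minimal sketch under assumptions not stated in the slides: the `Utterance` record, the speaker labels, the per-conversation grouping, and whitespace-based word splitting are all hypothetical, and the study's actual annotation and tokenization procedure may differ.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # hypothetical labels: "operator" or "customer"
    start: float  # seconds from session start
    end: float    # seconds from session start
    text: str     # transcript, assumed to be split into words by spaces


def total_conversation_duration(conversations):
    """Sum over conversations of (latest end - earliest start), in seconds."""
    total = 0.0
    for utterances in conversations:
        if utterances:  # skip empty conversations
            first = min(u.start for u in utterances)
            last = max(u.end for u in utterances)
            total += last - first
    return total


def operator_word_count(conversations):
    """Count whitespace-separated words spoken by the operator only."""
    return sum(
        len(u.text.split())
        for utterances in conversations
        for u in utterances
        if u.speaker == "operator"
    )


# Illustrative session: two conversations with made-up timings.
session = [
    [
        Utterance("operator", 0.0, 2.0, "Hello. How can I help you?"),
        Utterance("customer", 2.5, 4.0, "Which bread is popular?"),
    ],
    [Utterance("operator", 60.0, 61.5, "Welcome to the bakery!")],
]

print(total_conversation_duration(session))
print(operator_word_count(session))
```

Grouping utterances by conversation keeps idle time between customers out of the duration measure; for a language written without spaces, a tokenizer would be needed in place of `str.split`.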
Contributions:
a social robot
• 2) Demonstration of aspects of AAF that benefit the operator through a field experiment
• 3) Design implications for teleoperation interface
[Figure: Teleoperator and Social Robot; "Hello. How can I help you?" with transform & feedback]
• 1) Application of AAF with practical scenario: teleoperation of a social robot
• 2) Demonstration of aspects of AAF that benefit the operator
  • subjective operator experience … ✅
  • task performance … n.s.
• 3) Design implications for teleoperation interface
[Figure: Teleoperator and Social Robot; "Hello. How can I help you?" with transform & feedback]
human-like social communication
Teleoperation: remote manipulation of a robot by a human operator
How can technology support teleoperators?
• can offer natural and compelling communication
• support for operators necessary for this to spread
• promising in service fields
  • airport, cafe, hotel, shopping mall, etc.
• fully-autonomous type yet to be perfected
Contributions:
a social robot
• 2) Demonstration of aspects of AAF that benefit the operator through a field experiment
• 3) Design implications for teleoperation interface
The culture of "Design Implications/Guidelines"
↑ An HCI culture?
(UIST 2007): Evaluating User Interface Systems Research (https://doi.org/10.1145/1294211.1294256)
• Saul Greenberg and Bill Buxton (2008): Usability evaluation considered harmful (some of the time) (https://doi.org/10.1145/1357054.1357074)
• David Ledo, Steven Houben, Jo Vermeulen, Nicolai Marquardt, Lora Oehlberg & Saul Greenberg (2018): Evaluation Strategies for HCI Toolkit Research (https://doi.org/10.1145/3173574.3173610)
• James Fogarty (2017): Code and Contribution in Interactive Systems Research (https://homes.cs.washington.edu/~jfogarty/publications/workshop-chi2017-codeandcontribution.pdf)