
HCI on Music AI

Taein Kim
November 23, 2022


This is a summary presentation of the paper "Novice-AI Music Co-Creation via AI-Steering Tools for Deep Generative Models," written by Ryan Louie, Andy Coenen, Cheng Zhi Huang, Michael Terry, and Carrie J. Cai and presented at CHI 2020.



Transcript

  1. HCI on Music AI (Nov 23, 2022). Taein Kim ([email protected]), Department of Electronic Engineering, Inha University, South Korea

    Novice-AI Music Co-Creation via AI-Steering Tools for Deep Generative Models. Ryan Louie, Andy Coenen, Cheng Zhi Huang, Michael Terry, Carrie J. Cai (CHI '20)
  4. Introduction: Motivation

    Despite substantial research on improving algorithmic performance, it is unclear what interactive tooling people, especially novices, actually need when co-creating with generative models.

    [Figure labels: Human (novice); Bach Doodle (https://www.google.com/doodles/celebrating-johann-sebastian-bach); Music Transformer (https://magenta.tensorflow.org/music-transformer)]
  5. Introduction: Needfinding study

    Information overload: generating all voices and time steps at once is too much when corrections are needed.

    [Figure labels: User's input melody; Harmonizing all at once]
  6. Cococo

    Cococo augments conventional generative music interfaces with a set of AI steering tools:
    • Voice lanes
    • Semantic sliders
    • Example-based slider
    • Multiple choices with preview playback

    https://pair-code.github.io/cococo/
  7. Cococo: A technique for AI-Steering Tools

    [Figure labels: Infill mask; Desired area] "Give me music similar to this part"
  8. Cococo: A technique for AI-Steering Tools

    [Figure labels: Infill mask; Desired area; higher probability; lower probability] "Give me music similar to this part"

    $p_{\text{adjusted}}(x_{v,t} \mid x_C) \propto p_{\text{coconet}}(x_{v,t} \mid x_C) \, p_{\text{softprior}}(x_{v,t})$

    (v: voice; t: time step; C: the set of (v, t) positions making up the context)

    Soft prior: a distribution whose precision and variance are both non-zero and finite
  9. Cococo: A technique for AI-Steering Tools

    [Figure labels: Infill mask; Desired area] "Give me music similar to this part"

    $p_{\text{adjusted}}(x_{v,t} \mid x_C) \propto p_{\text{coconet}}(x_{v,t} \mid x_C) \, p_{\text{softprior}}(x_{v,t})$

    (v: voice; t: time step; C: the set of (v, t) positions making up the context)

    Soft prior: a distribution whose precision and variance are both non-zero and finite

    Why use a soft prior?
    • Encouraged notes get a higher probability
    • Discouraged notes get a lower, but non-zero, probability
    • Notes that were very probable in the model's original sampling distribution can still be likely after the prior is incorporated
    • Even though the priors are specified for particular voices and time steps, their effects can propagate to other parts of the piece
    • Smooth transitions can be generated between regions with and without soft priors
    • These features help the AI maintain the overall direction of the piece
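The proportionality on the slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cococo's actual code: the function name `apply_soft_prior`, the four-pitch distribution, and the prior weights are all hypothetical. It shows a per-position product of the model's distribution with a strictly positive prior, followed by renormalization.

```python
def apply_soft_prior(p_model, soft_prior):
    """Return p_adjusted, proportional to p_model * soft_prior, for one (voice, time step).

    Hypothetical helper: both arguments are lists of per-pitch values.
    """
    if len(p_model) != len(soft_prior):
        raise ValueError("distributions must cover the same pitches")
    # A soft prior never assigns zero weight, so no note is ruled out entirely.
    if any(weight <= 0 for weight in soft_prior):
        raise ValueError("a soft prior must be strictly positive")
    unnormalized = [p * w for p, w in zip(p_model, soft_prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical model distribution over four pitches at one position.
p_coconet = [0.70, 0.20, 0.05, 0.05]
# Hypothetical prior: encourage pitch 1, discourage (but do not forbid) the rest.
soft_prior = [0.5, 2.0, 0.5, 0.5]

p_adjusted = apply_soft_prior(p_coconet, soft_prior)
```

In this toy case the encouraged pitch gains probability, the discouraged pitches keep a non-zero share, and the pitch the model originally favored stays likely, which matches the first three bullets on the slide.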
  10. Cococo: A technique for AI-Steering Tools

    [Figure labels: Infill mask; Desired area] "Give me music similar to this part"

    Soft prior: a distribution whose precision and variance are both non-zero and finite

    • Similar/Different: creates a soft prior with higher/lower probabilities for the notes in the example
    • Major/Minor: constructs a soft prior by asking which major/minor triad is most likely at each time slice within the model's sampling distribution (e.g., a C major triad includes all the Cs, Es, and Gs in all octaves)
    • Conventional/Surprising: makes the sampling distribution more/less "peaky", so that sampled notes tend to be those that had higher/lower probabilities in the original distribution
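One way to picture the three sliders is as small functions that either build a soft prior or reshape a distribution. The sketch below is an assumption-laden illustration rather than Cococo's implementation: every function name is hypothetical, pitches are assumed to be integer indices where `pitch % 12 == 0` is a C, and temperature scaling is used as one common way to make a distribution more or less "peaky".

```python
def example_prior(num_pitches, example_pitches, strength):
    """Similar/Different: strength > 1 favors the example's notes, < 1 disfavors them."""
    if strength <= 0:
        raise ValueError("strength must be positive so no note is forbidden")
    return [strength if pitch in example_pitches else 1.0
            for pitch in range(num_pitches)]


def triad_pitch_classes(root, quality):
    """Pitch classes of a major or minor triad built on `root`."""
    if quality not in ("major", "minor"):
        raise ValueError("quality must be 'major' or 'minor'")
    third = 4 if quality == "major" else 3
    return {root % 12, (root + third) % 12, (root + 7) % 12}


def most_likely_triad_root(p, quality):
    """Major/Minor: pick the root whose triad holds the most probability mass."""
    def mass(root):
        classes = triad_pitch_classes(root, quality)
        return sum(q for pitch, q in enumerate(p) if pitch % 12 in classes)
    return max(range(12), key=mass)


def triad_prior(num_pitches, root, quality, boost):
    """Boost every pitch in the triad, in all octaves; leave the rest at 1."""
    classes = triad_pitch_classes(root, quality)
    return [boost if pitch % 12 in classes else 1.0
            for pitch in range(num_pitches)]


def reshape_peakiness(p, temperature):
    """Conventional/Surprising: temperature < 1 sharpens, > 1 flattens."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    powered = [q ** (1.0 / temperature) for q in p]
    total = sum(powered)
    return [q / total for q in powered]


# Hypothetical two-octave distribution with most mass on C, E, and G.
p = [0.01] * 24
p[0], p[4], p[7] = 0.30, 0.25, 0.24

root = most_likely_triad_root(p, "major")
major = triad_prior(len(p), root, "major", boost=2.0)
conventional = reshape_peakiness(p, temperature=0.5)
surprising = reshape_peakiness(p, temperature=2.0)
similar = example_prior(len(p), {0, 2}, strength=3.0)
```

The priors returned by `example_prior` and `triad_prior` are meant to be multiplied into the model's distribution and renormalized, while `reshape_peakiness` acts on the distribution directly.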
  11. Experiment: User study

    Conventional interface vs. Cococo

    21 participants, all music composition novices
    • 12 female, 9 male
    • Ages 20 to 52 (mean = 31)
    • Task: compose music that reflects the character and mood of the provided card images
    • RQ1: How do the AI-steering tools affect user perceptions of the creative process and the creative artifacts made with the AI?
    • RQ2: How do music novices apply the AI-steering tools in their creative process? What patterns of use and strategies arise?
  12. Experiment: Quantitative findings

    Results from the post-study survey comparing the conventional interface and Cococo, with standard error bars (higher means stronger agreement; 4 means neutral).

    Q: "I felt I was able to express my creative goals in the composition"
    Q: "I felt like I was collaborating with the system"

    Participants' descriptions of the AI under the two interfaces: … could be adjusted to do what I would like; … has a clear understanding of who is in control; brilliant composer to outsource work to; black box; take-it or leave-it vs. "Machine is doing all the work"; "I feel more useful as a composer"
  13. Experiment: Qualitative findings

    Common patterns of using Voice Lanes, visualized using interaction data from 4 archetypal participants (darker-colored segments were performed by users before lighter-colored segments):
    (A) Voice-by-Voice (most common)
    (B) Temporal Chunks
    (C) Combination of Voice-by-Voice and Temporal Chunks
    (D) Ad-hoc Bits

    "intervene after [the AI] generated [content]... Stop it in the middle... And change it to feel different, before it kept going" (P14)

    "The Multiple Alternatives functionality naturally lent itself to this 'generate and audition' strategy of music composition."
  14. Experiment: Qualitative findings

    Working bit-by-bit helped users "debug" the music

    Cococo: "brick-building" helped them identify glitches later on
    "Aha! The dissonance is coming from the alto voice in measure 1!"

    Conventional: less familiar with the piece when "sculpting"
    "Hard to disentangle what causes what"
  15. Experiment: Qualitative findings

    Users re-purposed the tools hoping to overcome AI limits

    Go-to primitives for expressing musical goals:
    • Pitch
    • Note length
    • Rising/falling shape
    • Silences
  16. Discussion

    Onboarding and increasing AI transparency: Novices were sometimes hesitant to make local edits for fear of adversely affecting the AI's global optimization → add cues that guide them appropriately

    Bridging user strategies with the AI: Novices arrived ready to interact with Cococo and to form their own strategies → adding more intuitive sliders could help them make more varied music

    Effective co-creation with semantically meaningful tools: While sophisticated generative DNNs can create a full artifact, their capabilities may need to be partitioned into smaller, semantically meaningful tools to promote effective co-creation → soft priors can help with this

    Defining the human-AI partnership: To make people perceive the AI as a responsive collaborator, interfaces should empower them to define the creative objective intuitively
  17. Conclusion

    • Cococo augments conventional generative music interfaces with a set of AI steering tools
    • In evaluation, the tools improved:
        • composing experience
        • attitude towards the AI
        • ownership of the music
    • Unexpectedly, the tools also helped users with:
        • building bit-by-bit
        • "debugging" the music
        • overcoming AI limits