[class project'26] Augmenting Air-Music Playing into VR/MR

A fun class project for CS6682: Computation for Content Creation (taught by Prof. Abe Davis). It was a great way to foster my new interest in creative work! https://www.cs.cornell.edu/courses/cs6682/2026sp/

Haruka Kiyohara

May 05, 2026

Transcript

  1. CS6682: Final Project Demo

    Augmenting Air-Music Playing into VR/MR
    Haruka Kiyohara
    May 2026, CS6682 Final Project: Music Playing Augmentation
  2. Motivation

    Wishing for some magic to have fun with my air-music playing and dancing!
    [Figure: Luna]
  4. Why is it hard to do this in VR/MR?

    • VR/MR needs real-time interaction.
    • Unlike offline processing, we cannot use hindsight (noncausal) signals such as local minima, which can only be confirmed once later frames have arrived.
    • Many VR/MR apps use hand controllers.
    • Getting signals from camera images alone is not easy.
    • Interaction latency matters.

    Can we enjoy lightweight VR/MR animation without an expensive device? Laptop visual augmentation.

    figure: https://www.meta.com/quest/
  6. Overall plan for implementation

    • Step 0: Implementing assets, such as avatar visuals and the output screen.
    • Step 1: Detecting body (hand) movement from webcam images.
    • Step 2: Moving the avatar and visualizing effects corresponding to the pose.
    • Step 3: Running everything in a demo (and improving interaction latency).

    [Figure labels: Mr. chicken; frames; effects (e.g., rise, fadeout, zoom-in)]
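
The frame effects named on this slide (rise, fadeout, zoom-in) can each be described as a function of how many frames an effect has been alive. The following is a minimal sketch and not the project's actual code: the function name `effect_state`, the 15-frame lifetime, the 40-pixel rise, and the 1x to 2x zoom range are all assumptions chosen for illustration.

```python
def effect_state(kind, age, lifetime=15):
    """Return (dy, alpha, scale) for an effect that is `age` frames old.

    dy    : vertical pixel offset (negative means upward in image coordinates)
    alpha : opacity in [0, 1]
    scale : size multiplier
    All constants here are illustrative assumptions, not values from the slides.
    """
    if lifetime <= 0:
        raise ValueError("lifetime must be positive")
    # Normalized progress in [0, 1]; effects older than `lifetime` stay at 1.
    progress = min(max(age / lifetime, 0.0), 1.0)
    if kind == "rise":
        return (-40.0 * progress, 1.0, 1.0)   # drift upward by up to 40 px
    if kind == "fadeout":
        return (0.0, 1.0 - progress, 1.0)     # lose opacity linearly
    if kind == "zoomin":
        return (0.0, 1.0, 1.0 + progress)     # grow from 1x to 2x
    raise ValueError(f"unknown effect kind: {kind}")
```

A renderer would call this once per frame for each live effect and drop the effect once `age` reaches `lifetime`.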
  7. Step 1: Pose tracking by MediaPipe

    MediaPipe provides a pre-trained ML model for real-time pose landmark tracking.

    figures: https://ai.google.dev/edge/mediapipe/solutions/vision/pose_landmarker
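
As a rough illustration of Step 1, the sketch below sets up MediaPipe's Pose Landmarker in video mode and converts the elbow and wrist landmarks to pixel coordinates. It is an assumption about how such a setup could look, not the code used in the project: the model file name `pose_landmarker.task` and the helper names `make_landmarker`, `detect`, and `arm_points` are hypothetical. MediaPipe is imported lazily so that the pure-Python helper can be tried without it.

```python
# Landmark indices defined by the MediaPipe pose landmark model.
LEFT_ELBOW, RIGHT_ELBOW, LEFT_WRIST, RIGHT_WRIST = 13, 14, 15, 16


def make_landmarker(model_path="pose_landmarker.task"):
    """Create a video-mode PoseLandmarker (model_path is an assumed local file)."""
    import mediapipe as mp

    options = mp.tasks.vision.PoseLandmarkerOptions(
        base_options=mp.tasks.BaseOptions(model_asset_path=model_path),
        running_mode=mp.tasks.vision.RunningMode.VIDEO,
    )
    return mp.tasks.vision.PoseLandmarker.create_from_options(options)


def detect(landmarker, rgb_frame, timestamp_ms):
    """Run detection on one RGB frame; timestamps must increase between calls."""
    import mediapipe as mp

    image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_frame)
    return landmarker.detect_for_video(image, timestamp_ms)


def arm_points(result, width, height):
    """Convert the first person's normalized elbow/wrist landmarks to pixels."""
    if not result.pose_landmarks:
        return None  # nobody was detected in this frame
    landmarks = result.pose_landmarks[0]

    def to_pixels(point):
        return (point.x * width, point.y * height)

    return {
        "left_elbow": to_pixels(landmarks[LEFT_ELBOW]),
        "right_elbow": to_pixels(landmarks[RIGHT_ELBOW]),
        "left_wrist": to_pixels(landmarks[LEFT_WRIST]),
        "right_wrist": to_pixels(landmarks[RIGHT_WRIST]),
    }
```

Landmark coordinates are normalized to the image size, which is why `arm_points` takes the frame width and height.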
  8. Step 1: Pose tracking by MediaPipe

    The results of pose landmark tracking are as follows:
    [Figure: pose landmark tracking results]
    Pre-trained models are great!
  9. Step 2: Transferring the pose landmarks to avatar movement

    1. Align the avatar's wing angle to the hand-elbow angle.
    2. Create a new effect at the hand position (in the webcam image) when the arm velocity rapidly changes to a different direction, i.e., when:
       ① the previous velocity was larger than the threshold,
       ② the current hand position is within the frame, and
       ③ the velocity direction changes rapidly (i.e., the cosine similarity between Velocity(t - 5) and Velocity(t) is smaller than the threshold).

    Velocity(t) = Position(t) - Position(t - 3)
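
The three trigger conditions above can be sketched as a small causal detector that only looks at past frames. This is an illustrative reading of the slide, not the author's implementation: the class name `HitDetector`, both threshold values, the use of normalized (0 to 1) coordinates, and the choice not to fire when the current velocity is exactly zero are assumptions.

```python
import math
from collections import deque

LAG = 3  # Velocity(t) = Position(t) - Position(t - 3), as on the slide
GAP = 5  # the previous velocity is Velocity(t - 5), as on the slide


class HitDetector:
    """Fires when a fast-moving hand abruptly changes direction inside the frame."""

    def __init__(self, speed_threshold=0.05, cos_threshold=0.0):
        # Both thresholds are assumed values for illustration.
        self.speed_threshold = speed_threshold
        self.cos_threshold = cos_threshold
        self.positions = deque(maxlen=LAG + GAP + 1)  # Position(t - 8) .. Position(t)

    def update(self, pos):
        """Feed one hand position (normalized x, y); return True to fire an effect."""
        self.positions.append(pos)
        if len(self.positions) < LAG + GAP + 1:
            return False  # not enough history yet
        p = self.positions
        v_now = (p[-1][0] - p[-1 - LAG][0], p[-1][1] - p[-1 - LAG][1])
        v_prev = (p[-1 - GAP][0] - p[0][0], p[-1 - GAP][1] - p[0][1])
        speed_now = math.hypot(*v_now)
        speed_prev = math.hypot(*v_prev)

        # (1) the previous velocity was larger than the threshold
        if speed_prev <= self.speed_threshold:
            return False
        # (2) the current hand position is within the frame
        if not (0.0 <= pos[0] <= 1.0 and 0.0 <= pos[1] <= 1.0):
            return False
        # (3) the direction changed rapidly: low cosine similarity
        if speed_now == 0.0:
            return False  # direction undefined; assumed not to fire
        dot = v_now[0] * v_prev[0] + v_now[1] * v_prev[1]
        return dot / (speed_now * speed_prev) < self.cos_threshold
```

Because the detector compares velocities five frames apart, a single direction change can fire on several consecutive frames, so a real demo would likely need some form of debouncing.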
  10. Step 2: Transferring the pose landmarks to avatar movement

    The results of pose transfer are as follows:
    [Figure: pose transfer results]
    It's a bit noisy, but let's go with this (there is room for improvement).
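
For the wing-alignment part of Step 2, the hand-elbow angle can be computed from two landmark positions with `atan2`. The sketch below is one possible formulation under stated assumptions: the function names `arm_angle_deg` and `wing_tip` are hypothetical, and angles are measured counterclockwise from the positive x-axis with the image y-axis flipped so that "up" is positive.

```python
import math


def arm_angle_deg(elbow, wrist):
    """Angle of the elbow-to-wrist vector in degrees (counterclockwise, y up)."""
    dx = wrist[0] - elbow[0]
    dy = elbow[1] - wrist[1]  # image y grows downward, so flip the sign
    return math.degrees(math.atan2(dy, dx))


def wing_tip(root, length, angle_deg):
    """Pixel position of a wing tip, given the wing root, length, and angle."""
    angle = math.radians(angle_deg)
    return (root[0] + length * math.cos(angle),
            root[1] - length * math.sin(angle))  # flip back to image coordinates
```

For example, a wrist directly above the elbow gives 90 degrees, and a wing of length 50 rooted at (100, 100) then points straight up to roughly (100, 50).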
  12. Step 3-1: Putting it all together (offline version)

    ① Recording webcam inputs
    ② Processing frames offline
    ③ Streaming frames with effects (effect = hand tracking mode)
  15. Step 3-2: Latency issue of online interaction

    • Naive implementation
      [Diagram: webcam → detection → animation → rendering, repeated sequentially]
      4 fps: terribly slow!
    • Parallel-computation-enabled implementation
      [Diagram: Thread 1, Thread 2, and Thread 3 each run webcam → detection → animation, followed by rendering]
      10 fps, 12 fps, 13 fps: much faster!
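
The parallel layout in the diagram can be sketched with Python's standard `threading` and `queue` modules. This is a toy model under stated assumptions rather than the project's code: `time.sleep` stands in for pose detection and animation, frame IDs stand in for webcam images, and the function name `run_pipeline` is hypothetical. Whether real threads give a similar speedup depends on the detection and rendering calls releasing Python's global interpreter lock.

```python
import queue
import threading
import time


def run_pipeline(num_workers, num_frames, detect_seconds=0.02):
    """Process frames with several worker threads; return (shown_ids, fps)."""
    frame_ids = iter(range(num_frames))
    grab_lock = threading.Lock()  # a single camera is shared by all workers
    results = queue.Queue()

    def worker():
        while True:
            with grab_lock:
                frame_id = next(frame_ids, None)  # stand-in for reading the webcam
            if frame_id is None:
                return  # no frames left
            time.sleep(detect_seconds)  # stand-in for detection + animation
            results.put(frame_id)

    start = time.perf_counter()
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()

    shown = []
    for _ in range(num_frames):
        frame_id = results.get()  # blocks until some worker finishes a frame
        # Skip frames that finished out of order so the display never goes back.
        if not shown or frame_id > shown[-1]:
            shown.append(frame_id)  # stand-in for rendering

    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - start
    return shown, (num_frames / elapsed if elapsed > 0 else float("inf"))
```

Calling `run_pipeline(1, 30)` and `run_pipeline(3, 30)` and comparing the returned frame rates gives a feel for the effect of overlapping the slow detection step across threads.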
  16. Summary

    • I implemented “laptop VR”, which generates (cheap) avatar movement and effects corresponding to human movement, such as hand waving or drumming.
    • There were two challenges in the implementation:
      • Transferring pose landmark position changes to avatar movement and effects.
      • Enabling smooth real-time interaction without much latency.
    • I could barely make the demo playable (w.r.t. accuracy and latency)! This is great, but there is still a lot of room for improvement.
    • (It was fun to explore generative AI tools to prepare assets for the project, too!)
  17. Special cast

    • Luna (character designed and generated using an AI anime creator)

    All figures of “Luna” are under my own credit.
  18. Songs

    • @Sunosuno07 (all songs are generated using an AI music composer)

    In the demo, the songs were randomly sampled from:
    • Fly!
    • Happy birthday!
    • Cotton Candy
    • At the top of the snow mountain
    • Start of the day
    • To the end of eclipse
    • Xin-nien Kuai-le (新年快乐)
    • Ma-tsu-ri (お祭りだぁーー!)

    “Fly!” is under Suno’s credit. All other songs are under my own credit.
  19. Avatar

    • Mrs. chicken (visualized with a Python image processing library, cv2)
      • face: image from the いらすとや repository
      • feet: cv2.ellipse
      • wing: cv2.ellipse
      • body: cv2.polylines, cv2.fillPoly
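
To show how the listed drawing calls fit together, here is a rough sketch that draws a body, a wing, and two feet on a blank canvas. It is not the project's avatar: the trapezoid body shape, all sizes and colors, and the names `body_polygon` and `draw_chicken` are assumptions, and the face image is omitted. `cv2` and `numpy` are imported lazily inside the drawing function.

```python
def body_polygon(cx, cy, w, h):
    """Corner points of a trapezoid body centered at (cx, cy) (assumed shape)."""
    return [
        (cx - w // 4, cy - h // 2),
        (cx + w // 4, cy - h // 2),
        (cx + w // 2, cy + h // 2),
        (cx - w // 2, cy + h // 2),
    ]


def draw_chicken(size=400, wing_angle_deg=30):
    """Return a BGR image with a simple body, wing, and feet (illustrative only)."""
    import cv2
    import numpy as np

    img = np.full((size, size, 3), 255, dtype=np.uint8)  # white canvas
    c = size // 2

    # Body: filled polygon plus a black outline.
    pts = np.array(body_polygon(c, c, size // 2, size // 2), dtype=np.int32)
    cv2.fillPoly(img, [pts], (0, 220, 255))
    cv2.polylines(img, [pts], True, (0, 0, 0), 2)

    # Wing: filled ellipse rotated by the wing angle (thickness -1 means filled).
    cv2.ellipse(img, (c + size // 4, c), (size // 6, size // 16),
                wing_angle_deg, 0, 360, (0, 180, 230), -1)

    # Feet: two small filled ellipses under the body.
    for dx in (-(size // 10), size // 10):
        cv2.ellipse(img, (c + dx, c + size // 4 + 10), (size // 16, size // 40),
                    0, 0, 360, (0, 140, 255), -1)
    return img
```

Redrawing the wing ellipse each frame with a new `wing_angle_deg` is one way the avatar could follow the tracked arm.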
  20. Visual and sound effects

    • Sound effects: CC0 sounds from BigSoundBank
    • Visual effects: base images from the いらすとや repository (which allows free use of up to 20 distinct images per piece of content; いらすとや is pronounced i-ra-su-to-ya)
  21. Thank you for listening!

    Augmenting Air-Music Playing into VR/MR – Fin.