Skeltrack: Open Source Skeleton Tracking

Slides for the presentation given at Semana da Ciência e Tecnologia da Universidade de Évora, April 17th 2012.

Joaquim Rocha

April 18, 2012

Transcript

  1. Skeltrack - Open Source Skeleton Tracking. Joaquim Rocha, Igalia. Semana da Ciência e Tecnologia, Univ. de Évora, April 2012
  2. The Kinect

  3. Microsoft's Kinect was the first depth camera with a price affordable to the public
  4. The USB connection is open and thus hackable

  5. This gave rise to Open Source projects like libfreenect, a library to control the Kinect device and read its data
  6. We created a GLib wrapper for libfreenect called GFreenect

  7. GFreenect offers asynchronous functions (and some synchronous ones as well) and is easy to use with other GNOME technologies
  8. GObject Introspection = free bindings (Python, Javascript, Vala)

  9. Kinect has a structured-light depth camera which gives depth information

  10. But that's raw information... values from 0 to 2047

  11. libfreenect/GFreenect can now give those values in mm
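
To illustrate what turning raw values into millimetres involves, here is a minimal sketch. It assumes 11-bit raw values where 2047 means "no reading", and it uses a community-derived approximation whose constants are an assumption here, not the conversion libfreenect itself performs. The name raw_depth_to_mm is hypothetical.

```c
#include <assert.h>

#define RAW_NO_READING 2047  /* raw value reported when there is no depth reading */

/* Converts one raw 11-bit depth value to millimetres, or returns 0 when
 * the value is unusable. The constants are a community-derived
 * approximation (an assumption, not libfreenect's own conversion). */
static int raw_depth_to_mm(int raw)
{
    assert(raw >= 0 && raw <= RAW_NO_READING);

    double denom = raw * -0.0030711016 + 3.3309495161;

    /* The approximation breaks down once the denominator reaches zero. */
    if (raw == RAW_NO_READING || denom <= 0.0)
        return 0;

    return (int) (1000.0 / denom + 0.5);  /* metres to mm, rounded */
}
```

In practice it is simpler to ask the library for millimetres directly, as the slide says; the sketch only shows why the raw numbers are not usable as distances on their own.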

  12. None
  13. Still...

  14. It does NOT tell you there is a person in the picture
  15. Or a monkey

  16. Or a cow

  17. Let alone a skeleton and where its joints are

  18. For this you need a skeleton tracking solution

  19. Three proprietary/closed solutions exist:

  20. Microsoft Kinect SDK: non-commercial only

  21. OpenNI: compatible with commercial use

  22. Kinect for Windows: commercial use allowed, but incompatible with the Xbox Kinect
  23. None
  24. Conclusion: There were no Free solutions to perform skeleton tracking... :(
  25. So Igalia built one!

  26. Enter Skeltrack

  27. What we wanted:
      ✩ A shared library, no fancy SDK
      ✩ Device independent
      ✩ No pattern matching, no databases
      ✩ Easy to use (everybody wants that!)
  28. Not as easy as it sounds!

  29. After some investigation we found Andreas Baak's paper "A Data-Driven Approach for Real-Time Full Body Pose Reconstruction from a Depth Camera"
  30. However, this paper uses a database of poses to work out what the user is doing
  31. So we based our work on it only up to the part that finds the extrema
  32. How does it work?

  33. First we need to find the extrema

  34. Make a graph whose nodes are the depth pixels

  35. Connect two nodes if the distance between them is less than a threshold
  36. Use connected-component labeling to find the graph's separate components, then connect them

  37. Choose a starting point and run Dijkstra's algorithm to every point of the graph, then choose the furthest point: there you've got your extremum!
  38. Then create an edge with cost 0 between the starting point and the current extremum, and repeat the same process using the current extremum as the starting point.
  39. This comes from Baak's paper. The difference starts here: choosing the starting point
  40. Baak chooses a centroid as the starting point. We choose the bottom-most point starting from the centroid (this showed better results for the upper body extrema)
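
The loop described in slides 34 to 38 can be sketched as below. This is a minimal illustration under stated assumptions, not Skeltrack's actual code: a tiny 5x5 depth grid, 4-neighbour links with unit cost, an assumed 50 mm depth threshold, a plain O(n²) Dijkstra, and hypothetical names throughout. The step that joins separate components is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define GRID_W 5
#define GRID_H 5
#define GRID_N (GRID_W * GRID_H)
#define DEPTH_THRESHOLD 50  /* assumed: max depth gap (mm) between linked pixels */
#define MAX_SHORTCUTS 8

/* Two neighbouring pixels are linked when both have a reading (non-zero)
 * and their depths differ by less than the threshold. */
static int linked(const int *depth, int a, int b)
{
    return depth[a] != 0 && depth[b] != 0 &&
           abs(depth[a] - depth[b]) < DEPTH_THRESHOLD;
}

/* Plain O(n^2) Dijkstra over the pixel grid. Every link costs 1 and every
 * shortcut (sa[k], sb[k]) costs 0. Unreachable pixels keep INT_MAX. */
static void dijkstra(const int *depth, int start,
                     const int *sa, const int *sb, int n_short, int *dist)
{
    int done[GRID_N] = {0};
    for (int i = 0; i < GRID_N; i++) dist[i] = INT_MAX;
    dist[start] = 0;
    for (;;) {
        int u = -1;  /* closest pixel not yet settled */
        for (int i = 0; i < GRID_N; i++)
            if (!done[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0) break;
        done[u] = 1;
        int x = u % GRID_W, y = u / GRID_W;
        int nb[4] = { x > 0 ? u - 1 : -1, x < GRID_W - 1 ? u + 1 : -1,
                      y > 0 ? u - GRID_W : -1, y < GRID_H - 1 ? u + GRID_W : -1 };
        for (int k = 0; k < 4; k++)
            if (nb[k] >= 0 && linked(depth, u, nb[k]) && dist[u] + 1 < dist[nb[k]])
                dist[nb[k]] = dist[u] + 1;
        for (int k = 0; k < n_short; k++) {
            int v = sa[k] == u ? sb[k] : sb[k] == u ? sa[k] : -1;
            if (v >= 0 && dist[u] < dist[v]) dist[v] = dist[u];
        }
    }
}

/* Finds `count` extrema and writes their pixel indices to `out`. */
static void find_extrema(const int *depth, int start, int count, int *out)
{
    int sa[MAX_SHORTCUTS], sb[MAX_SHORTCUTS], dist[GRID_N];
    assert(count <= MAX_SHORTCUTS && depth[start] != 0);
    for (int e = 0; e < count; e++) {
        dijkstra(depth, start, sa, sb, e, dist);
        int far = start;
        for (int i = 0; i < GRID_N; i++)  /* furthest reachable pixel */
            if (dist[i] != INT_MAX && dist[i] > dist[far]) far = i;
        out[e] = far;
        sa[e] = start;  /* zero-cost edge: starting point <-> extremum */
        sb[e] = far;
        start = far;    /* the next round starts from this extremum */
    }
}
```

The zero-cost edges are what keep the search from returning the same limb twice: once an extremum is tied to the starting point at no cost, everything near it becomes "close", so the next furthest point has to lie on a different branch of the body.
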
  41. So we got ourselves some extrema! What to do with them?
  42. Which extremum is a hand, a head, a shoulder?

  43. For that we use educated guesses...

  44. We calculate 3 extrema

  45. Then we check each one, hoping it is the head

  46. How?

  47. For each extremum we look for points in the places where the shoulders should be, checking their distances to the extremum and to each other.
  48. If they obey those rules then we assume they are the head'n'shoulders (tm)
  49. With the remaining 2 extrema, we will try to see if they are elbows or hands
  50. How to do it?

  51. Run Dijkstra from the shoulders to each extremum

  52. The extremum closest to either shoulder is either a hand or an elbow of that shoulder
  53. How to check if it's a hand or elbow?

  54. If the distance between the extremum and the shoulder is less than a predefined value, then it is an elbow. Otherwise it is a hand.
  55. If it is a hand, we find the elbow by choosing the point in the middle of the path we created with Dijkstra before
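
The two rules in slides 54 and 55 are simple enough to sketch directly. The threshold value and all names below are assumptions made for illustration; the slides only say "a predefined value".

```c
#include <assert.h>
#include <stddef.h>

#define ELBOW_MAX_DISTANCE_MM 350  /* assumed threshold, not taken from Skeltrack */

typedef enum { ARM_JOINT_ELBOW, ARM_JOINT_HAND } ArmJoint;

/* Slide 54: an extremum close to the shoulder is taken to be an elbow,
 * anything further away is taken to be a hand. */
static ArmJoint classify_extremum(int distance_to_shoulder_mm)
{
    return distance_to_shoulder_mm < ELBOW_MAX_DISTANCE_MM ? ARM_JOINT_ELBOW
                                                           : ARM_JOINT_HAND;
}

/* Slide 55: given the shoulder-to-hand path as pixel indices
 * (path[0] is the shoulder, path[length - 1] the hand), the elbow is
 * estimated as the node in the middle of that path. */
static int elbow_from_path(const int *path, size_t length)
{
    assert(length > 0);
    return path[length / 2];
}
```

Taking the middle of the path works because the path follows the arm's surface, so its midpoint lands roughly where the arm bends, whether or not the arm is straight.
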
  56. None
  57. There are still some things missing...

  58. Future work

  59. Hands from elbows: If one of the extrema is an elbow, we need to infer where the hand is
  60. Smoothing: Smooth the jittering of the joints

  61. Robustness: Use restrictions to ignore objects that are not the user
  62. And of course, get the rest of the joints: hips, knees, etc.
  63. How to use it?

  64. SkeltrackSkeleton *skeleton = SKELTRACK_SKELETON (skeltrack_skeleton_new ());
      skeltrack_skeleton_track_joints (skeleton, depth_buffer, buffer_width, buffer_height,
                                       NULL, on_track_joints, NULL);
  65. None
  66. Skeleton Joint:
      ID: HEAD, LEFT_ELBOW, RIGHT_HAND, ...
      x: X coordinate in real world (in mm)
      y: Y coordinate in real world (in mm)
      screen_x: X coordinate in the screen (in pixels)
      screen_y: Y coordinate in the screen (in pixels)
  67. Code/Bugs: https://github.com/joaquimrocha/Skeltrack

  68. Questions?

  69. Creative Commons pictures from flickr: Kinect: Auxo.co.kr; Monkey: nothingtosay; Kid Playing: Rob Welsh; Skeleton: Dark Botxy