Skeltrack: Open Source Skeleton Tracking

Slides for the presentation given at Semana da Ciência e Tecnologia da Universidade de Évora, April 17th 2012.

Joaquim Rocha

April 18, 2012

Transcript

  1. Skeltrack - Open Source Skeleton Tracking
    Joaquim Rocha, Igalia
    Semana da Ciência e Tecnologia Univ. de Évora, April 2012

  2. The Kinect

  3. Microsoft's Kinect was the first depth camera
    with a price affordable to the public

  4. The USB connection is open and thus hackable

  5. This gave rise to open source projects like libfreenect,
    a library to control the Kinect device and read its data

  6. We created a GLib wrapper for libfreenect called GFreenect

  7. GFreenect offers asynchronous functions (and some synchronous ones as
    well) and makes the Kinect easy to use with other GNOME technologies

  8. GObject Introspection = free bindings (Python, JavaScript, Vala)

  9. Kinect has a structured-light depth camera which gives depth information

  10. But that's raw information... 11-bit values from 0 to 2047

  11. libfreenect/GFreenect can now give those values in mm

  13. Still...

  14. It does NOT tell you there is a person in the picture

  15. Or a monkey

  16. Or a cow

  17. Let alone a skeleton and where its joints are

  18. For this you need a skeleton tracking solution

  19. Three proprietary/closed solutions exist:

  20. Microsoft Kinect SDK: non-commercial only

  21. OpenNI: compatible with commercial use

  22. Kinect for Windows: commercial use allowed
    but incompatible with the Xbox Kinect

  24. Conclusion: There were no Free solutions to
    perform skeleton tracking... :(

  25. So Igalia built one!

  26. Enter Skeltrack

  27. What we wanted:
    ✩ A shared library, no fancy SDK
    ✩ Device independent
    ✩ No pattern matching, no databases
    ✩ Easy to use (everybody wants that!)

  28. Not as easy as it sounds!

  29. After some investigation we found Andreas Baak's
    paper "A Data-Driven Approach for Real-Time Full
    Body Pose Reconstruction from a Depth Camera"

  30. However, this paper uses a database of
    poses to determine what the user is doing

  31. So we based our work on it only up to
    the step that finds the extrema

  32. How does it work?

  33. First we need to find the extrema

  34. Make a graph whose nodes are the depth pixels

  35. Connect two nodes if the distance between them is
    less than a threshold
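
The graph construction on slides 34 and 35 can be sketched as follows. This is a minimal illustration and not Skeltrack's own code: the function names are hypothetical, only the 4 grid neighbours of each pixel are considered, the "distance" is simplified to the depth difference in mm, and a depth of 0 is assumed to mean "no reading".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if pixels a and b (indices into a depth
 * buffer in mm) should share an edge, 0 otherwise.
 * Assumption: a depth of 0 means the sensor had no reading there. */
int pixels_connected(const uint16_t *depth, size_t a, size_t b,
                     int threshold_mm)
{
    assert(depth != NULL);
    if (depth[a] == 0 || depth[b] == 0)
        return 0;
    /* Simplification: use the depth difference as the distance. */
    return abs((int)depth[a] - (int)depth[b]) < threshold_mm;
}

/* Counts the edges of the 4-neighbour graph over the whole image.
 * Each pixel only looks right and down, so no edge is counted twice. */
size_t count_edges(const uint16_t *depth, size_t width, size_t height,
                   int threshold_mm)
{
    size_t edges = 0;
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            size_t i = y * width + x;
            if (x + 1 < width &&
                pixels_connected(depth, i, i + 1, threshold_mm))
                edges++;
            if (y + 1 < height &&
                pixels_connected(depth, i, i + width, threshold_mm))
                edges++;
        }
    }
    return edges;
}
```

A real implementation would probably store the edges rather than count them, and might use the 3D distance between the two points instead of the depth difference alone.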

  36. Find the graph's separate components using
    connected-component labeling and connect them
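
Connected-component labeling itself can be sketched with a flood fill over the depth image. This is an assumption-laden illustration rather than Skeltrack's implementation: `label_components` is a hypothetical name, connectivity is the 4-neighbour depth-difference rule, and 0 is assumed to mean "no reading". The step of joining the labeled components with extra edges is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch: writes a component label (0, 1, 2, ...) for every
 * pixel into labels[], or -1 for pixels without a depth reading.
 * Returns the number of components, or -1 if memory runs out. */
int label_components(const uint16_t *depth, int width, int height,
                     int threshold_mm, int *labels)
{
    assert(depth != NULL && labels != NULL && width > 0 && height > 0);
    int total = width * height;
    /* Every pixel is pushed at most once, so total entries are enough. */
    int *stack = malloc((size_t)total * sizeof *stack);
    if (stack == NULL)
        return -1;
    for (int i = 0; i < total; i++)
        labels[i] = -1;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    int count = 0;
    for (int seed = 0; seed < total; seed++) {
        if (depth[seed] == 0 || labels[seed] != -1)
            continue;
        /* Flood fill from this unlabeled pixel. */
        int top = 0;
        stack[top++] = seed;
        labels[seed] = count;
        while (top > 0) {
            int cur = stack[--top];
            int cx = cur % width, cy = cur / width;
            for (int k = 0; k < 4; k++) {
                int nx = cx + dx[k], ny = cy + dy[k];
                if (nx < 0 || nx >= width || ny < 0 || ny >= height)
                    continue;
                int next = ny * width + nx;
                if (depth[next] == 0 || labels[next] != -1)
                    continue;
                if (abs((int)depth[cur] - (int)depth[next]) >= threshold_mm)
                    continue;
                labels[next] = count;
                stack[top++] = next;
            }
        }
        count++;
    }
    free(stack);
    return count;
}
```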

  37. Choose a starting point and run Dijkstra's algorithm to
    every point of the graph, then choose the farthest point:
    there you have your extremum!

  38. Then create an edge with 0 cost between the starting
    point and the current extremum, and repeat the same
    process, now using the current extremum as the
    starting point.
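
The loop described on slides 37 and 38 could look roughly like this. It is a sketch under stated assumptions, not the library's code: the graph is a small dense adjacency matrix (a negative weight means "no edge"), Dijkstra is the plain O(n^2) variant, and `dijkstra`, `find_extrema` and `MAX_NODES` are hypothetical names. A full depth image would need a sparse graph and a priority queue.

```c
#include <assert.h>
#include <float.h>

#define MAX_NODES 64 /* hypothetical limit for this small sketch */

/* Plain O(n^2) Dijkstra. weight[i * n + j] < 0 means "no edge".
 * On return dist[i] is the shortest distance from source to i,
 * or DBL_MAX if i cannot be reached. */
void dijkstra(const double *weight, int n, int source, double *dist)
{
    int done[MAX_NODES] = {0};
    assert(n > 0 && n <= MAX_NODES && source >= 0 && source < n);
    for (int i = 0; i < n; i++)
        dist[i] = DBL_MAX;
    dist[source] = 0.0;
    for (int iter = 0; iter < n; iter++) {
        int u = -1;
        for (int i = 0; i < n; i++)
            if (!done[i] && dist[i] < DBL_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break; /* the remaining nodes are unreachable */
        done[u] = 1;
        for (int v = 0; v < n; v++) {
            double w = weight[u * n + v];
            if (w >= 0.0 && dist[u] + w < dist[v])
                dist[v] = dist[u] + w;
        }
    }
}

/* Finds `wanted` extrema. After each round the start and the extremum
 * just found are joined by a zero-cost edge (this modifies weight[]),
 * and the extremum becomes the next starting point. */
void find_extrema(double *weight, int n, int start, int wanted, int *extrema)
{
    double dist[MAX_NODES];
    for (int k = 0; k < wanted; k++) {
        dijkstra(weight, n, start, dist);
        int far = start;
        for (int i = 0; i < n; i++)
            if (dist[i] < DBL_MAX && dist[i] > dist[far])
                far = i;
        extrema[k] = far;
        weight[start * n + far] = 0.0;
        weight[far * n + start] = 0.0;
        start = far;
    }
}
```

The zero-cost edge makes every extremum already found as close to the new starting point as the starting point itself, so the next search is pushed towards a different limb.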

  39. Up to this point the method comes from Baak's paper; the
    difference starts here: choosing the starting point

  40. Baak chooses a centroid as the starting point.
    We choose the bottom-most point starting from the
    centroid (this showed better results for the upper
    body extrema)
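
One possible reading of "the bottom-most point starting from the centroid" is sketched below: compute the centroid of all pixels with a depth reading, then scan straight down that column and keep the lowest pixel that still has a reading. This interpretation, the function name, and the use of 0 for "no reading" are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch: returns the buffer index of the starting point,
 * or -1 if no suitable pixel exists. Rows grow downwards, so the
 * bottom-most point is the one with the largest y. */
int bottom_most_from_centroid(const uint16_t *depth, int width, int height)
{
    assert(depth != NULL && width > 0 && height > 0);
    long sum_x = 0, sum_y = 0, n = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (depth[y * width + x] != 0) {
                sum_x += x;
                sum_y += y;
                n++;
            }
        }
    }
    if (n == 0)
        return -1; /* empty image */

    int cx = (int)(sum_x / n);
    int cy = (int)(sum_y / n);

    /* Walk down the centroid's column and remember the lowest valid pixel. */
    int best = -1;
    for (int y = cy; y < height; y++)
        if (depth[y * width + cx] != 0)
            best = y * width + cx;
    return best;
}
```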

  41. So we got ourselves some extrema!
    What to do with them?

  42. Which extremum is a hand, a head, a shoulder?

  43. For that we use educated guesses...

  44. We calculate 3 extrema

  45. Then we check each one, hoping it is the head

  46. How?

  47. For each extremum we look for points in the places
    where the shoulders should be, checking their distances
    to the extremum and to each other.

  48. If they obey those rules then we assume they are
    the head'n'shoulders (tm)

  49. With the remaining 2 extrema, we will try to see if
    they are elbows or hands

  50. How to do it?

  51. Run Dijkstra's algorithm from the shoulders to each extremum

  52. The closest extremum to any of the shoulders is either the
    hand or the elbow of that shoulder

  53. How to check if it's a hand or elbow?

  54. If the distance between the extremum and the shoulder is
    less than a predefined value, then it is an elbow. Otherwise
    it is a hand.

  55. If it is a hand, we find the elbow by choosing the point in
    the middle of the path we created with Dijkstra before
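
The rules on slides 54 and 55 can be sketched like this. The names are hypothetical and the threshold is left to the caller, since the slides do not give the predefined value. The path is assumed to be stored as a predecessor array of the kind a Dijkstra run from the shoulder produces, with -1 marking the shoulder, and "the middle" is taken to mean halfway by number of hops.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ARM_JOINT_ELBOW, ARM_JOINT_HAND } ArmJoint;

/* Slide 54: an extremum close to the shoulder is taken to be an elbow,
 * anything farther away a hand. elbow_max_mm is a hypothetical parameter. */
ArmJoint classify_extremum(double dist_to_shoulder_mm, double elbow_max_mm)
{
    return dist_to_shoulder_mm < elbow_max_mm ? ARM_JOINT_ELBOW
                                              : ARM_JOINT_HAND;
}

/* Slide 55: walk the path from the hand back to the shoulder and return
 * the node halfway along it. prev[v] is the node before v on the path
 * from the shoulder, and prev[shoulder] is -1. */
int elbow_from_path(const int *prev, int hand)
{
    assert(prev != NULL && hand >= 0);
    int hops = 0;
    for (int v = hand; prev[v] >= 0; v = prev[v])
        hops++;
    int v = hand;
    for (int i = 0; i < hops / 2; i++)
        v = prev[v];
    return v;
}
```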

  57. There are still some things missing...

  58. Future work

  59. Hands from elbows: If one of the extrema is an elbow, we
    need to infer where the hand is

  60. Smoothing: Smooth the jittering of the joints

  61. Robustness: Use restrictions to ignore objects that are not
    the user

  62. And of course, get the rest of the joints: hips, knees, etc.

  63. How to use it?

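  64. SkeltrackSkeleton *skeleton = SKELTRACK_SKELETON (skeltrack_skeleton_new ());
    /* Asynchronous call: on_track_joints runs once tracking finishes */
    skeltrack_skeleton_track_joints (skeleton,
                                     depth_buffer,
                                     buffer_width,
                                     buffer_height,
                                     NULL,            /* no GCancellable */
                                     on_track_joints, /* GAsyncReadyCallback */
                                     NULL);           /* no user data */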

  66. Skeleton Joint:
    ID: HEAD, LEFT_ELBOW, RIGHT_HAND, ...
    x: X coordinate in real world (in mm)
    y: Y coordinate in real world (in mm)
    screen_x: X coordinate in the screen (in pixels)
    screen_y: Y coordinate in the screen (in pixels)

  67. Code/Bugs: https://github.com/joaquimrocha/Skeltrack

  68. Questions?

  69. Creative Commons pictures from flickr:
    Kinect: Auxo.co.kr
    Monkey: nothingtosay
    Kid Playing: Rob Welsh
    Skeleton: Dark Botxy
