Slide 1

Slide 1 text

Skeltrack - Open Source Skeleton Tracking
Joaquim Rocha, Igalia
LinuxTag 2012 - Wunderbare Berlin

Slide 2

Slide 2 text

Guten Tag!
✩ I am a developer at Igalia
✩ I like doing innovative stuff like OCRFeeder and SeriesFinale
✩ and today I am presenting my latest project: Skeltrack

Slide 3

Slide 3 text

The Kinect

Slide 4

Slide 4 text

Microsoft's Kinect was the first depth camera with a price affordable to the public

Slide 5

Slide 5 text

The USB connection is open and thus hackable

Slide 6

Slide 6 text

This gave rise to Open Source projects like libfreenect, a library to control the Kinect device and read its data

Slide 7

Slide 7 text

We created a GLib wrapper for libfreenect called GFreenect

Slide 8

Slide 8 text

GFreenect offers asynchronous functions (and some synchronous ones as well) and is easy to use with other GNOME technologies

Slide 9

Slide 9 text

GObject Introspection = free bindings (Python, JavaScript, Vala)

Slide 10

Slide 10 text

The Kinect has a structured-light camera that provides depth information

Slide 11

Slide 11 text

But that's raw information... 11-bit values from 0 to 2047

Slide 12

Slide 12 text

libfreenect/GFreenect can give those values in mm

Slide 14

Slide 14 text

Still...

Slide 15

Slide 15 text

It does NOT tell you there is a person in the picture

Slide 16

Slide 16 text

Or a cow

Slide 17

Slide 17 text

Or an Ampelmann

Slide 18

Slide 18 text

Let alone a skeleton and where its joints are

Slide 19

Slide 19 text

For this you need a skeleton tracking solution

Slide 20

Slide 20 text

Three proprietary/closed solutions exist:

Slide 21

Slide 21 text

Microsoft Kinect SDK: non-commercial only

Slide 22

Slide 22 text

OpenNI: commercial use allowed

Slide 23

Slide 23 text

Kinect for Windows: commercial use allowed but incompatible with the Xbox's Kinect

Slide 25

Slide 25 text

Conclusion: There were no Free solutions to perform skeleton tracking... :(

Slide 26

Slide 26 text

So Igalia built one!

Slide 27

Slide 27 text

Enter Skeltrack

Slide 28

Slide 28 text

What we wanted:
✩ A shared library, no fancy SDK
✩ Device independent
✩ No pattern matching, no databases
✩ Easy to use (everybody wants that!)

Slide 29

Slide 29 text

Not as easy as it sounds!

Slide 30

Slide 30 text

After some investigation we found Andreas Baak's paper "A Data-Driven Approach for Real-Time Full Body Pose Reconstruction from a Depth Camera"

Slide 31

Slide 31 text

However this paper uses a database of poses to get what the user is doing

Slide 32

Slide 32 text

So we based only part of our work on it

Slide 33

Slide 33 text

How does it work?

Slide 34

Slide 34 text

First we need to find the extrema

Slide 35

Slide 35 text

Make a graph whose nodes are the depth pixels

Slide 36

Slide 36 text

Connect two nodes if the distance between them is less than a certain value
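The graph construction described on the last two slides can be sketched as follows. This is a minimal illustration, not Skeltrack's actual code: it assumes a depth buffer in millimetres, links only horizontally and vertically adjacent pixels, uses the depth difference as the "distance", and treats a value of 0 as "no reading". The names `should_connect`, `count_edges` and `max_diff_mm` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when two neighbouring depth readings (in mm) should be linked.
 * Treating 0 as "no reading" is an assumption of this sketch. */
static int should_connect(unsigned a, unsigned b, unsigned max_diff_mm)
{
    if (a == 0 || b == 0)
        return 0;
    unsigned diff = a > b ? a - b : b - a;
    return diff < max_diff_mm;
}

/* Counts the edges of a 4-connected grid graph over a depth buffer.
 * Each pixel is compared with its right and lower neighbour only, so
 * every undirected edge is counted exactly once. */
static size_t count_edges(const unsigned short *depth, int width, int height,
                          unsigned max_diff_mm)
{
    assert(depth != NULL && width > 0 && height > 0);
    size_t edges = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            unsigned d = depth[y * width + x];
            if (x + 1 < width &&
                should_connect(d, depth[y * width + x + 1], max_diff_mm))
                edges++;
            if (y + 1 < height &&
                should_connect(d, depth[(y + 1) * width + x], max_diff_mm))
                edges++;
        }
    }
    return edges;
}
```

A real implementation would store the edges (and their costs) rather than count them; counting keeps the sketch short while showing the connection rule.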

Slide 37

Slide 37 text

Connect the graph's different components, found by using connected-component labeling
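Connected-component labeling itself can be sketched with a flood fill over the depth grid. The code below is an illustration under assumptions, not Skeltrack's implementation: pixels are 4-connected, a depth of 0 means "no reading", and two neighbours belong to the same component when their depths differ by less than the hypothetical `max_diff`. Linking the labeled components to each other afterwards is not shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Labels the 4-connected components of a depth grid.
 * labels[] receives 0 for empty pixels and 1..N for components.
 * Returns N, or -1 if the work stack cannot be allocated. */
static int label_components(const unsigned short *depth, int width, int height,
                            unsigned max_diff, int *labels)
{
    assert(depth != NULL && labels != NULL && width > 0 && height > 0);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    int n = width * height;
    int count = 0;
    /* Every pixel is pushed at most once, so n slots are enough. */
    int *stack = malloc(sizeof *stack * (size_t)n);
    if (stack == NULL)
        return -1;
    for (int i = 0; i < n; i++)
        labels[i] = 0;

    for (int start = 0; start < n; start++) {
        if (depth[start] == 0 || labels[start] != 0)
            continue;
        int top = 0;
        labels[start] = ++count;      /* open a new component */
        stack[top++] = start;
        while (top > 0) {
            int cur = stack[--top];
            int cx = cur % width, cy = cur / width;
            for (int k = 0; k < 4; k++) {
                int nx = cx + dx[k], ny = cy + dy[k];
                if (nx < 0 || nx >= width || ny < 0 || ny >= height)
                    continue;
                int nb = ny * width + nx;
                if (depth[nb] == 0 || labels[nb] != 0)
                    continue;
                unsigned a = depth[cur], b = depth[nb];
                unsigned diff = a > b ? a - b : b - a;
                if (diff >= max_diff)
                    continue;
                labels[nb] = count;   /* same component as cur */
                stack[top++] = nb;
            }
        }
    }
    free(stack);
    return count;
}
```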

Slide 38

Slide 38 text

Choose a starting point and run Dijkstra's algorithm from it to each point of the graph; choose the furthest point. There you have your extremum!

Slide 39

Slide 39 text

Then create an edge with 0 cost between the starting point and the current extremum, and repeat the same process, now using the current extremum as the starting point.
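The repeated search can be sketched on a small graph as follows. This is a simplified illustration rather than Skeltrack's code: it assumes an adjacency matrix with integer costs and at most `MAX_NODES` nodes, and uses a plain O(V^2) Dijkstra. All names (`dijkstra`, `find_extrema`, `NO_EDGE`) are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 16
#define NO_EDGE (-1)

/* O(V^2) Dijkstra over an adjacency matrix; dist[] receives the cost of
 * the cheapest path from source to each node (INT_MAX if unreachable). */
static void dijkstra(int n, int w[MAX_NODES][MAX_NODES], int source,
                     int dist[MAX_NODES])
{
    int done[MAX_NODES] = {0};
    for (int i = 0; i < n; i++)
        dist[i] = INT_MAX;
    dist[source] = 0;
    for (int iter = 0; iter < n; iter++) {
        int u = -1;
        for (int i = 0; i < n; i++)
            if (!done[i] && dist[i] != INT_MAX &&
                (u == -1 || dist[i] < dist[u]))
                u = i;
        if (u == -1)
            break;                      /* remaining nodes are unreachable */
        done[u] = 1;
        for (int v = 0; v < n; v++)
            if (w[u][v] != NO_EDGE && dist[u] + w[u][v] < dist[v])
                dist[v] = dist[u] + w[u][v];
    }
}

/* Finds `count` extrema. After each one, a zero-cost edge is added between
 * the current starting point and the extremum, and the extremum becomes the
 * next starting point. Note that w is modified. */
static void find_extrema(int n, int w[MAX_NODES][MAX_NODES], int start,
                         int count, int extrema[])
{
    assert(n > 0 && n <= MAX_NODES && start >= 0 && start < n);
    int source = start;
    for (int k = 0; k < count; k++) {
        int dist[MAX_NODES];
        dijkstra(n, w, source, dist);
        int far = source;
        for (int i = 0; i < n; i++)
            if (dist[i] != INT_MAX && dist[i] > dist[far])
                far = i;                /* furthest reachable node so far */
        extrema[k] = far;
        w[source][far] = w[far][source] = 0;
        source = far;
    }
}
```

The zero-cost edges make every previously found extremum "free" to reach, so the next furthest point tends to lie on a different limb instead of next to an extremum already found.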

Slide 40

Slide 40 text

This comes from Baak's paper and the difference starts here: choosing the starting point

Slide 41

Slide 41 text

Baak chooses a centroid as the starting point. We choose the bottom-most point starting from the centroid (this showed better results for the upper body extrema)
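One possible reading of this choice is sketched below. It is an interpretation, not Skeltrack's code: the centroid is taken over all pixels with a reading (0 is assumed to mean "no reading"), image rows are assumed to grow downward, and "bottom-most" is taken to mean the lowest valid pixel in the centroid's column. `Pixel`, `centroid` and `bottom_most_from` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x; int y; } Pixel;

/* Centroid of all pixels that have a depth reading;
 * returns {-1, -1} when the buffer has none. */
static Pixel centroid(const unsigned short *depth, int width, int height)
{
    assert(depth != NULL && width > 0 && height > 0);
    long sum_x = 0, sum_y = 0, n = 0;
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            if (depth[y * width + x] != 0) {
                sum_x += x;
                sum_y += y;
                n++;
            }
    Pixel p = {-1, -1};
    if (n > 0) {
        p.x = (int)(sum_x / n);
        p.y = (int)(sum_y / n);
    }
    return p;
}

/* Scans the centroid's column from the centroid row to the bottom of the
 * image and returns the lowest pixel that has a reading. Falls back to the
 * centroid itself when the column has none. */
static Pixel bottom_most_from(const unsigned short *depth, int width,
                              int height, Pixel c)
{
    assert(c.x >= 0 && c.x < width && c.y >= 0 && c.y < height);
    Pixel best = c;
    for (int y = c.y; y < height; y++)
        if (depth[y * width + c.x] != 0)
            best.y = y;
    return best;
}
```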

Slide 42

Slide 42 text

So we got ourselves some extrema! What to do with them?

Slide 43

Slide 43 text

Which extremum is a hand, a head, a shoulder?

Slide 44

Slide 44 text

For that we use educated guesses...

Slide 45

Slide 45 text

We calculate 3 extrema

Slide 46

Slide 46 text

Then we check each of them, hoping one of them is the head

Slide 47

Slide 47 text

How?

Slide 48

Slide 48 text

For each extremum we look for points in the places where the shoulders should be, checking their distances to the extremum and to each other.

Slide 49

Slide 49 text

If they obey those rules then we assume they are the head'n'shoulders (tm)
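The kind of distance rule described here can be sketched as below. This only illustrates the idea: the limits are made-up placeholder values, not the ones Skeltrack uses, and the way the shoulder candidates are located is left out. Coordinates are assumed to be in millimetres, and squared distances are compared to avoid a square root.

```c
#include <assert.h>

typedef struct { int x, y, z; } Point3;   /* real-world coordinates in mm */

/* Placeholder limits, chosen for illustration only. */
#define HEAD_TO_SHOULDER_MIN_MM 150
#define HEAD_TO_SHOULDER_MAX_MM 400
#define SHOULDER_WIDTH_MIN_MM   250
#define SHOULDER_WIDTH_MAX_MM   550

/* Squared Euclidean distance between two points. */
static long long dist_sq(Point3 a, Point3 b)
{
    long long dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

/* True when a squared distance lies within [min_mm, max_mm]. */
static int within(long long d_sq, int min_mm, int max_mm)
{
    assert(min_mm >= 0 && min_mm <= max_mm);
    return d_sq >= (long long)min_mm * min_mm &&
           d_sq <= (long long)max_mm * max_mm;
}

/* Returns 1 when the candidate head and the two candidate shoulders
 * satisfy all three distance rules. */
static int looks_like_head(Point3 head, Point3 left_sh, Point3 right_sh)
{
    return within(dist_sq(head, left_sh),
                  HEAD_TO_SHOULDER_MIN_MM, HEAD_TO_SHOULDER_MAX_MM) &&
           within(dist_sq(head, right_sh),
                  HEAD_TO_SHOULDER_MIN_MM, HEAD_TO_SHOULDER_MAX_MM) &&
           within(dist_sq(left_sh, right_sh),
                  SHOULDER_WIDTH_MIN_MM, SHOULDER_WIDTH_MAX_MM);
}
```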

Slide 50

Slide 50 text

With the remaining 2 extrema, we will try to see if they are elbows or hands

Slide 51

Slide 51 text

How to do it?

Slide 52

Slide 52 text

Run Dijkstra's algorithm from the shoulders to each extremum

Slide 53

Slide 53 text

The closest extremum to any of the shoulders is either a hand or an elbow of that shoulder

Slide 54

Slide 54 text

How to check if it's a hand or an elbow?

Slide 55

Slide 55 text

If the distance between the extremum and the shoulder is less than a predefined value, then it is an elbow. Otherwise it is a hand.

Slide 56

Slide 56 text

If it is a hand, we find the elbow by choosing the first point (on the path we found with Dijkstra before) whose distance from the shoulder exceeds the elbow distance mentioned before
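The hand-or-elbow decision and the elbow lookup from the last two slides can be sketched together. This is an illustration under assumptions: the path is given as an array of points from the shoulder to the end point, distances are straight-line distances from the shoulder in millimetres, and a distance exactly equal to the limit counts as reaching it. `ArmGuess`, `guess_arm` and `elbow_dist_mm` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, z; } Point3;   /* real-world coordinates in mm */

typedef struct {
    int end_is_hand;      /* 1 if the end of the path is a hand, 0 if elbow */
    size_t elbow_index;   /* index into path[] of the elbow point */
} ArmGuess;

/* Squared Euclidean distance between two points. */
static long long dist_sq(Point3 a, Point3 b)
{
    long long dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

/* path[0] is the shoulder and path[len - 1] is the end point being
 * classified; the points in between follow the shortest path found
 * earlier. */
static ArmGuess guess_arm(const Point3 *path, size_t len, int elbow_dist_mm)
{
    assert(path != NULL && len >= 1 && elbow_dist_mm >= 0);
    long long limit_sq = (long long)elbow_dist_mm * elbow_dist_mm;
    ArmGuess g = {0, len - 1};

    /* Too close to the shoulder to be a hand: the end point is the elbow. */
    if (dist_sq(path[0], path[len - 1]) < limit_sq)
        return g;

    /* Otherwise it is a hand; the elbow is the first point along the path
     * that reaches the elbow distance. The last point always qualifies. */
    g.end_is_hand = 1;
    for (size_t i = 1; i < len; i++) {
        if (dist_sq(path[0], path[i]) >= limit_sq) {
            g.elbow_index = i;
            break;
        }
    }
    return g;
}
```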

Slide 58

Slide 58 text

There are still some things missing...

Slide 59

Slide 59 text

Future work

Slide 60

Slide 60 text

Hands from elbows: If one of the extrema is an elbow, we need to infer where the hand is

Slide 61

Slide 61 text

Smoothing: Smooth the jittering of the joints

Slide 62

Slide 62 text

Robustness: Use restrictions to ignore objects that are not the user

Slide 63

Slide 63 text

Multi-user: Track more than one person at a time

Slide 64

Slide 64 text

And of course, get the rest of the joints: hips, knees, etc.

Slide 65

Slide 65 text

How to use it?

Slide 66

Slide 66 text

Asynchronous API

Slide 67

Slide 67 text

SkeltrackSkeleton *skeleton = SKELTRACK_SKELETON (skeltrack_skeleton_new ());

skeltrack_skeleton_track_joints (skeleton,
                                 depth_buffer,
                                 buffer_width,
                                 buffer_height,
                                 NULL,             /* cancellable */
                                 on_track_joints,  /* callback run when tracking finishes */
                                 NULL);            /* user data passed to the callback */

Slide 69

Slide 69 text

Synchronous API

Slide 70

Slide 70 text

SkeltrackJointList list;

list = skeltrack_skeleton_track_joints_sync (skeleton,
                                             depth_buffer,
                                             buffer_width,
                                             buffer_height,
                                             NULL,   /* cancellable */
                                             NULL);  /* error */

Slide 71

Slide 71 text

Skeleton Joint:
ID: HEAD, LEFT_ELBOW, RIGHT_HAND, ...
x: X coordinate in real world (in mm)
y: Y coordinate in real world (in mm)
screen_x: X coordinate in the screen (in pixels)
screen_y: Y coordinate in the screen (in pixels)

Slide 72

Slide 72 text

Code/Bugs: https://github.com/joaquimrocha/Skeltrack

Slide 73

Slide 73 text

Nifty Tools for Development:
GFreenect: https://github.com/elima/GFreenect
GFreenect Utils: https://github.com/joaquimrocha/gfreenect-utils

Slide 74

Slide 74 text

GFreenect Python Example

Slide 75

Slide 75 text

Tool: record-depth-file

Slide 76

Slide 76 text

Tool: depth-file-viewer

Slide 77

Slide 77 text

Questions?

Slide 78

Slide 78 text

Creative Commons pictures from flickr:
Kinect: Auxo.co.kr
Ampelmann: echiner1
Kid Playing: Rob Welsh
Skeleton: Dark Botxy