Kinect has a structured-light depth camera which gives depth information
Slide 10
Slide 10 text
But that's raw information... 11-bit values from 0 to 2047
Slide 11
Slide 11 text
Recent versions of libfreenect/GFreenect can give those values in mm
Slide 13
Slide 13 text
Still...
Slide 14
Slide 14 text
It does NOT tell you there is a person in the picture
Slide 15
Slide 15 text
Or a monkey
Slide 16
Slide 16 text
Or a cow
Slide 17
Slide 17 text
Let alone a skeleton and where its joints are
Slide 18
Slide 18 text
For this you need a skeleton tracking solution
Slide 19
Slide 19 text
Three proprietary/closed solutions exist:
Slide 20
Slide 20 text
Microsoft Kinect SDK: non-commercial only
Slide 21
Slide 21 text
OpenNI: commercial compatible
Slide 22
Slide 22 text
Kinect for Windows: commercial use allowed,
but incompatible with the Xbox Kinect
Slide 24
Slide 24 text
Conclusion: There were no Free solutions to
perform skeleton tracking... :(
Slide 25
Slide 25 text
So Igalia built one!
Slide 26
Slide 26 text
Enter Skeltrack
Slide 27
Slide 27 text
What we wanted:
✩ A shared library, no fancy SDK
✩ Device independent
✩ No pattern matching, no databases
✩ Easy to use (everybody wants that!)
Slide 28
Slide 28 text
Not as easy as it sounds!
Slide 29
Slide 29 text
After some investigation we found Andreas Baak's
paper "A Data-Driven Approach for Real-Time Full
Body Pose Reconstruction from a Depth Camera"
Slide 30
Slide 30 text
However this paper uses a database of
poses to get what the user is doing
Slide 31
Slide 31 text
So we based our work on it only up to
the part that finds the extrema
Slide 32
Slide 32 text
How does it work?
Slide 33
Slide 33 text
First we need to find the extrema
Slide 34
Slide 34 text
Make a graph whose nodes are the depth pixels
Slide 35
Slide 35 text
Connect two nodes if the distance between them is less
than a threshold
Slide 36
Slide 36 text
Connect the graph's separate components, found by using
connected-component labeling
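A minimal Python sketch of the graph-building and labeling steps, not Skeltrack's actual code. It assumes the depth map is a small grid with None for pixels without a reading, and that the "distance" is the Euclidean distance over (row, column, depth). The names build_graph and label_components are made up for illustration. It only labels the components; how they get joined afterwards is left out.

```python
import math
from collections import deque


def build_graph(depth, threshold):
    """Return an adjacency dict: (row, col) -> list of ((row, col), weight)."""
    rows, cols = len(depth), len(depth[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] is None:  # no depth reading for this pixel
                continue
            graph.setdefault((r, c), [])
            # Check the right and bottom neighbours; edges are added both ways.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr >= rows or nc >= cols or depth[nr][nc] is None:
                    continue
                dz = depth[r][c] - depth[nr][nc]
                weight = math.sqrt(dr * dr + dc * dc + dz * dz)  # assumed metric
                if weight < threshold:
                    graph[(r, c)].append(((nr, nc), weight))
                    graph.setdefault((nr, nc), []).append(((r, c), weight))
    return graph


def label_components(graph):
    """Give every node a component number using a breadth-first flood fill."""
    labels = {}
    current = 0
    for start in graph:
        if start in labels:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour, _weight in graph[node]:
                if neighbour not in labels:
                    labels[neighbour] = current
                    queue.append(neighbour)
        current += 1
    return labels
```

Pixels that end up with different labels belong to pieces of the picture that the threshold kept apart, for example an arm held in front of the torso.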
Slide 37
Slide 37 text
Choose a starting point and run Dijkstra's algorithm to
every point of the graph, then choose the farthest point:
there you have your extremum!
Slide 38
Slide 38 text
Then create a zero-cost edge between the starting point
and the current extremum, and repeat the same process,
now using the current extremum as the starting point.
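The loop described on the last two slides could look roughly like this in Python. It is a sketch under assumptions, not the library's implementation: the graph is an adjacency dict (node -> list of (neighbour, weight)), and dijkstra and find_extrema are illustrative names.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:  # stale heap entry, a shorter path was found already
            continue
        for neighbour, weight in graph[node]:
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist


def find_extrema(graph, start, count):
    """Repeat: run Dijkstra, take the farthest node, link it back at zero cost."""
    # Work on a copy so the zero-cost edges do not leak into the caller's graph.
    graph = {node: list(edges) for node, edges in graph.items()}
    extrema = []
    source = start
    for _ in range(count):
        dist = dijkstra(graph, source)
        farthest = max(dist, key=dist.get)
        extrema.append(farthest)
        # The zero-cost edge makes the region already explored "free",
        # so the next run is pushed towards a different limb.
        graph[source].append((farthest, 0.0))
        graph[farthest].append((source, 0.0))
        source = farthest
    return extrema
```

Ties between equally distant nodes are broken arbitrarily here; a real implementation would need a rule for that.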
Slide 39
Slide 39 text
This comes from Baak's paper and the difference
starts here: choosing the starting point
Slide 40
Slide 40 text
Baak chooses a centroid as the starting point
We choose the bottom-most point starting from the
centroid (this showed better results for the upper-body
extrema)
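One possible reading of "bottom-most point starting from the centroid", sketched in Python. The slide does not spell out the details, so this is an assumption: take the centroid of the user's pixels and pick the lowest pixel in the centroid's column. The function names and the fallback rule are made up for illustration.

```python
def centroid(pixels):
    """Rounded mean (row, col) of the user's pixels."""
    n = len(pixels)
    return (round(sum(r for r, _ in pixels) / n),
            round(sum(c for _, c in pixels) / n))


def bottom_most_from_centroid(pixels):
    """Lowest pixel (largest row) in the centroid's column, at or below it."""
    centre_row, centre_col = centroid(pixels)
    below = [(r, c) for r, c in pixels if c == centre_col and r >= centre_row]
    if not below:
        # Assumed fallback: no pixel under the centroid, take the lowest overall.
        return max(pixels)
    return max(below)
```

Rows grow downwards, as in image coordinates, so "bottom-most" means the largest row index.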
Slide 41
Slide 41 text
So we got ourselves some extrema!
What to do with them?
Slide 42
Slide 42 text
Which extremum is a hand, a head, a shoulder?
Slide 43
Slide 43 text
For that we use educated guesses...
Slide 44
Slide 44 text
We calculate 3 extrema
Slide 45
Slide 45 text
Then we check each of them, hoping one is the head
Slide 46
Slide 46 text
How?
Slide 47
Slide 47 text
For each extremum we look for points in the places
where the shoulders should be, checking their distances
to the extremum and to each other.
Slide 48
Slide 48 text
If they obey those rules then we assume they are
the head'n'shoulders (tm)
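The distance rules could be checked along these lines. This is only a sketch: the slides give no numbers, so the ranges below (in mm) are invented placeholders, and looks_like_head_and_shoulders is a hypothetical helper, not a Skeltrack function.

```python
import math


def looks_like_head_and_shoulders(head, left, right,
                                  head_to_shoulder=(150.0, 400.0),
                                  shoulder_width=(250.0, 600.0)):
    """True if three 3D points (in mm) are spaced like a head and two shoulders.

    Both ranges are made-up placeholder values, not the library's.
    """
    low, high = head_to_shoulder
    width_low, width_high = shoulder_width
    return (low <= math.dist(head, left) <= high
            and low <= math.dist(head, right) <= high
            and width_low <= math.dist(left, right) <= width_high)
```

An extremum that passes the test with some pair of candidate points is taken as the head, and that pair as the shoulders.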
Slide 49
Slide 49 text
With the remaining 2 extrema, we will try to see if
they are elbows or hands
Slide 50
Slide 50 text
How to do it?
Slide 51
Slide 51 text
Run Dijkstra from the shoulders to each extremum
Slide 52
Slide 52 text
The extremum closest to a shoulder is either a
hand or an elbow of that shoulder
Slide 53
Slide 53 text
How to check if it's a hand or elbow?
Slide 54
Slide 54 text
If the distance between the extremum and the shoulder is
less than a predefined value, then it is an elbow. Otherwise
it is a hand.
Slide 55
Slide 55 text
If it is a hand, we find the elbow by choosing the point in
the middle of the path we created with Dijkstra before
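A small sketch of the hand-or-elbow decision and of picking the elbow from the middle of the path. It assumes the Dijkstra run kept a predecessor map (node -> previous node on the shortest path). The function names and the threshold parameter are illustrative, and "middle" is taken by node count rather than by distance.

```python
def classify_arm_extremum(distance_to_shoulder, elbow_max_distance):
    """Closer than the predefined value means elbow, otherwise hand."""
    return "elbow" if distance_to_shoulder < elbow_max_distance else "hand"


def path_to(prev, source, target):
    """Rebuild the shortest path from a predecessor map."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path


def elbow_from_path(path):
    """Take the node halfway along the shoulder-to-hand path as the elbow."""
    return path[len(path) // 2]
```

For example, with a path shoulder, a, b, c, hand, the elbow guess is b.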
Slide 57
Slide 57 text
There are still some things missing...
Slide 58
Slide 58 text
Future work
Slide 59
Slide 59 text
Hands from elbows: If one of the extrema is an elbow, we
need to infer where the hand is
Slide 60
Slide 60 text
Smoothing: Smooth the jittering of the joints
Slide 61
Slide 61 text
Robustness: Use restrictions to ignore objects that are not
the user
Slide 62
Slide 62 text
And of course, get the rest of the joints: hips, knees, etc.
Skeleton Joint:
ID: HEAD, LEFT_ELBOW, RIGHT_HAND, ...
x: X coordinate in real world (in mm)
y: Y coordinate in real world (in mm)
screen_x: X coordinate in the screen (in pixels)
screen_y: Y coordinate in the screen (in pixels)
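The joint description above maps onto a small record type. Here is a Python mirror of it for illustration only: the library itself is a C shared library, the class names are made up, and only the IDs and fields listed on the slide are included (the "..." IDs are left out on purpose).

```python
from dataclasses import dataclass
from enum import Enum, auto


class JointId(Enum):
    # Only the IDs named on the slide; the real list is longer.
    HEAD = auto()
    LEFT_ELBOW = auto()
    RIGHT_HAND = auto()


@dataclass
class SkeletonJoint:
    id: JointId
    x: int          # X coordinate in the real world, in mm
    y: int          # Y coordinate in the real world, in mm
    screen_x: int   # X coordinate on the screen, in pixels
    screen_y: int   # Y coordinate on the screen, in pixels


# Example values are arbitrary.
head = SkeletonJoint(JointId.HEAD, x=10, y=1500, screen_x=320, screen_y=40)
```

Keeping both coordinate pairs means an application can draw the joint on the depth image with screen_x/screen_y and measure real distances with x/y.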