
Improving Accuracy of AI Algorithms using Sensor Fusion (By: Zain Fuad) - DevFest 2021

Talk by Zain Fuad (https://www.linkedin.com/in/zain-fuad-987a80104/) at DevFest Lahore 2021 by GDG Lahore.

GDG Lahore

December 18, 2021

Transcript

  1. Human Action Recognition (HAR)
     HAR refers to acquiring a person's body movements and understanding the performed action.
  2. HAR – Comparison of Sensors
     RGB camera: +Can capture visual cues (color, shape, motion) +Widely available -Illumination affects performance -Occlusions affect performance -Subject needs to be in the field of view -No depth information
     MoCap system: +Very accurate -Expensive -Needs a lot of space
  3. HAR – Comparison of Sensors
     Depth camera: +Can capture 3D information +Can work under low light +Availability of skeletal joint positions (Microsoft Kinect) -Unrealistic skeletal joint positions -Subject should be in the field of view -Privacy concerns
  4. HAR – Comparison of Sensors
     Depth camera: +Can capture 3D information +Can work under low light +Availability of skeletal joint positions (Microsoft Kinect) -Unrealistic skeletal joint positions -Subject should be in the field of view -Privacy concerns
     Wearable (MEMS) inertial sensors: +Easy to wear +Provide little or no hindrance +Very accurate with high sampling rate -Limit to the number of sensors worn -Unwillingness to wear a sensor
  5. HAR – Sensor Fusion
     Why sensor fusion? The limitations of one sensor can be compensated for by other sensor(s).
  6. (no transcript text for this slide)

  7. Problem Statement
     Recognize human actions by assigning them a class label.
     Pipeline: Performed Action → Data Acquisition → Feature Classification → Class Label (e.g., Eating, Knock, Squat, Smoking)
  8. Feature Extraction – Skeletal Data
     Pipeline: divide rows by their norm → Savitzky-Golay filter → bi-cubic interpolation → Savitzky-Golay filter → features stacked column-wise (see the sketch below).
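     A minimal Python/NumPy sketch of these skeletal-feature steps (not the authors' code): each frame is divided by its norm and the result is smoothed with SciPy's Savitzky-Golay filter. The window length and polynomial order are illustrative assumptions.

     import numpy as np
     from scipy.signal import savgol_filter

     def skeletal_features(frames):
         """frames: (n_frames, n_joints * 3) array of skeletal joint positions."""
         # Divide each row (frame) by its norm to reduce subject and joint dependency.
         norms = np.linalg.norm(frames, axis=1, keepdims=True)
         normalized = frames / np.maximum(norms, 1e-12)
         # Savitzky-Golay smoothing along time to suppress spikes; window length
         # and polynomial order are illustrative, not the slide's exact settings.
         smoothed = savgol_filter(normalized, window_length=5, polyorder=2, axis=0)
         # Stack the features column-wise into a single vector per sequence.
         return smoothed.reshape(-1, order="F")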
  9. Feature Extraction – Inertial Data
     Pipeline: temporal windows of size W × 6 → µ and σ from windows (per direction) → bi-cubic interpolation → features stacked column-wise (see the sketch below).
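     A matching NumPy sketch of the inertial-feature steps: split the 6-axis signal into temporal windows and keep the per-direction mean and standard deviation. Non-overlapping windows and the default window size are assumptions for illustration.

     import numpy as np

     def inertial_features(signal, win_size=3):
         """signal: (n_samples, 6) array (3-axis acceleration + 3-axis angular velocity)."""
         n_windows = signal.shape[0] // win_size
         feats = []
         for w in range(n_windows):
             window = signal[w * win_size:(w + 1) * win_size]   # W x 6 temporal window
             mu = window.mean(axis=0)        # mean per direction
             sigma = window.std(axis=0)      # standard deviation per direction
             feats.append(np.concatenate([mu, sigma]))
         # Stack the per-window statistics column-wise into one feature vector.
         return np.column_stack(feats).reshape(-1, order="F")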
  10. Proposed Solution – Feature Extraction
      Depth sensor: divide each row by its norm to reduce subject and joint dependency → bicubic interpolation → stack features column-wise and use a Savitzky-Golay filter [2] to reduce noise (spikes).
      Inertial sensor: partition into windows (window size = 3) and calculate µ and σ for each direction → bicubic interpolation → stack features column-wise.
      Bicubic interpolation target length: the least number of frames from the training set (see the resampling sketch below).
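      A hedged sketch of the resampling step: each feature stream is brought to a common temporal length before stacking. Cubic interpolation along the time axis via scipy.interpolate.interp1d is used here as a stand-in for the slide's bicubic interpolation, and the target length of 40 frames in the usage line is purely illustrative.

      import numpy as np
      from scipy.interpolate import interp1d

      def resample_to_length(sequence, target_len):
          """Resample a (n_frames, n_features) sequence to target_len frames
          using cubic interpolation along the time axis."""
          old_t = np.linspace(0.0, 1.0, sequence.shape[0])
          new_t = np.linspace(0.0, 1.0, target_len)
          return interp1d(old_t, sequence, kind="cubic", axis=0)(new_t)

      # target_len should be the least number of frames found in the training set,
      # as the slide describes; 40 here is only an example value.
      # resampled = resample_to_length(raw_sequence, target_len=40)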
  11. Proposed Solution – Feature Classification
      Individual neural network classifiers (one per sensor) are used as the classifiers, each with a softmax output layer (a sketch follows below):
      - 1 hidden layer with 86 neurons, trained using conjugate gradient with Polak-Ribière updates [3]
      - 1 hidden layer with 90 neurons, trained using conjugate gradient with Polak-Ribière updates [3]
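      A compact sketch of one per-sensor classifier: a single hidden layer feeding a softmax output, trained by minimizing cross-entropy with SciPy's nonlinear conjugate gradient solver (its "CG" method uses Polak-Ribière updates). The tanh activation, random initialization, and iteration limit are assumptions; only the hidden sizes of 86 and 90 come from the slide.

      import numpy as np
      from scipy.optimize import minimize

      def _unpack(theta, n_in, n_hid, n_out):
          """Split the flat parameter vector into layer weights and biases."""
          i = n_in * n_hid
          W1 = theta[:i].reshape(n_in, n_hid)
          b1 = theta[i:i + n_hid]
          j = i + n_hid
          W2 = theta[j:j + n_hid * n_out].reshape(n_hid, n_out)
          b2 = theta[j + n_hid * n_out:]
          return W1, b1, W2, b2

      def _softmax(z):
          z = z - z.max(axis=1, keepdims=True)   # numerical stability
          e = np.exp(z)
          return e / e.sum(axis=1, keepdims=True)

      def _loss(theta, X, Y, n_in, n_hid, n_out):
          """Cross-entropy of the softmax output for one-hot labels Y."""
          W1, b1, W2, b2 = _unpack(theta, n_in, n_hid, n_out)
          H = np.tanh(X @ W1 + b1)               # single hidden layer
          P = _softmax(H @ W2 + b2)              # softmax output layer
          return -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))

      def train_mlp(X, y, n_hid, n_out, seed=0, maxiter=500):
          """Train one classifier; n_hid = 86 or 90 as on the slide.
          y holds integer class labels in [0, n_out)."""
          n_in = X.shape[1]
          Y = np.eye(n_out)[y]
          rng = np.random.default_rng(seed)
          theta0 = rng.normal(scale=0.1,
                              size=n_in * n_hid + n_hid + n_hid * n_out + n_out)
          # SciPy's "CG" solver is a nonlinear conjugate gradient method with
          # Polak-Ribiere updates; gradients are estimated numerically here.
          res = minimize(_loss, theta0, args=(X, Y, n_in, n_hid, n_out),
                         method="CG", options={"maxiter": maxiter})
          return res.x

      def predict_proba(theta, X, n_in, n_hid, n_out):
          W1, b1, W2, b2 = _unpack(theta, n_in, n_hid, n_out)
          return _softmax(np.tanh(X @ W1 + b1) @ W2 + b2)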
  12. Results – University of Texas at Dallas Multimodal Human Action Dataset (UTD-MHAD)
      • 1 inertial and 1 depth sensor (a loading sketch follows below)
        - IMU captures 3-axis linear acceleration, 3-axis angular velocity, and 3-axis magnetic field strength
        - IMU placed on the right wrist for 21 actions and on the right thigh for 6 actions
        - Microsoft Kinect tracks the movement of 20 joints
      • Total size of the dataset: 861 entries
        - 27 registered actions
        - 8 subjects (4 males, 4 females)
        - Each action performed 4 times by each subject
        - 3 corrupt sequences were removed
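      For readers who want to reproduce this setup: UTD-MHAD is distributed as MATLAB .mat files readable with scipy.io.loadmat. The folder layout, file-name pattern, and variable names ('d_iner', 'd_skel') below are assumptions recalled from the public release rather than taken from the slides, so verify them against the dataset's readme.

      from scipy.io import loadmat

      # Assumed layout of the public UTD-MHAD release (verify against its readme):
      #   Inertial/a{A}_s{S}_t{T}_inertial.mat -> variable 'd_iner', shape (n_samples, 6)
      #   Skeleton/a{A}_s{S}_t{T}_skeleton.mat -> variable 'd_skel', shape (20, 3, n_frames)
      inertial = loadmat("Inertial/a1_s1_t1_inertial.mat")["d_iner"]
      skeleton = loadmat("Skeleton/a1_s1_t1_skeleton.mat")["d_skel"]

      # Flatten the skeleton to (n_frames, 20 joints * 3 coordinates) for feature extraction.
      n_frames = skeleton.shape[2]
      skeleton = skeleton.transpose(2, 0, 1).reshape(n_frames, -1)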
  13. Results – Comparison with the state-of-the-art implementation
      Table 1. Recognition accuracies for the subject-generic experiment; 8-fold cross-validation performed (for each subject). See the fusion sketch below.

      Subject-Generic Test    | Skeletal Accuracy (%) | Inertial Accuracy (%) | Fusion Accuracy (%)
      Chen et al. [5]         | 74.7                  | 76.4                  | 91.5
      Implemented Algorithm   | 74.8                  | 81.2                  | 95.0
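      The slides report the fusion accuracy but not the exact fusion rule, so the snippet below is only an illustrative decision-level fusion: the two classifiers' softmax posteriors are averaged and the most probable class is taken. The predict_proba helper is the hypothetical one from the classifier sketch above.

      import numpy as np

      def fuse_and_predict(p_skeletal, p_inertial):
          """Decision-level fusion sketch: average the per-sensor softmax posteriors
          (an assumed rule; the slides do not state the exact fusion scheme) and
          return the most probable class per test sample."""
          p_fused = 0.5 * (p_skeletal + p_inertial)
          return np.argmax(p_fused, axis=1)

      # p_skeletal and p_inertial would be (n_test, n_classes) probability matrices
      # from the two per-sensor classifiers, e.g. via predict_proba(...) above.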
  14. Future Work
      - Try the approach with a range of different sensors
      - Test the limit of adding more sensors, i.e., at what point additional sensors stop helping
  15. Paper Reference
      Fuad, Zain, and Mustafa Unel. "Human action recognition using fusion of depth and inertial sensors." International Conference on Image Analysis and Recognition. Springer, Cham, 2018.
      CVR – Control, Vision and Robotics Research Group