
Improving Accuracy of AI Algorithms using Sensor Fusion (By: Zain Fuad) - DevFest 2021

Talk by Zain Fuad (https://www.linkedin.com/in/zain-fuad-987a80104/) at DevFest Lahore 2021 by GDG Lahore.

GDG Lahore

December 18, 2021

Transcript

  1. Human Action Recognition (HAR)
     HAR refers to acquiring a person's body movements and understanding the performed action.
  2. HAR – Comparison of Sensors
     RGB camera: +Can capture visual cues (color, shape, motion) +Widely available -Illumination affects performance -Occlusions affect performance -Subject needs to be in the field of view -No depth information
     MoCap system: +Very accurate -Expensive -Needs a lot of space
  3. HAR – Comparison of Sensors
     Depth camera: +Can capture 3D information +Can work under low light +Availability of skeletal joint positions (Microsoft Kinect) -Unrealistic skeletal joint positions -Subject should be in the field of view -Privacy concerns
  4. HAR – Comparison of Sensors
     Depth camera: +Can capture 3D information +Can work under low light +Availability of skeletal joint positions (Microsoft Kinect) -Unrealistic skeletal joint positions -Subject should be in the field of view -Privacy concerns
     Wearable (MEMS) inertial sensors: +Easy to wear +Provide little or no hindrance +Very accurate with high sampling rate -Limit to the number of sensors worn -Unwillingness to wear a sensor
  5. HAR – Sensor Fusion
     Why sensor fusion? The limitations of one sensor can be compensated for by other sensor(s).
  6. (no transcript text for this slide)

  7. Problem Statement
     Recognize human actions by assigning them a class label.
     Pipeline: Performed Action → Data Acquisition → Feature Classification → Class Label (e.g., Eating, Knock, Squat, Smoking)
  8. Feature Extraction – Skeletal Data
     Pipeline: divide rows by their norm → Savitzky-Golay filter → bi-cubic interpolation → Savitzky-Golay filter → features stacked column-wise (see the sketch below).
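     A minimal Python/NumPy sketch of these skeletal-feature steps (not the authors' code): each frame is divided by its norm and the result is smoothed with SciPy's Savitzky-Golay filter. The window length and polynomial order are illustrative assumptions.

     import numpy as np
     from scipy.signal import savgol_filter

     def skeletal_features(frames):
         """frames: (n_frames, n_joints * 3) array of skeletal joint positions."""
         # Divide each row (frame) by its norm to reduce subject and joint dependency.
         norms = np.linalg.norm(frames, axis=1, keepdims=True)
         normalized = frames / np.maximum(norms, 1e-12)
         # Savitzky-Golay smoothing along time to suppress spikes; window length
         # and polynomial order are illustrative, not the slide's exact settings.
         smoothed = savgol_filter(normalized, window_length=5, polyorder=2, axis=0)
         # Stack the features column-wise into a single vector per sequence.
         return smoothed.reshape(-1, order="F")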
  9. Feature Extraction – Inertial Data
     Pipeline: temporal windows of size W × 6 → µ and σ from windows (per direction) → bi-cubic interpolation → features stacked column-wise (see the sketch below).
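     A matching NumPy sketch of the inertial-feature steps: split the 6-axis signal into temporal windows and keep the per-direction mean and standard deviation. Non-overlapping windows and the default window size are assumptions for illustration.

     import numpy as np

     def inertial_features(signal, win_size=3):
         """signal: (n_samples, 6) array (3-axis acceleration + 3-axis angular velocity)."""
         n_windows = signal.shape[0] // win_size
         feats = []
         for w in range(n_windows):
             window = signal[w * win_size:(w + 1) * win_size]   # W x 6 temporal window
             mu = window.mean(axis=0)        # mean per direction
             sigma = window.std(axis=0)      # standard deviation per direction
             feats.append(np.concatenate([mu, sigma]))
         # Stack the per-window statistics column-wise into one feature vector.
         return np.column_stack(feats).reshape(-1, order="F")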
  10. Proposed Solution – Feature Extraction
      Depth sensor: divide each row by its norm to reduce subject and joint dependency → bicubic interpolation → stack features column-wise and use a Savitzky-Golay filter [2] to reduce noise (spikes).
      Inertial sensor: partition into windows (window size = 3) and calculate µ and σ for each direction → bicubic interpolation → stack features column-wise.
      Bicubic interpolation target length: the least number of frames from the training set (see the resampling sketch below).
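      A hedged sketch of the resampling step: each feature stream is brought to a common temporal length before stacking. Cubic interpolation along the time axis via scipy.interpolate.interp1d is used here as a stand-in for the slide's bicubic interpolation, and the target length of 40 frames in the usage line is purely illustrative.

      import numpy as np
      from scipy.interpolate import interp1d

      def resample_to_length(sequence, target_len):
          """Resample a (n_frames, n_features) sequence to target_len frames
          using cubic interpolation along the time axis."""
          old_t = np.linspace(0.0, 1.0, sequence.shape[0])
          new_t = np.linspace(0.0, 1.0, target_len)
          return interp1d(old_t, sequence, kind="cubic", axis=0)(new_t)

      # target_len should be the least number of frames found in the training set,
      # as the slide describes; 40 here is only an example value.
      # resampled = resample_to_length(raw_sequence, target_len=40)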
  11. Proposed Solution – Feature Classification
      Individual neural network classifiers (one per sensor) are used as the classifiers, each with a softmax output layer (a sketch follows below):
      - 1 hidden layer with 86 neurons, trained using conjugate gradient with Polak-Ribière updates [3]
      - 1 hidden layer with 90 neurons, trained using conjugate gradient with Polak-Ribière updates [3]
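      A compact sketch of one per-sensor classifier: a single hidden layer feeding a softmax output, trained by minimizing cross-entropy with SciPy's nonlinear conjugate gradient solver (its "CG" method uses Polak-Ribière updates). The tanh activation, random initialization, and iteration limit are assumptions; only the hidden sizes of 86 and 90 come from the slide.

      import numpy as np
      from scipy.optimize import minimize

      def _unpack(theta, n_in, n_hid, n_out):
          """Split the flat parameter vector into layer weights and biases."""
          i = n_in * n_hid
          W1 = theta[:i].reshape(n_in, n_hid)
          b1 = theta[i:i + n_hid]
          j = i + n_hid
          W2 = theta[j:j + n_hid * n_out].reshape(n_hid, n_out)
          b2 = theta[j + n_hid * n_out:]
          return W1, b1, W2, b2

      def _softmax(z):
          z = z - z.max(axis=1, keepdims=True)   # numerical stability
          e = np.exp(z)
          return e / e.sum(axis=1, keepdims=True)

      def _loss(theta, X, Y, n_in, n_hid, n_out):
          """Cross-entropy of the softmax output for one-hot labels Y."""
          W1, b1, W2, b2 = _unpack(theta, n_in, n_hid, n_out)
          H = np.tanh(X @ W1 + b1)               # single hidden layer
          P = _softmax(H @ W2 + b2)              # softmax output layer
          return -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))

      def train_mlp(X, y, n_hid, n_out, seed=0, maxiter=500):
          """Train one classifier; n_hid = 86 or 90 as on the slide.
          y holds integer class labels in [0, n_out)."""
          n_in = X.shape[1]
          Y = np.eye(n_out)[y]
          rng = np.random.default_rng(seed)
          theta0 = rng.normal(scale=0.1,
                              size=n_in * n_hid + n_hid + n_hid * n_out + n_out)
          # SciPy's "CG" solver is a nonlinear conjugate gradient method with
          # Polak-Ribiere updates; gradients are estimated numerically here.
          res = minimize(_loss, theta0, args=(X, Y, n_in, n_hid, n_out),
                         method="CG", options={"maxiter": maxiter})
          return res.x

      def predict_proba(theta, X, n_in, n_hid, n_out):
          W1, b1, W2, b2 = _unpack(theta, n_in, n_hid, n_out)
          return _softmax(np.tanh(X @ W1 + b1) @ W2 + b2)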
  12. Results – University of Texas at Dallas Multimodal Human Action Dataset (UTD-MHAD)
      • 1 inertial and 1 depth sensor (a loading sketch follows below)
        - IMU captures 3-axis linear acceleration, 3-axis angular velocity, and 3-axis magnetic field strength
        - IMU placed on the right wrist for 21 actions and on the right thigh for 6 actions
        - Microsoft Kinect tracks the movement of 20 joints
      • Total size of the dataset: 861 entries
        - 27 registered actions
        - 8 subjects (4 males, 4 females)
        - Each action performed 4 times by each subject
        - 3 corrupt sequences were removed
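      For readers who want to reproduce this setup: UTD-MHAD is distributed as MATLAB .mat files readable with scipy.io.loadmat. The folder layout, file-name pattern, and variable names ('d_iner', 'd_skel') below are assumptions recalled from the public release rather than taken from the slides, so verify them against the dataset's readme.

      from scipy.io import loadmat

      # Assumed layout of the public UTD-MHAD release (verify against its readme):
      #   Inertial/a{A}_s{S}_t{T}_inertial.mat -> variable 'd_iner', shape (n_samples, 6)
      #   Skeleton/a{A}_s{S}_t{T}_skeleton.mat -> variable 'd_skel', shape (20, 3, n_frames)
      inertial = loadmat("Inertial/a1_s1_t1_inertial.mat")["d_iner"]
      skeleton = loadmat("Skeleton/a1_s1_t1_skeleton.mat")["d_skel"]

      # Flatten the skeleton to (n_frames, 20 joints * 3 coordinates) for feature extraction.
      n_frames = skeleton.shape[2]
      skeleton = skeleton.transpose(2, 0, 1).reshape(n_frames, -1)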
  13. Results – Comparison with the state-of-the-art implementation
      Table 1. Recognition accuracies for the subject-generic experiment; 8-fold cross-validation performed (for each subject). See the fusion sketch below.

      Subject-Generic Test    | Skeletal Accuracy (%) | Inertial Accuracy (%) | Fusion Accuracy (%)
      Chen et al. [5]         | 74.7                  | 76.4                  | 91.5
      Implemented Algorithm   | 74.8                  | 81.2                  | 95.0
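      The slides report the fusion accuracy but not the exact fusion rule, so the snippet below is only an illustrative decision-level fusion: the two classifiers' softmax posteriors are averaged and the most probable class is taken. The predict_proba helper is the hypothetical one from the classifier sketch above.

      import numpy as np

      def fuse_and_predict(p_skeletal, p_inertial):
          """Decision-level fusion sketch: average the per-sensor softmax posteriors
          (an assumed rule; the slides do not state the exact fusion scheme) and
          return the most probable class per test sample."""
          p_fused = 0.5 * (p_skeletal + p_inertial)
          return np.argmax(p_fused, axis=1)

      # p_skeletal and p_inertial would be (n_test, n_classes) probability matrices
      # from the two per-sensor classifiers, e.g. via predict_proba(...) above.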
  14. Future Work
      - Try the approach with a range of different sensors
      - Test the limit of adding more sensors, i.e., at what point additional sensors stop helping
  15. Paper Reference
      Fuad, Zain, and Mustafa Unel. "Human action recognition using fusion of depth and inertial sensors." International Conference on Image Analysis and Recognition. Springer, Cham, 2018.
      CVR – Control, Vision and Robotics Research Group