N. Yoshimura, J. Morales, T. Maekawa, T. Hara (Osaka University, Japan)
[PerCom2024, March 13]
OpenPack:
A Large-scale Dataset for Recognizing Packaging Works
in IoT-enabled Logistic Environments
Slide 2
OpenPack Dataset
Background
[Figure: CPS / Digital Twin Manager and Factory, with ML models (ML, ML’), Worker’s Activity Data, IoT Device, and Sensor Readings]
In Smart Factories
● Many sensors and IoT devices are installed.
● Some tasks still depend mainly on human workers.
Supporting techniques for tracking worker activities are required.
⇒ Work Activity Recognition
Slide 3
Challenges of Work Activity Recognition Datasets
Lack of Datasets for Industrial Domains
● Most public datasets focus on activities of daily living (ADL).
● Public datasets for industrial domains that contain complex activities are limited.
Limited Modality
● Datasets for manual tasks provide only vision-related modalities.
Lack of IoT Data & Metadata
● Factory systems hold a lot of information related to the work activity.
○ e.g., Order Management System → items to be packed
● Existing activity recognition datasets do not provide these data.
Slide 4
OpenPack Dataset
● A new large-scale multimodal activity recognition dataset for the industrial domain.
● Designed for developing activity recognition models with IoT-enabled devices.
0100 Picking
0200 Relocate Item Label
0300 Assemble Box
0400 Insert Items
0500 Close Box
0600 Attach Box Label
0700 Scan Label
0800 Attach Shipping Label
0900 Put on Back Table
1000 Fill out Order
Work Operation & Action
Period
201 Remove Item Label
204 Write Check mark
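The label scheme above can be held in a small lookup table. The following is a minimal sketch, not the dataset's official API: the dictionary names and the `parent_operation` helper are hypothetical, and the rule that an action ID shares its leading digits with its work operation ID (e.g. 201 belongs to 0200) is an assumption inferred from the two example actions on this slide.

```python
# Work operation IDs and names as listed on the slide.
WORK_OPERATIONS = {
    100: "Picking",
    200: "Relocate Item Label",
    300: "Assemble Box",
    400: "Insert Items",
    500: "Close Box",
    600: "Attach Box Label",
    700: "Scan Label",
    800: "Attach Shipping Label",
    900: "Put on Back Table",
    1000: "Fill out Order",
}

# Only the two example actions shown on the slide; the full action set is larger.
EXAMPLE_ACTIONS = {
    201: "Remove Item Label",
    204: "Write Check mark",
}


def parent_operation(action_id: int) -> str:
    """Return the work operation an action belongs to.

    Assumption (not confirmed by the slides): dropping the last two digits
    of an action ID gives the ID of its parent work operation.
    """
    return WORK_OPERATIONS[(action_id // 100) * 100]


def format_operation_id(operation_id: int) -> str:
    """Render an operation ID with the four-digit zero padding used on the slide."""
    return f"{operation_id:04d}"
```

Under that assumption, `parent_operation(201)` resolves to "Relocate Item Label", and `format_operation_id(100)` gives the zero-padded form "0100".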
https://youtu.be/RiZ7kVpIHwU?si=DipjAt2s0Kp8KN6q
Public Datasets for Manual Work
| | OpenPack | LARa [2020] | Opportunity [2010] | PAMAP [2012] |
| --- | --- | --- | --- | --- |
| Subjects | 16* | 14 | 12 | 9 |
| Work Periods | 2048 (~100/person) | ~30 trials/person | 5 times/person | (No Repetition) |
| Recording Length | 53.5 h | 12.5 h | 25 h | 10 h |
| Activity Classes | 10 / 32* | 8* / 19 | 4 / 17 | 18 |
| Modality | Depth + Keypoints + Acc + Gyro + Ori + LiDAR + EDA + BVP + Tmp | Keypoints + Acc + Gyro | Acc + Gyro + Mag + Mic + Loc + Object + Ambient | Acc + Gyro + Mag + Ori + HR + Temp |
| Metadata / Subject | Available | Available | - | Available |
| Metadata / System | Available | No | No | No |
| IoT Data | Available | No | (Object Sensor) | No |
| Annotation Labels | 20,161 / 53,286** | N/A | 2640 / 7153 * | N/A |
** Work operation / action.
* Not work-related classes.
* 12 subjects have work experience.
* Locomotion / hand interaction occurrence.
Slide 7
Data Collection Scenarios
● 5 sessions per subject; 1 session = 20 periods.
● The difficulty of packaging work recognition depends on various factors.
⇒ Four scenarios are prepared to incrementally solve challenges.
| | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 |
| --- | --- | --- | --- | --- |
| # of Items | 54 | 75 | 75 | 75 |
| Work Procedure | Follow the work instruction as much as possible | Able to alter the work procedure at the subject's decision | Able to alter the work procedure at the subject's decision | Able to alter the work procedure at the subject's decision |
| Irregulars | No | No | Yes | Yes |
| Alarm Sound | No | No | No | Yes |
Ideal (Scenario 1) ⇒ Wild (Scenario 4)
Metadata
Two types of metadata are available.
1. Subject-related Metadata (Subject Data)
● Work Experience
● Dominant Hand
● Gender, Age
2. Order-related Metadata (Order Sheet, Item List)
● An online order management system can provide the set of items to pack in each order.
Metadata can be used for recognition.
⇒ Useful for estimating the composition of the work.
(e.g., # of items = number of repetitions of the “remove labels” action.)
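The idea that order metadata constrains the composition of the work can be illustrated with a short sketch. Everything here is hypothetical: the `Order` class, its field names, and the item codes are not taken from the dataset; only the rule "number of items = number of label-removal repetitions" comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    """Hypothetical container for order-related metadata (not the dataset schema)."""
    order_id: str
    items: list[str] = field(default_factory=list)


def expected_label_removals(order: Order) -> int:
    """Estimate how often the "remove label" action repeats in one period.

    Follows the slide's rule of thumb: one repetition per item in the order.
    """
    return len(order.items)


# Usage with made-up item codes.
order = Order(order_id="demo-001", items=["item-a", "item-b", "item-c"])
print(expected_label_removals(order))
```

A recognizer could use such a count as a prior, for example to reject a predicted sequence whose number of label-removal segments differs widely from the item count.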
Slide 10
IoT Data
Advantage
● Strong connection between devices in
use and worker’s activity.
⇒ High-confidence source
[Figure: IoT Data events (Scan, Item Label, Scan, Printer) shown with Acc data for the “Scan Label” operation]
Disadvantage
● Data is generated only when a worker
operates the devices. ⇒ Sparse data
Existing sensor fusion techniques assume conventional, continuously sampled sensor data such as acceleration.
⇒ A method that makes the best use of this high-confidence but sparse data source could
enhance work activity recognition performance.
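One simple way to bring sparse device events into a form that conventional fusion models accept is to rasterize them onto the same time grid as the dense sensors. This is a naive baseline sketch, not the method proposed by the authors; the function name, the 1 Hz default, and the example timestamps are assumptions made for illustration.

```python
import math


def events_to_dense(event_times_s, duration_s, rate_hz=1.0):
    """Convert sparse event timestamps (seconds) into a dense 0/1 indicator channel.

    Each slot covers 1 / rate_hz seconds and is set to 1 if at least one
    event falls inside it. Events outside [0, duration_s) are ignored.
    """
    n_slots = int(duration_s * rate_hz)
    dense = [0] * n_slots
    for t in event_times_s:
        idx = math.floor(t * rate_hz)
        if 0 <= idx < n_slots:
            dense[idx] = 1
    return dense


# Usage: two hypothetical scanner events in a 5-second span.
print(events_to_dense([1.2, 3.9], duration_s=5))
```

The resulting channel can be concatenated with acceleration features sampled at the same rate. It stays zero most of the time, which is exactly the sparsity problem the slide describes.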
Slide 11
Analysis: Factors Impacting Processing Time
Length of Work Operation
● The number of actions performed in one period differs
for each order.
○ “Relocate Item Label” ⇒ 1+ actions/item
○ “Attach Box Label” ⇒ 1 action/period
# of Items, Box Size
● The more items, the longer it takes.
● The bigger the box, the longer it takes.
OpenPack contains highly varied data!
Slide 12
Benchmark
Evaluation Protocol
● Task: Recognition of 10 work operations at 1 Hz
● Input: Acceleration data from the left wrist (window size = 60 s)
● Metrics: F1-measure (macro average)
(Note: Protocols differ for each benchmark scenario.)
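The macro-averaged F1-measure named in the evaluation protocol gives every class the same weight regardless of how often it occurs. The sketch below is a plain-Python illustration rather than the benchmark's official scoring code; the choice to score a class with no true and no predicted samples as 0.0 is an assumption.

```python
def macro_f1(y_true, y_pred, classes):
    """Compute the unweighted mean of per-class F1 scores.

    Per-class F1 is 2*TP / (2*TP + FP + FN). A class that never appears in
    y_true or y_pred gets 0.0 here (an assumed convention).
    """
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Usage: per-second labels for two operation classes (toy values).
y_true = [100, 100, 200, 200]
y_pred = [100, 200, 200, 200]
print(round(macro_f1(y_true, y_pred, classes=[100, 200]), 4))
```

In this toy case the per-class scores are 2/3 and 4/5, so the macro average is 11/15.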
Baseline Models
● CNN [F. Ordonez 2016]
● U-Net [Y. Zhang 2019] … CNN-based segmentation model
● DeepConvLSTM (DCL) [F. Ordonez 2016] … CNN + LSTM
● DCL + Self-attention [S.P. Singh 2021] … Self-attention
● ConformerHAR [Y.-W. Kim 2022] … Transformer-based model
● LOS-Net(-R) [N. Yoshimura 2022] … Model designed for manual work recognition
Slide 13
Benchmark 1: Data-rich Setting
Activity recognition with a sufficient
amount of training data.
● Objective: Confirm the upper bounds of
recognition performance.
● Protocol: Leave-one-subject-out CV
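Leave-one-subject-out cross-validation holds out all data from one subject per fold and trains on everyone else. The generator below is a minimal sketch of that splitting logic; the function name and the subject IDs in the usage example are hypothetical and do not reflect the dataset's actual identifiers.

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_subjects, test_subject) pairs, one fold per unique subject."""
    subjects = sorted(set(subject_ids))
    for held_out in subjects:
        train = [s for s in subjects if s != held_out]
        yield train, held_out


# Usage with made-up subject IDs.
for train, test in leave_one_subject_out(["S01", "S02", "S03"]):
    print(f"train={train} test={test}")
```

Because no subject appears in both the training and the test set of a fold, the score reflects how well a model generalizes to an unseen worker.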
Results
● LOS-Net(-R) achieved 0.83 in Scenario 1.
● Scores for Scenario 4 (Rushed) were lower than for the other scenarios.
⇒ Models are not speed invariant.
● The contents of an order affect performance.
⇒ Models are not robust to changing orders.
Models that are robust to work speed and order contents need to be developed.
Slide 14
Benchmark 2: Data-scarce Setting
Activity recognition with a limited amount of training data.
● Objective: Evaluate a more realistic setting than benchmark scenario 1.
● Training: Data from the 3rd session only (= 20 periods ≈ 5 h of annotation).
● Test: Data from the remaining sessions.
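The data-scarce protocol above amounts to a split by session number. The sketch below assumes each recording is described by a dictionary with a `session` key; that record layout and the function name are hypothetical and used only to illustrate the split.

```python
def data_scarce_split(records, train_session=3):
    """Split records into training (one session) and test (all other sessions)."""
    train = [r for r in records if r["session"] == train_session]
    test = [r for r in records if r["session"] != train_session]
    return train, test


# Usage: one made-up subject with five sessions.
records = [{"subject": "S01", "session": s} for s in range(1, 6)]
train, test = data_scarce_split(records)
print(len(train), len(test))
```

With five sessions per subject, this leaves one session for training and four for testing.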
Results
● Benchmark 1 ⇒ Benchmark 2 @ Scenario 1
○ LOS-Net(-R): 0.83 ⇒ 0.67 (-0.16 pt)
○ Conformer: 0.78 ⇒ 0.53 (-0.24 pt)
● CNN outperformed U-Net.
● DCL outperformed DCL with Self-attention.
Techniques for training SOTA models with limited training data are necessary.
Slide 15
Research Directions with OpenPack Dataset
● Metadata-aided activity recognition
● Fusion with high-confidence data sources
● Speed-invariant activity recognition
Other Topics
● Transfer learning across subjects, across modalities.
● Skill assessment using sensor data and metadata
● Counting the number of necessary actions / packed items.
● Estimating worker’s level of fatigue using sensor and physiological data.
● Detecting mistakes and accidents in the work process.
[Figure: dataset components related to each research direction: Metadata, IoT, 4 Scenarios, 16 Subjects, 9 Modalities, Action Labels, EDA + BVP, and Related Features]
Slide 16
Summary
● The largest multimodal work activity dataset for packaging work.
○ 53+ hours of recordings, 16 subjects, IoT data + metadata
○ 20K Work Operation Labels, 53K Action Labels
● For more information ⇒ Visit our website!
● Check sample data ⇒ GitHub
● Try it out ⇒ Preprocessed Data at Zenodo.
Dataset is Available Now!
[Figure: links to the Website and to the Preprocessed Data (Work Operation labels, IMU Acc)]