Exactpro
November 26, 2021

TMPA-2021: Modern experiment management systems architecture for scientific big data

Anastasiia Kaida and Aleksei Savelev

TMPA is an annual International Conference on Software Testing, Machine Learning and Complex Process Analysis. The conference will focus on the application of modern methods of data science to the analysis of software quality.

To learn more about Exactpro, visit our website https://exactpro.com/

Follow us on
LinkedIn https://www.linkedin.com/company/exactpro-systems-llc
Twitter https://twitter.com/exactpro

Transcript

  1. 25-27 November: Software Testing, Machine Learning and Complex Process Analysis
    Modern experiment management systems architecture for scientific big data
    Anastasiia Kaida, Aleksei Savelev
    National Research Tomsk Polytechnic University [email protected]
  2. The problem
    • Modern scientific experiments have heterogeneous infrastructure
    • The experiment's digital environment grows in an unpredictable way
    • Changes in the research team's aims and objectives
    • Building a stable and flexible dataflow
    • Same issues for modern projects but different solutions
    • Exponential growth of data
    • Expansion of cooperation through the creation of scientific collaborations and interdisciplinary research groups
  3. Experiment management systems
    • ZOO (1996)
    • Brain Mapper (2000)
    • Taverna (2005)
    • Triana (2005) – Grid-based
    • Expo (2014)
    • Weevil (2011)
    • OMF (2009)
    • ATLAS DKB (2019)
  4. Big Data Ecosystem
    • Data integration subsystems
    • Heterogeneous data sources
    • Whole data lifecycle support
    • Unknown data value before exploratory data analysis (EDA)
    • Rapid growth of data
  5. Research background: ATLAS DKB (CERN)
    M.V. Golosova, V.A. Aulov, M.A. Grigorieva, A.Y. Kaida – Data Knowledge Base for the ATLAS collaboration
  6. Research background: Web-mining and opinion-mining (TPU)
    • Flexible workflow
    • Multibranched working process
    • Interdisciplinary research team
    • Need for experiment cataloging
    Savelev et al. The High-Level Overview of Social Media Content Search Engine
  7. The concept
    • Unified pattern:
      ◦ Metadata management system
      ◦ Protocol for metadata management system
      ◦ ETL pipelines
      ◦ Knowledge base
      ◦ Internal and external heterogeneous data sources
      ◦ Message-oriented middleware
    • The main research is based at the Laboratory of Unstructured Data Processing at TPU
  8. Future plans (slide 9)
    • Building a workflow for the Laboratory of Unstructured Data Processing
    • Making a testbed for the lab projects
    • Testing and deployment of the initial implementation
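
The unified pattern listed on the "The concept" slide can be illustrated with a small sketch. This is not the authors' implementation; the slides give no code. It only shows, under stated assumptions, how the named parts might fit together: records from heterogeneous sources are published to a message bus, transformed by an ETL step, registered in a metadata catalog, and loaded into a knowledge base. Every name here (`Message`, `MetadataCatalog`, `run_pipeline`, the source names) is hypothetical, and an in-memory `queue.Queue` stands in for real message-oriented middleware.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Message:
    """One unit of data passed between pipeline stages (hypothetical type)."""
    payload: dict
    metadata: dict = field(default_factory=dict)


class MetadataCatalog:
    """Hypothetical stand-in for the metadata management system."""

    def __init__(self):
        self.records = []

    def register(self, stage, message):
        # Record which stage handled data from which source.
        # A real system would persist this instead of keeping it in memory.
        self.records.append(
            {"stage": stage, "source": message.metadata.get("source")}
        )


def extract(sources, bus):
    """Publish raw records from each source to the bus, tagged with their origin."""
    for source_name, rows in sources.items():
        for row in rows:
            bus.put(Message(payload=row, metadata={"source": source_name}))


def transform(message):
    """Lowercase the keys so records from different sources share one schema."""
    normalized = {key.lower(): value for key, value in message.payload.items()}
    return Message(payload=normalized, metadata=message.metadata)


def run_pipeline(sources):
    bus = queue.Queue()      # assumption: in-memory queue instead of real middleware
    catalog = MetadataCatalog()
    knowledge_base = []      # assumption: a plain list instead of a real knowledge base

    extract(sources, bus)
    while not bus.empty():
        message = transform(bus.get())
        catalog.register("transform", message)
        knowledge_base.append(message.payload)   # load step
    return knowledge_base, catalog.records


if __name__ == "__main__":
    # Two made-up sources whose records use different key casing.
    sources = {
        "internal_db": [{"ID": 1, "Title": "run A"}],
        "external_api": [{"id": 2, "TITLE": "run B"}],
    }
    kb, records = run_pipeline(sources)
    print(kb)
    print(records)
```

Keeping the bus between extraction and transformation is the point of the sketch: stages only agree on the message format, so a source or a processing step could be swapped without touching the others.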