TMPA-2021: Modern experiment management systems architecture for scientific big data

Anastasiia Kaida and Aleksei Savelev

TMPA is an annual International Conference on Software Testing, Machine Learning and Complex Process Analysis. The conference will focus on the application of modern methods of data science to the analysis of software quality.

To learn more about Exactpro, visit our website https://exactpro.com/

Exactpro

November 26, 2021

Transcript

  1. 25-27 November. Software Testing, Machine Learning and Complex Process Analysis. Modern experiment management systems architecture for scientific big data. Anastasiia Kaida, Aleksei Savelev, National Research Tomsk Polytechnic University, ayk13@tpu.ru
  2. The problem
    • Modern scientific experiments have heterogeneous infrastructure
    • The digital environment of an experiment grows in an unpredictable way
    • Changes in the research team's aims and objectives
    • Building a stable and flexible dataflow
    • Same issues for modern projects but different solutions
    • Exponential growth of data
    • Expansion of cooperation through the creation of scientific collaborations and interdisciplinary research groups
  3. Experiment management systems
    • ZOO (1996)
    • Brain Mapper (2000)
    • Taverna (2005)
    • Triana (2005) – Grid-based
    • Expo (2014)
    • Weevil (2011)
    • OMF (2009)
    • ATLAS DKB (2019)
  4. Big Data Ecosystem
    • Data integration subsystems
    • Heterogeneous data sources
    • Whole data lifecycle support
    • Unknown data value before exploratory data analysis (EDA)
    • Rapid growth of data
  5. Research background: ATLAS DKB (CERN). M.V. Golosova, V.A. Aulov, M.A. Grigorieva, A.Y. Kaida – Data Knowledge Base for the ATLAS collaboration
  6. Research background: Web-mining and opinion-mining (TPU)
    • Flexible workflow
    • Multibranched working process
    • Interdisciplinary research team
    • Need for experiment cataloging
    Savelev et al. The High-Level Overview of Social Media Content Search Engine
  7. The concept
    • Unified pattern:
      ◦ Metadata management system
      ◦ Protocol for metadata management system
      ◦ ETL-pipelines
      ◦ Knowledge base
      ◦ Internal and external heterogeneous data sources
      ◦ Message-oriented middleware
    • The main research is performed based on the Laboratory of Unstructured Data Processing at TPU
  8. A principal high-level architecture of EMS
  9. Future plans
    • Building a workflow for the Laboratory of Unstructured Data Processing
    • Making a testbed for the lab projects
    • Testing and deployment of the initial implementation
  10. Thank You! Follow TMPA on Facebook: TMPA-2021 Conference
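
The unified pattern listed on slide 7 (metadata management, ETL pipelines, a knowledge base, and message-oriented middleware) can be illustrated with a short Python sketch. This is not the authors' implementation; it only shows how the listed parts might fit together. All names here are assumptions made for the example: `MetadataRecord`, `extract`, `transform`, and `run_pipeline` are hypothetical, an in-process `queue.Queue` stands in for real message-oriented middleware, and plain dictionaries stand in for the metadata catalog and the knowledge base.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical catalog entry describing one experiment run."""
    experiment_id: str
    source: str
    stages: list = field(default_factory=list)


def extract(source_name, rows):
    # Extract: tag every raw row with the data source it came from.
    return [{"source": source_name, "raw": row} for row in rows]


def transform(messages):
    # Transform: normalize heterogeneous inputs to stripped, lower-case text.
    return [{**m, "text": str(m["raw"]).strip().lower()} for m in messages]


def run_pipeline(experiment_id, source_name, rows, bus, knowledge_base, catalog):
    """Run one extract-transform-load pass and record its metadata."""
    record = MetadataRecord(experiment_id, source_name)

    extracted = extract(source_name, rows)
    record.stages.append("extract")

    # Hand transformed messages to the middleware stand-in.
    for message in transform(extracted):
        bus.put(message)
    record.stages.append("transform")

    # Load: drain the bus into the knowledge base, grouped by source.
    while not bus.empty():
        message = bus.get()
        knowledge_base.setdefault(message["source"], []).append(message["text"])
    record.stages.append("load")

    # Register the run in the metadata catalog.
    catalog[experiment_id] = record
    return record
```

Calling `run_pipeline("exp-1", "forum", ["  Hello ", 42], queue.Queue(), {}, {})` runs all three stages and returns the metadata record for the run. In a real deployment, the queue would presumably be replaced by a message broker and the dictionaries by persistent stores.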