OpenTalks.AI - Andrey Ustyuzhanin, Predictive Analytics: an overview of the current state and the important developments of 2019

February 21, 2020

  Predictive Analytics: Trends, Technologies, and Growth Points. A Personal View. 21 Feb, 2020, Andrey Ustyuzhanin

  Quick self-intro: Andrey Ustyuzhanin

    ▌ Head of the Laboratory of Methods for Big Data Analysis (LAMBDA) at HSE
    ▌ Head of the Yandex School of Data Analysis team at LHCb and SHiP at CERN
      › Applications of Machine Learning to natural science challenges
      › Playground for advanced methods and technologies
    ▌ Co-organizer of several data science competitions (Flavours of Physics, TrackML, IDAO)
    ▌ Education (MLHEP, ICL, Clermont-Ferrand, URL Barcelona, Coursera)
    ▌ Core expertise:
      › Data analytics, simulation, generative models, complex optimization
    ▌ Industry predictive analytics projects with "YADRO", "MMK", "Yandex"
  Predictive analytics – how to turn data into future insights.

  See Emeli’s talk for examples

  Predictive Analytics Key Drivers

    ▌ Transition from analog to digital
      › IoT – data abundance
      › Dataism – a mindset shaped by the significance of Big Data (flows)
    ▌ Global AI race
      › AI technologies move from 'nice to have' to 'must have' for companies and governments
      › Changing nature of power
    ▌ Sustainable solutions, service personalization
      › From offline to real-time
      › "What-if … " analytics, process-oriented analytics
  Big Data Problem

    ▌ Complexity of the system
      › Data pieces are always missing
      › Noise-to-signal ratio may get high
      › No single expert knowledge
    ▌ Process-agnostic
      › Data is only part of the truth
      › Relying on part of the past, we cannot predict all future scenarios
      › Future could be a very special version of the past
  Simulation

    • add any additional rules from experts
    • build a trustworthy model of the system, including those rough edges that your data might miss
    • build and verify models with historical data
    • replay it with slightly different conditions and random variations
    • take into account unexpected interactions (think butterfly effect)
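
The "replay it with slightly different conditions and random variations" step can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not something from the talk: the system is a toy single-server queue, and the names `simulate_queue` and `replay` as well as the arrival and service rates are hypothetical.

```python
import random
import statistics

def simulate_queue(arrival_rate, service_rate, n_jobs, seed):
    """Toy single-server queue (hypothetical system): mean waiting time over n_jobs."""
    rng = random.Random(seed)
    clock = 0.0            # arrival time of the current job
    server_free_at = 0.0   # time at which the server finishes its current job
    waits = []
    for _ in range(n_jobs):
        clock += rng.expovariate(arrival_rate)      # random inter-arrival time
        start = max(clock, server_free_at)          # wait if the server is busy
        waits.append(start - clock)
        server_free_at = start + rng.expovariate(service_rate)
    return statistics.fmean(waits)

def replay(arrival_rate, service_rate, n_runs=200, n_jobs=500):
    """Replay one scenario under many random seeds; return (mean, stdev) of the results."""
    results = [simulate_queue(arrival_rate, service_rate, n_jobs, seed)
               for seed in range(n_runs)]
    return statistics.fmean(results), statistics.stdev(results)

# Baseline scenario versus a "what-if" scenario with a 20% faster server.
baseline = replay(arrival_rate=0.8, service_rate=1.0)
what_if = replay(arrival_rate=0.8, service_rate=1.2)
print("baseline mean wait: %.2f +/- %.2f" % baseline)
print("what-if  mean wait: %.2f +/- %.2f" % what_if)
```

Running both scenarios over the same set of seeds means the difference between them reflects the changed condition rather than a different random draw, and the spread across seeds gives a rough sense of how much random variation alone moves the result.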
  Simulation toolkits

    ▌ AnyLogic
    ▌ NetLogo
    ▌ FlexSim
    ▌ Simio
    ▌ Simul8
    ▌ Arena
    ▌ Salabim
  Simulation for industry – Digital Twins

    Reduced CO emission by a factor of 3.5, by CompMechLab, DOI 10.1109/ACCESS.2018.2890566
    Optimisation of:
    ▌ Design
    ▌ Logistics
    ▌ Supply chain
    ▌ New materials
    Testing of:
    ▌ Maintenance ops
    ▌ Anomalies
    ▌ "What-if" scenarios
  Simulation in Science (Particle Physics)

    ▌ Toolkits:
      › Pythia
      › GEANT4
    ▌ Applications:
      › Rare events simulation
      › Background process
      › Tuning of software
      › Design of hardware
  Interesting Questions to Explore

    ▌ Simulation speed-up by either solving ODEs or approximating routine simulator calls by Neural Nets, arXiv:1812.01319v2
    ▌ Transfer Learning for Machine Fault Diagnosis
    ▌ Optimisation of computationally expensive hardware design
    ▌ Tuning of heavy simulators to match historical data
    ▌ Simulation of realistic anomalies
    ▌ Fast simulation of physics process by neural networks
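
The idea of speeding up simulation by approximating routine simulator calls can be sketched in a few lines. Note the substitution: the slide names neural networks, while this sketch uses plain piecewise-linear interpolation over precomputed simulator outputs as a dependency-free stand-in for a learned surrogate. `slow_simulator`, `InterpolatedSurrogate`, and the input range are hypothetical and chosen only for illustration.

```python
import bisect
import math

def slow_simulator(x):
    """Hypothetical stand-in for an expensive simulator call.

    Numerically integrates exp(-t) * cos(3t) from 0 to x with the midpoint rule.
    """
    steps = 5000
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(-t) * math.cos(3.0 * t) * h
    return total

class InterpolatedSurrogate:
    """Cheap approximation built from a fixed budget of simulator calls."""

    def __init__(self, simulator, lo, hi, n_points):
        self.xs = [lo + (hi - lo) * i / (n_points - 1) for i in range(n_points)]
        self.ys = [simulator(x) for x in self.xs]   # the only expensive part

    def __call__(self, x):
        # Clamp to the sampled range: the surrogate is not meant to extrapolate.
        x = min(max(x, self.xs[0]), self.xs[-1])
        i = min(max(bisect.bisect_right(self.xs, x), 1), len(self.xs) - 1)
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

surrogate = InterpolatedSurrogate(slow_simulator, 0.0, 5.0, 101)

# Compare the surrogate with the real simulator on inputs it has not seen.
test_points = [0.37, 1.91, 4.44]
max_error = max(abs(surrogate(x) - slow_simulator(x)) for x in test_points)
print("max abs error on held-out inputs:", max_error)
```

The trade-off is the same as with a neural surrogate: an up-front budget of expensive calls buys cheap evaluations later, and the approximation should be checked on held-out inputs and trusted only inside the sampled range.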
  Key technologies towards Prescriptive Analytics

    › Simulation, Simulation tuning, Simulation speed-up
    › Transfer Learning (from another domain, from prior knowledge)
    › NLP (Process Mining, Network Analysis)
    › Extrapolating models, causality
    › Interpretability, uncertainty modelling
    › Few-shot learning (see talk of Sergey Bartunov)
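
Simulation tuning, one of the technologies listed above, can be illustrated as fitting a simulator parameter so that its output matches historical data. This is a minimal sketch under stated assumptions: the cooling simulator, the synthetic "historical" series, the noise level, and the grid search are all hypothetical choices for illustration, and a heavy real-world simulator would typically need a more sample-efficient optimiser.

```python
import random

def simulate_cooling(rate, n_steps, start_temp=90.0, ambient=20.0):
    """Hypothetical simulator: discrete cooling towards ambient with an unknown rate."""
    temps = []
    temp = start_temp
    for _ in range(n_steps):
        temp -= rate * (temp - ambient)
        temps.append(temp)
    return temps

def mismatch(rate, observed):
    """Sum of squared differences between simulated and observed series."""
    simulated = simulate_cooling(rate, len(observed))
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))

def tune_rate(observed, lo=0.01, hi=0.5, n_candidates=491):
    """Grid search: pick the candidate rate whose simulation best matches the data."""
    candidates = [lo + (hi - lo) * i / (n_candidates - 1) for i in range(n_candidates)]
    return min(candidates, key=lambda r: mismatch(r, observed))

# Synthetic "historical" data: the simulator at a known rate plus measurement noise.
rng = random.Random(0)
true_rate = 0.12
historical = [t + rng.gauss(0.0, 0.5) for t in simulate_cooling(true_rate, 40)]

fitted_rate = tune_rate(historical)
print("true rate:", true_rate, "fitted rate:", round(fitted_rate, 3))
```

Because the data here is generated from the simulator itself, the fit recovers the known rate; with real historical data the remaining mismatch also indicates how much of the system the simulator fails to capture.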
  Conclusion

    ▌ Predictive Analytics (PA) is a multifaceted and ubiquitous technology
    ▌ Data-driven PA is not enough
    ▌ PA is going to expand and adopt a variety of promising new technologies
    ▌ … while pushing AI quite far:
      › Man-in-the-loop learning
      › Advanced simulation
      › AI "scientist"
      › Few-shot learning
    ▌ Scientific collaborations (e.g. with CERN, SKA) serve as a great testbed for future industry cases

    anaderiRu@twitter