Harnessing the Complexity of Mobile Network Data with Smart Monitoring

MACSPro'2019 - Modeling and Analysis of Complex Systems and Processes, Vienna
21 - 23 March 2019

Alexander Suleikin, Peter Panfilov

Conference website http://macspro.club/

Website https://exactpro.com/
Linkedin https://www.linkedin.com/company/exactpro-systems-llc
Instagram https://www.instagram.com/exactpro/
Twitter https://twitter.com/exactpro
Facebook https://www.facebook.com/exactpro/
Youtube Channel https://www.youtube.com/c/exactprosystems

Exactpro

March 23, 2019

Transcript

  1. Harnessing the Complexity of Cellular Network
    Data with Smart Monitoring
    Alexander Suleykin,
    Peter Panfilov
    Higher School of Economics, Moscow, 2019
    www.hse.ru

  2. Complexity Problem: User Domain.
    Overview of mobile users worldwide
    Number of mobile users worldwide between 2010 and 2020 (in millions)

  3. Complexity Problem: Data Communication Domain.
    High speed data streams
    • Core and Radio network high-speed data interaction
    • Constant data streaming with zero-downtime infrastructure
    • For 1 million active users: more than 0.25 Gb/s data speed for 10 core network protocols
    • Need for real-time, distributed, fault-tolerant and reliable data processing
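    As a rough back-of-the-envelope reading of the data rate quoted above, the short calculation below converts it into a per-user rate and a daily volume. It assumes that "Gb/s" means gigabits per second in decimal units and that the rate is sustained around the clock; neither assumption is stated on the slide.

```python
# Back-of-the-envelope sizing for the stream quoted on the slide.
# Assumptions (not from the slide): "Gb/s" means gigabits per second,
# decimal units (1 Gb = 1e9 bits), and the rate is sustained all day.
ACTIVE_USERS = 1_000_000
STREAM_GBIT_PER_S = 0.25

bits_per_second = STREAM_GBIT_PER_S * 1e9
per_user_bits_per_second = bits_per_second / ACTIVE_USERS
bytes_per_day = bits_per_second / 8 * 86_400      # 86 400 seconds in a day
terabytes_per_day = bytes_per_day / 1e12

print(f"per-user rate: {per_user_bits_per_second:.0f} bit/s")
print(f"daily volume:  {terabytes_per_day:.1f} TB")
```

    Under these assumptions the per-user rate is small, but the aggregate volume is large enough that it has to be processed as a stream rather than stored first and analysed later.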

  4. Complexity Problem: Technology Domain
    Protocols and technology difference
    • Different data transfer technologies (2G, 3G, 4G, 5G)
    • Packet-switched and circuit-switched data transmission
    • Data and metadata transmission
    • Different vendors with many solutions and data compression/decompression techniques
    • Result: many protocols for the Core and Radio networks within each technology

  5. Complexity Problem: Network Infrastructure Domain
    Network Infrastructure elements complexity

  6. Smart Monitoring System
    SMS vision relies on the use of modern ICT (digitization) to efficiently
    manage and maximize the utility of network infrastructures in order to
    improve the QoS and network performance
    In SMS projects, data from sensors monitoring the system state are
    used to drive computations that in turn can dynamically adapt the
    monitoring process as the complex system evolves.

  7. Smart Monitoring System for Cellular Network
    The development of an SMS for cellular network (CN) data entails the
    ability to dynamically incorporate more accurate information for
    network control purposes by obtaining real-time
    measurements from the network meters, various kinds of
    sensors, base stations, and other sources, including models.

  8. Smart Monitoring System for Cellular Network
    The highly distributed nature of data sources in CNs dictates the
    extensive use of distributed computing infrastructures, while data
    complexity requires sophisticated means of data
    dimensionality reduction, data understanding, knowledge discovery
    and decision support.
    These are now covered by the Big Data R&D area.

  9. Big Data Methods and Techniques
    Big Data for CN Monitoring
    Diagram labels: MapReduce; MPP; In-Memory Computing; Message-Oriented Middleware; Programming Languages; Big Data Tools & Applications; Lambda-Driven AF; CN Monitoring; Powerful hardware

  10. Big Data in CN Operational Planning
    Why has the Big Data enabled approach not yet been applied?
    • Fragmentation of departments’ goals
    • The lack of experience
    • The complexity of streaming network data
    • The complexity of architectural framework
    • Infrastructure Cost
    • Unclear Use Cases

  11. Big Data Driven Framework for CN's SMS
    How to make the SMS smarter?
    • DDDAS: the Dynamic Data Driven Application Systems concept
    • Lambda: a distributed computing architecture for Big Data apps
    • Spark: a distributed computing infrastructure for machine learning

  12. DDDAS Paradigm in Smart Modeling/Measurement
    Application System
    How to make the SMS smarter?
    The Dynamic Data Driven Application Systems (DDDAS) concept
    entails the ability to dynamically incorporate data into an executing
    application simulation and, in reverse, the ability of applications to
    dynamically steer measurement processes.
    Such dynamic data inputs can be acquired online in real time, or they
    can be archival data.
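    The two-way coupling described here can be illustrated with a toy loop. This is only a sketch under invented assumptions: the "simulation" is a simple exponentially smoothed estimate, and the function names, the step-shaped signal and all parameters are hypothetical, not taken from the DDDAS literature or from the presented system.

```python
def run_dddas_loop(read_sensor, steps=50, alpha=0.5, tolerance=1.0):
    """Toy DDDAS loop: measurements update the model, the model steers sampling."""
    estimate = read_sensor(0)      # model state, initialised from a first reading
    interval = 4                   # steps between measurements (steered below)
    next_sample = interval
    history = []                   # (time step, interval chosen after that sample)
    for t in range(1, steps + 1):
        if t < next_sample:
            continue               # between samples the model keeps its estimate
        measurement = read_sensor(t)
        error = abs(measurement - estimate)
        # Data -> application: fold the new measurement into the running model.
        estimate += alpha * (measurement - estimate)
        # Application -> measurement: sample every step while the model is
        # surprised, and back off (up to 8 steps) while it tracks the system.
        interval = 1 if error > tolerance else min(interval * 2, 8)
        next_sample = t + interval
        history.append((t, interval))
    return estimate, history


def step_load(t):
    """Hypothetical monitored quantity: jumps from 10 to 20 at t = 25."""
    return 10.0 if t < 25 else 20.0


estimate, history = run_dddas_loop(step_load)
print(f"final estimate: {estimate:.2f}")
print("sampling intervals over time:", [interval for _, interval in history])
```

    The sampling interval shrinks right after the jump and relaxes again once the estimate has caught up, which is the "application steers the measurement process" half of the concept.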

  14. DDDAS Paradigm in Smart Modeling/Measurement
    Application System
    How to make the SMS smarter?
    The DDDAS concept offers the promise of improving modeling
    methods, augmenting the analysis and prediction capabilities of
    application simulations, improving the efficiency of
    simulations and the effectiveness of measurement systems.
    Source: Darema, F. (2004). “Dynamic Data Driven Applications Systems: A New Paradigm for Application
    Simulations and Measurements.” International Conference on Computational Science.

  15. DDDAS R&D International Cooperative Efforts
    How to make the SMS smarter?
    Advances and technology capabilities required and enabled through the
    DDDAS concept are fostered through the DDDAS Program announced in
    2005, with seeding efforts in the area having started previously (2000 – 2005)
    through the NSF ITR Program.
    The DDDAS Program was co-sponsored by multiple Directorates and Offices
    of NSF, NOAA and NIH, in cooperation with Programs in the European
    Community and the United Kingdom. Over 30 DDDAS-related projects were
    supported by NSF grants in the 2005 competition.
    See more at: http://www.dddas.org/index.html

  16. DDDAS R&D International Cooperative Efforts
    How to make the SMS smarter?
    DDDAS has been widely adopted for problems such
    as supply chain systems, control and operational planning
    of microgrids, and control of aerospace vehicles.
    Source: Fujimoto, Richard M. et al. “Dynamic data driven application systems for smart cities and urban infrastructures.” 2016 Winter
    Simulation Conference (WSC) (2016): 1143-1157

  17. DDDAS R&D International Cooperative Efforts
    How to make the SMS smarter?
    DDDAS has been widely adopted for problems such
    as supply chain systems, control and operational planning
    of microgrids, and control of aerospace vehicles.
    One example is the Instrumented Oil-Field DDDAS
    project, which has enabled a new generation of data-driven,
    interactive and dynamically adaptive strategies for
    subsurface characterization and oil reservoir management.

  18. DDDAS in Smart Power Grids
    Self-configuring adaptive simulation. Demand plays a vital role in operational planning and control of the power grid. DDDAS addresses the issue by feeding real-time data from smart meters into the simulation model to adapt the model to changes in the real system.
    Multi-agent modeling and multi-objectivity. Power grids depend upon many interacting dynamic systems. Through two-way communication, each customer has information about the time-varying prices and demand profile of the power grid. Each customer may behave according to its own objectives. Therefore, in order to obtain a globally optimal solution, each customer should be modeled as an agent in the simulation model.
    Modular modeling. In a simulation-based planning and control framework for complex systems, computational efficiency is necessary so as not to disrupt a dynamically changing system. Intelligent modeling techniques should be developed to make these simulations efficient.
    Source: Fujimoto, Richard M. et al. “Dynamic data driven
    application systems for smart cities and urban infrastructures.” 2016
    Winter Simulation Conference (WSC) (2016): 1143-1157

  19. DDDAS in Smart Telerobotic Surgery System
    Source: Cardullo, F.M., Lewis, H.W., III, and Panfilov, P.B. (2006). Building TelePresence Framework for Performing Robotic Surgical
    Procedures. 9th Annual International Workshop on Presence (PRESENCE 2006), Cleveland, Aug. 24-26, 2006, 106-115.

  20. DDDAS as an Adaptive Control Application System
    In the proposed telepresence framework the real-time simulator acts
    as a predictor, providing information to the surgeon consistent with
    the no-delay situation. Clearly, dynamics models for both the robot
    and the organ are necessary for the simulator to
    function in this way.
    The image preprocessor portion is the essential corrector.
    The intelligent controller is designed as an invigilator.
    The total integrated surgical telerobotics system is to behave as the human
    surgeon would if there were no performance-encumbering delay.
    Source: Cardullo, F.M., Lewis, H.W., III, and Panfilov, P.B. (2006). Building TelePresence Framework for Performing Robotic Surgical
    Procedures. 9th Annual International Workshop on Presence (PRESENCE 2006), Cleveland, Aug. 24-26, 2006, 106-115.

  21. Lambda: a Distributed Computing Architecture for Big
    Data Application Systems
    1. All data entering the system is dispatched to both the batch layer and the speed layer for processing.
    2. The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) pre-computing the batch views.
    3. The serving layer indexes the batch views so that they can be queried in a low-latency, ad hoc way.
    4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.
    5. Any incoming query can be answered by merging results from batch views and real-time views.
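    The five steps above can be sketched in a few lines with a toy in-memory counter. This is an illustration only: the class name and methods are hypothetical, the "layers" are plain Python objects rather than distributed systems, and the batch run is synchronous, which is why the real-time view can simply be reset afterwards.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda architecture that counts events per key."""

    def __init__(self):
        self.master = []                 # batch layer: append-only raw data
        self.batch_view = Counter()      # serving layer: precomputed view
        self.realtime_view = Counter()   # speed layer: recent data only

    def ingest(self, key):
        # Step 1: every record is dispatched to both batch and speed layers.
        self.master.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        # Steps 2-3: recompute the view from the whole master dataset and
        # replace the old one; the speed layer drops what is now covered.
        self.batch_view = Counter(self.master)
        self.realtime_view = Counter()

    def query(self, key):
        # Step 5: answer by merging the batch view and the real-time view.
        return self.batch_view[key] + self.realtime_view[key]


lam = LambdaCounter()
for key in ["a", "b", "a"]:
    lam.ingest(key)
before = lam.query("a")    # served from the speed layer alone
lam.run_batch()
lam.ingest("a")
after = lam.query("a")     # batch view plus one recent event
print(before, after)
```

    Step 4 shows up in the first query: before any batch run has finished, the speed layer alone keeps the answer current.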

  22. Lambda Architecture and Apache Spark
    Batch layer
    The batch layer precomputes results using a distributed processing
    system that can handle very large quantities of data. Output is typically
    stored in a read-only database, with updates completely replacing
    existing precomputed views.
    Apache Hadoop is the de facto standard batch-processing system
    used in most high-throughput architectures.

  23. Software Infrastructure for Distributed Applications
    Design

  24. Apache Spark Ecosystem for Big Data Application
    Systems

  27. The Proposed Big Data Driven Smart Monitoring
    Framework for the Cellular Network Data: a Concept

  28. The Proposed Big Data Driven SMS Framework for the
    Cellular Network Data: an Implementation

  29. Big Data tools for CNSMS. Part 1
    1. Cellular Network (CN)
       Radio Subsystem: GERAN, UTRAN, E-RAN
       Core Subsystem: MSC, MME, SGSN, Other Nodes
    2. Cellular Network Probes
       Vendor Specific Probes: Data Aggregations, Data Enrichment, Geo-Positioning

  30. Big Data tools for CNSMS. Part 2
    3. Big Data enabled real-time parsing for Cellular compressed data
       Data Parsing applications: High-performance applications
    4. Message-Oriented Middleware
       Lambda-Driven Architectural principles: Many data consumers

  31. Big Data tools for CNSMS. Part 3
    5. Big Data Storage and queries
       Big Data Storage: Reliable distributed storage, Big Data SQL, Schema on Read
    6. Other data sources
       Available data sources: In-memory DBs, NoSQL DBs, SQL DBs
    7. Real-time and offline data models
       Different model types: Reliable, high performance models

  32. Big Data tools for CNSMS. Part 4
    9. Decision makers
       Decision makers: Radio Subsystem, CS and PS Core, Marketing, Revenue Assurance
    10. External environment
       External data consumers: Geo-reports, Advertisement campaigns

  33. Practical Use Case
    Project Description
    • Near real-time analysis of roaming users
    • 3G and 4G networks, MAP and Diameter protocols respectively
    • Streaming data
    • The goal is to keep only users with specific VLR and HLR values, which indicate that these users have left the country
    • After filtering, the data are available in the MoM layer of the framework for as many data consumers as needed
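    The filtering rule for this use case might look like the sketch below. The slides do not give the actual VLR/HLR values or record layout, so the dictionary keys, the prefix lists and the sample numbers are all hypothetical; the assumed rule is "home HLR, but a VLR outside the home country".

```python
# Hypothetical prefixes: the real VLR/HLR value lists are not given in the slides.
HOME_HLR_PREFIXES = ("7",)   # HLR numbers that belong to the home operator
HOME_VLR_PREFIXES = ("7",)   # VLR numbers located inside the home country


def is_outbound_roamer(event):
    """True if a home subscriber is currently served by a foreign VLR."""
    hlr = event.get("hlr", "")
    vlr = event.get("vlr", "")
    return (
        bool(vlr)
        and hlr.startswith(HOME_HLR_PREFIXES)
        and not vlr.startswith(HOME_VLR_PREFIXES)
    )


def filter_roamers(events):
    """Lazily yield only the events of subscribers who have left the country."""
    return (event for event in events if is_outbound_roamer(event))


# Made-up sample events for illustration only.
events = [
    {"imsi": "250010000000001", "hlr": "79160000001", "vlr": "79160000002"},
    {"imsi": "250010000000002", "hlr": "79160000001", "vlr": "491710000001"},
    {"imsi": "250010000000003", "hlr": "79160000001", "vlr": ""},
]
roaming_imsis = [event["imsi"] for event in filter_roamers(events)]
print(roaming_imsis)
```

    Because the filter is a generator, it can be applied to an unbounded stream, and its output is what would be published to the MoM layer for downstream consumers.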

  34. Real-time model for roaming users detection example.
    Data description – GsmMap protocol
    RAW GsmMap fields - Streaming
    Field | Type | Description
    Start Time (secs) | int | Call Start Time (seconds)
    Start Time (micro secs) | int | Call Start Time (microseconds)
    End Time (secs) | int | Call End Time (seconds)
    End Time (micro secs) | int | Call End Time (microseconds)
    OPC | string | Originating Point Code
    DPC | string | Destination Point Code
    Originating TID | int | Originating TID
    LMSI | int | LMSI
    Equipment Id | smallint | The ID of the GeoProbe that created the call record, unique in the system.
    Last Message Component | string | Identifies the component type of the last TCAP message.
    Call Type | string | Indicates the call type that generated this data record. The available types are listed after this table.
    IMSI | string | IMSI
    MSISDN | string | MSISDN
    IMEI | string | The International Mobile Equipment Identifier, which uniquely identifies the mobile equipment.
    Filtered GsmMap fields
    Field | Type | Description
    End Time (secs) | int | Call End Time (seconds)
    Originating TID | int | Originating TID
    Last Message Component | string | Identifies the component type of the last TCAP message.
    Call Type | string | Indicates the call type that generated this data record. The available types are listed after this table.
    IMSI | string | IMSI
    MSISDN | string | MSISDN
    IMEI | string | The International Mobile Equipment Identifier, which uniquely identifies the mobile equipment.
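    The step from the raw record to the filtered record is a plain column projection. A minimal sketch follows, assuming each record arrives as a dictionary keyed by the field names listed on this slide; the sample values, including the call type and component names, are invented for illustration.

```python
# Field names are taken from the slide; the record layout (a dict) is assumed.
FILTERED_GSMMAP_FIELDS = (
    "End Time (secs)",
    "Originating TID",
    "Last Message Component",
    "Call Type",
    "IMSI",
    "MSISDN",
    "IMEI",
)


def project_gsmmap(record):
    """Keep only the fields that downstream consumers need."""
    return {name: record[name] for name in FILTERED_GSMMAP_FIELDS}


# Made-up raw record with all 14 fields.
raw_record = {
    "Start Time (secs)": 1553299199,
    "Start Time (micro secs)": 120000,
    "End Time (secs)": 1553299200,
    "End Time (micro secs)": 250000,
    "OPC": "1-2-3",
    "DPC": "4-5-6",
    "Originating TID": 42,
    "LMSI": 0,
    "Equipment Id": 7,
    "Last Message Component": "ReturnResult",   # hypothetical value
    "Call Type": "UpdateLocation",              # hypothetical value
    "IMSI": "250010000000002",
    "MSISDN": "79160000000",
    "IMEI": "490154203237518",
}
filtered_record = project_gsmmap(raw_record)
print(filtered_record)
```

    Dropping half of the columns before publishing reduces the volume that every downstream consumer has to read.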

  35. Real-time model for roaming users detection example.
    Data description – Diameter protocol
    RAW Diameter fields - Streaming
    Field Name | Type | Description
    starttimesecs | int | Start Time of the event (seconds in UNIX time)
    starttimeusecs | int | Start Time of the event (microseconds)
    endtimesecs | int | End Time of the event (seconds in UNIX time)
    endtimeusecs | int | End Time of the event (microseconds)
    sourceipaddress | string | Source IP Address of the first message in the first transaction that initiated the data record
    destinationipaddress | string | Destination IP Address of the first message in the first transaction that initiated the data record
    Session Status | int | The Session Record Status value
    equipmentid | int | An IP Probe ID that identifies the call record within the GeoProbe system
    imsi | string | IMSI of the mobile device
    imei | string | IMEI of the mobile device
    msisdn | string | MSISDN
    originrealm | string | Realm of the originator of the Diameter message
    originhost | string | The endpoint that originated the Diameter message
    Transaction type | string | The type of Diameter transaction; options are described below
    Filtered Diameter fields
    Field Name | Type | Description
    Timestamp | int | End Time of the event (seconds in UNIX time)
    Session Status | int | The Session Record Status value
    imsi | string | IMSI of the mobile device
    imei | string | IMEI of the mobile device
    msisdn | string | MSISDN
    originrealm | string | Realm of the originator of the Diameter message
    Transaction type | string | The type of Diameter transaction; options are described below
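    A sketch of how a raw Diameter record could be reduced to the filtered layout is given below. It assumes that the filtered Timestamp is simply the raw endtimesecs value (the two share the same description on the slide) and that records are dictionaries; the sample values are made up. A helper that combines the seconds and microseconds columns into one UTC datetime is included for readability.

```python
from datetime import datetime, timezone

# Output field -> raw field. Mapping Timestamp to endtimesecs is an assumption
# based on the matching descriptions; the slides do not state it explicitly.
DIAMETER_FIELD_MAP = {
    "Timestamp": "endtimesecs",
    "Session Status": "Session Status",
    "imsi": "imsi",
    "imei": "imei",
    "msisdn": "msisdn",
    "originrealm": "originrealm",
    "Transaction type": "Transaction type",
}


def project_diameter(raw):
    """Build the filtered record from a raw Diameter record."""
    return {out: raw[src] for out, src in DIAMETER_FIELD_MAP.items()}


def event_end(raw):
    """Combine the seconds and microseconds columns into one UTC datetime."""
    seconds = datetime.fromtimestamp(raw["endtimesecs"], tz=timezone.utc)
    return seconds.replace(microsecond=raw["endtimeusecs"])


# Made-up raw record for illustration.
raw_diameter = {
    "starttimesecs": 1553299199, "starttimeusecs": 900000,
    "endtimesecs": 1553299200, "endtimeusecs": 250000,
    "sourceipaddress": "10.0.0.1", "destinationipaddress": "10.0.0.2",
    "Session Status": 0, "equipmentid": 7,
    "imsi": "250010000000002", "imei": "490154203237518",
    "msisdn": "79160000000",
    "originrealm": "epc.example.org", "originhost": "mme1.epc.example.org",
    "Transaction type": "ULR",   # hypothetical value
}
filtered_diameter = project_diameter(raw_diameter)
print(filtered_diameter["Timestamp"], event_end(raw_diameter).isoformat())
```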

  36. Real-time model for roaming users detection example.
    Common Experimental setup
    Common Experimental parameters
    • Real cellular network with about 60 000 events per second for both protocols
    • Spark application and YARN resource manager
    • Source and destination data app: Kafka as the MoM layer representation
    • Hortonworks Spark installation

  37. Real-time model for roaming users detection example.
    Spark and YARN Experimental setup
    Spark and YARN Experimental parameters
    • 3 nodes for YARN allocated with 918 GB memory;
    • Percentage of physical CPU allocated for all containers on a node is 80%;
    • 1 second interval between jobs.

  38. Real-time model for roaming users detection example.
    Results
    Result messages in Kafka - Diameter
    • Average Scheduling Delay is 14 ms, Average Processing Time is 464 ms and Total Delay is 478 ms.
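    The delay figures reported on this slide fit together arithmetically, as the short check below shows. It combines them with the 1-second job interval and the rate of about 60 000 events per second from the experimental setup. The per-event figure is only a rough estimate, since that rate is quoted for both protocols together, and the variable names are purely illustrative.

```python
# Figures from the slides; variable names are illustrative only.
BATCH_INTERVAL_MS = 1000      # "1 second interval between jobs"
SCHEDULING_DELAY_MS = 14
PROCESSING_TIME_MS = 464
EVENTS_PER_SECOND = 60_000    # approximate, both protocols together

# Total delay of a micro-batch is its waiting time plus its processing time.
total_delay_ms = SCHEDULING_DELAY_MS + PROCESSING_TIME_MS

# A streaming job keeps up only if each batch finishes before the next starts.
utilisation = PROCESSING_TIME_MS / BATCH_INTERVAL_MS
is_stable = PROCESSING_TIME_MS < BATCH_INTERVAL_MS

events_per_batch = EVENTS_PER_SECOND * BATCH_INTERVAL_MS // 1000
per_event_us = PROCESSING_TIME_MS * 1000 / events_per_batch

print(f"total delay: {total_delay_ms} ms")
print(f"batch interval used: {utilisation:.1%}, stable: {is_stable}")
print(f"about {per_event_us:.1f} microseconds of processing per event")
```

    Since less than half of each interval is spent processing, the job has headroom for traffic bursts before batches would start to queue.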

  39. Experimental Results
    Average service delay (ASD) for batch processing is much longer
    than the Spark delay. Apache Spark Streaming runs its jobs with only
    a 0.015-second delay, while traditional batch processing has a
    0.5-second delay.
    Average processing time (APT) shows that the same amount of
    data might be processed in 45-second intervals by batch processing, while Spark
    Streaming processes data in 0.464-second intervals.
    This is achieved because Spark jobs run every second, and the data
    processing is fast, in-memory and efficient.
    Average interval time (AIT) between Spark jobs is 1 second,
    while the interval between batch jobs is usually 1 minute.
    Batch processing cannot run faster because of the overhead before
    a job starts. Each job start takes additional resources and
    time, and this overhead is larger for batch processing
    than for streaming.
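    To put the three comparisons side by side, the ratios between the batch and streaming figures quoted on this slide can be computed directly. The dictionary layout is just a convenient way to hold the numbers; nothing here is measured anew.

```python
# Figures quoted on the slide, in seconds.
metrics = {
    "average service delay": {"spark": 0.015, "batch": 0.5},
    "average processing time": {"spark": 0.464, "batch": 45.0},
    "average interval time": {"spark": 1.0, "batch": 60.0},
}

# How many times larger each batch figure is than the streaming one.
ratios = {name: v["batch"] / v["spark"] for name, v in metrics.items()}

for name, ratio in ratios.items():
    print(f"{name}: batch is about {ratio:.0f}x the streaming value")
```

    On these figures, batch processing is roughly one to two orders of magnitude slower on every metric.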

  40. Future Tasks
    • Research different use cases
    • Research and create a real-time Big Data cellular network monitoring “hub” based on an Apache Spark application
    • Research ETL Big Data tools for cellular network monitoring
    • Research other possible data sources for different models
    • Research different tools for application monitoring

  41. Thank you for your attention!