
Streaming position events at Port of Rotterdam

Processing position events at the Port of Rotterdam.
An overview of the systems involved and basics of the technology

Pieter van der Meer

November 15, 2018

Transcript

  1. Streaming position events at Port of Rotterdam
    Pieter van der Meer / [email protected]
    15 November 2018

  2. Introduction
    • Pieter van der Meer

    • Data engineer @ Dataworkz

    • Projects:

    • Aramis connection

    • NextGen

    • K8S environment

  3. Streaming position events
    Why?
    • EOL FTP connection

    • Business requirements:

    • See ship movements

    • Counts

    • And many more

    • Government requirements on reports

    • For example CO2 emissions

    • More future-proof and adaptable

  4. The challenge
    Build a system that:
    • Can process 450 unique position events per second

    • Mixed Radar/AIS data

    • GPS jitter

    • Four nines uptime (99.99%)

    • Provided by two mirrored servers

    • Existing system

    • Store for a minimum of 5 years

    • Enrich data with details from other systems

    • Preference for real time
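
To put the requirements above into numbers, here is a small back-of-the-envelope calculation. It only uses the figures from the slide (450 events per second, 5 years of retention, 99.99% uptime); the 365-day year and the constant event rate are simplifying assumptions.

```python
# Back-of-the-envelope sizing from the stated requirements.
# Assumptions: a constant event rate and a 365-day year.
EVENTS_PER_SECOND = 450
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365
RETENTION_YEARS = 5
UPTIME = 0.9999  # "four nines"

events_per_day = EVENTS_PER_SECOND * SECONDS_PER_DAY
events_retained = events_per_day * DAYS_PER_YEAR * RETENTION_YEARS

minutes_per_year = DAYS_PER_YEAR * 24 * 60
downtime_minutes_per_year = round((1 - UPTIME) * minutes_per_year, 2)

print(f"events per day:         {events_per_day:,}")
print(f"events kept for 5 yrs:  {events_retained:,}")
print(f"allowed downtime/year:  {downtime_minutes_per_year} minutes")
```

Under these assumptions the system handles roughly 39 million events per day, keeps about 71 billion events over five years, and may be unavailable for less than an hour per year.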

  5. Position events
    • Two types:

    • AIS Position reports

    • Radar pings

  6. AIS
    The automatic identification system (AIS) is an
    automatic tracking system that uses transponders on
    ships and is used by vessel traffic services (VTS).
    When satellites are used to detect AIS signatures, the
    term Satellite-AIS (S-AIS) is used. AIS information
    supplements marine radar, which continues to be the
    primary method of collision avoidance for water
    transport.
    Source: Wikipedia

  7. AIS

  8. Radar positions Rotterdam

  9. GPS Jitter (1)
    • Spoiler: It depends

    • Position:

    • Open sky: approx. 5 m

    • Urban area: up to 100 m

    • Speed

    • approx: < 0.006 m (from specs)

    • Actual values are a lot worse

  10. GPS Jitter (2)

  11. Kalman filter
    In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is
    an algorithm that uses a series of measurements observed over time, containing statistical noise and
    other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than
    those based on a single measurement alone, by estimating a joint probability distribution over the
    variables for each timeframe. The filter is named after Rudolf E. Kálmán, one of the primary developers
    of its theory.
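
As an illustration of the definition above, here is a minimal sketch of a one-dimensional Kalman filter that smooths a noisy series of readings of a (nearly) constant value. It is not the filter used in the project: the function name `kalman_1d`, the variance parameters, and the sample readings are all assumptions chosen for the example. A filter for ship tracks would typically model position and velocity together.

```python
def kalman_1d(measurements, meas_var, process_var, init_var=1.0):
    """Scalar Kalman filter for a slowly varying value.

    measurements: noisy readings, in order of arrival
    meas_var:     assumed variance of the measurement noise
    process_var:  assumed variance added per step by the process itself
    init_var:     assumed uncertainty of the initial estimate
    """
    estimate = measurements[0]  # start from the first reading
    variance = init_var
    estimates = []
    for z in measurements:
        # Predict: the value is assumed to stay the same, uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + meas_var)
        estimate = estimate + gain * (z - estimate)
        variance = (1.0 - gain) * variance
        estimates.append(estimate)
    return estimates


# Hypothetical readings scattered around a true value of 10.0.
readings = [10.5, 9.4, 10.3, 9.8, 10.6, 9.5, 10.2, 9.7]
smoothed = kalman_1d(readings, meas_var=0.25, process_var=1e-4)
for raw, est in zip(readings, smoothed):
    print(f"raw={raw:5.2f}  filtered={est:6.3f}")
```

After a few readings the filtered values stay much closer to the underlying value than the individual measurements do, which is the property that makes the technique useful against GPS jitter.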

  12. The Kalman filter

  13. The Kalman filter
    Latest prediction

  14. Four nines
    • Utilize Apache Kafka

    • Partitioned topics

    • Replication

    • Retention

    • Kubernetes

    • Replicas
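
The sketch below illustrates the idea behind partitioned topics: a message key is hashed to pick a partition, so all messages with the same key land in the same partition and keep their relative order. Everything specific here is an assumption made for the example: the slides do not say which key is used (a vessel identifier is a plausible choice), the partition count is made up, and CRC32 merely stands in for the hash function used by a real Kafka client.

```python
import zlib

NUM_PARTITIONS = 6  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition index.

    CRC32 is an illustrative stand-in; real Kafka clients use their own hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical vessel identifiers used as message keys.
for vessel_id in ["vessel-001", "vessel-002", "vessel-001"]:
    print(vessel_id, "-> partition", partition_for(vessel_id))
```

Because the mapping is deterministic, a consumer that owns a partition sees every event for the keys assigned to it, which keeps per-key processing such as track building on a single consumer.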

  15. Two mirrored servers
    • Multiple servers

    • Deduplication

    • Failover executed by Kubernetes

    • MaxSurge

    • MaxUnavailable

    • Replicas
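
Since both mirrored servers deliver the same events, each event arrives twice and one copy has to be dropped. The following is a minimal sketch of one way to do that, not the project's implementation: the `Deduplicator` class, the event fields `track_id` and `timestamp`, and the size of the memory window are all hypothetical.

```python
from collections import OrderedDict


class Deduplicator:
    """Drops events whose key was seen recently (bounded memory)."""

    def __init__(self, max_keys=100_000):
        self._seen = OrderedDict()  # insertion-ordered set of recent keys
        self._max_keys = max_keys

    def is_new(self, event):
        # Assumption: track id plus timestamp uniquely identifies an event.
        key = (event["track_id"], event["timestamp"])
        if key in self._seen:
            return False
        self._seen[key] = None
        if len(self._seen) > self._max_keys:
            self._seen.popitem(last=False)  # forget the oldest key
        return True


# Hypothetical events: server B mirrors server A.
server_a = [{"track_id": 1, "timestamp": 100}, {"track_id": 2, "timestamp": 100}]
server_b = [{"track_id": 1, "timestamp": 100}, {"track_id": 2, "timestamp": 100}]

dedup = Deduplicator()
unique = [e for e in server_a + server_b if dedup.is_new(e)]
print(unique)
```

The bounded window keeps memory use constant; it assumes the two copies of an event arrive close together, so the first copy is still remembered when the second one shows up.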

  16. Producer and Deduplication

  17. Existing system
    • Ended up being fairly easy

    • Convert new format to old

    • Transform units

    • Select fields

    • Explain differences

    • Some small differences

    • Lat/Long to Rijksdriehoek

  18. Store for 5 Years
    • Using a time series database

    • Riak TS

    • Custom keys for queries

    • Scalable

    • Allows for replay

  19. Enrich data
    • Main source, Harbour Masters

    • Details on:

    • Depth

    • Berths visited

    • Cargo

  20. Preference for real time
    • Source to time series DB: less than 1 second

    • Tracks: estimated average delay of 10-20 seconds

    • Aggregations: max. 10 minutes

    • Emissions: max. 10 minutes
    In contrast, the current system runs on a weekly basis.

  21. Aggregation / API
    • Combine data

    • Attach additional data to cleaned track

    • Allow for query on:

    • Location (bounding box)

    • Ship name

    • Continuously updated
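
The two query types listed above can be sketched as a simple in-memory filter. This is only an illustration of the query semantics, not the project's API: the `Position` type, the `query` function, the ship names, and the coordinates are hypothetical, and the bounding-box check assumes the box does not cross the 180° meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    ship_name: str
    lat: float  # degrees north
    lon: float  # degrees east


def in_bounding_box(p, south, west, north, east):
    """True if the position lies inside the box (edges included)."""
    return south <= p.lat <= north and west <= p.lon <= east


def query(positions, bbox=None, ship_name=None):
    """Filter positions by an optional bounding box and/or ship name."""
    result = []
    for p in positions:
        if bbox is not None and not in_bounding_box(p, *bbox):
            continue
        if ship_name is not None and p.ship_name != ship_name:
            continue
        result.append(p)
    return result


# Hypothetical positions; the box roughly covers the Rotterdam harbour area.
positions = [
    Position("Alpha", 51.95, 4.05),
    Position("Bravo", 51.90, 4.48),
    Position("Charlie", 53.20, 5.80),
]
harbour_box = (51.85, 3.90, 52.00, 4.60)  # south, west, north, east

print([p.ship_name for p in query(positions, bbox=harbour_box)])
print(query(positions, ship_name="Bravo"))
```

A production API would push these filters down into the data store rather than scanning a list, but the selection logic stays the same.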

  22. System overview

  23. The challenge
    Build a system that:
    • Approx. 450 unique position events per second

    • Mixed Radar/AIS data

    • GPS jitter

    • Four nines uptime (99.99%)

    • Provided by two mirrored servers

    • Existing system

    • Enrich data with details from other systems

    • Preference for real time

    • Provide common API

  24. Thank you.
    Pieter van der Meer
    Questions?
    /pevandermeer
