
Streaming position events at Port of Rotterdam

Processing position events at the Port of Rotterdam.
An overview of the systems involved and the basics of the technology.

Pieter van der Meer

November 15, 2018

Transcript

  1. Introduction
     • Pieter van der Meer
     • Data engineer @ Dataworkz
     • Projects:
       • Aramis connection
       • NextGen
       • K8S environment
  2. Streaming position events: Why?
     • EOL FTP connection
     • Business requirements:
       • See ship movements
       • Counts
       • And many more
     • Government requirements on reports
       • For example CO2 emissions
     • More future-proof/adaptable
  3. The challenge
     Build a system that:
     • Can process 450 unique position events per second
     • Mixed Radar/AIS data
     • GPS jitter
     • Four nines uptime (99.99%)
     • Provided by two mirrored servers
     • Existing system
     • Store for a minimum of 5 years
     • Enrich data with details from other systems
     • Preference for real time
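To put the numbers on this slide in perspective, the arithmetic below is a rough sketch of what 450 events per second, 5 years of retention and four nines of uptime imply. It assumes a constant event rate and 365-day years; the function names are illustrative and do not come from the talk.

```python
# Back-of-the-envelope sizing for the requirements on the slide.
# Assumptions (not from the talk): constant event rate, 365-day years.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365
MINUTES_PER_YEAR = DAYS_PER_YEAR * 24 * 60


def events_per_day(events_per_second: int) -> int:
    """Number of position events received in one day."""
    return events_per_second * SECONDS_PER_DAY


def events_retained(events_per_second: int, years: int) -> int:
    """Number of position events kept for the full retention period."""
    return events_per_day(events_per_second) * DAYS_PER_YEAR * years


def allowed_downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget per year for a given availability (0.9999 = four nines)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    print(events_per_day(450))                        # 38880000
    print(events_retained(450, 5))                    # 70956000000
    print(allowed_downtime_minutes_per_year(0.9999))  # roughly 52.56
```

Under these assumptions the store has to hold on the order of 71 billion events, and the uptime target leaves less than an hour of downtime per year.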
  4. AIS
     The automatic identification system (AIS) is an automatic tracking system that uses transponders on ships and is used by vessel traffic services (VTS). When satellites are used to detect AIS signatures, the term Satellite-AIS (S-AIS) is used. AIS information supplements marine radar, which continues to be the primary method of collision avoidance for water transport. (Source: Wikipedia)
  5. AIS

  6. GPS Jitter (1)
     • Spoiler: It depends
     • Position:
       • Open sky: approx. 5 m
       • Urban area: up to 100 m
     • Speed:
       • Approx. < 0.006 m (from specs)
     • Actual values are a lot worse
  7. Kalman filter
     In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, one of the primary developers of its theory.
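The definition above can be made concrete with a minimal one-dimensional sketch. It assumes a scalar state that is roughly constant between measurements (for example one coordinate of a slow-moving vessel); the talk does not say which state model or noise values the production system uses, so `process_var` and `measurement_var` are hypothetical tuning parameters.

```python
from typing import List, Sequence


def kalman_1d(
    measurements: Sequence[float],
    process_var: float,
    measurement_var: float,
    initial_var: float = 1.0,
) -> List[float]:
    """Smooth a noisy scalar series with a constant-state Kalman filter.

    process_var: how much the true value is expected to drift per step.
    measurement_var: variance of the sensor noise (e.g. GPS jitter).
    """
    if not measurements:
        return []

    estimate = measurements[0]  # start from the first observation
    variance = initial_var
    smoothed = []

    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        variance += process_var

        # Update: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain

        smoothed.append(estimate)

    return smoothed


if __name__ == "__main__":
    noisy = [10.2, 9.7, 10.4, 9.9, 10.1, 14.0, 10.0]  # one jitter spike
    print(kalman_1d(noisy, process_var=0.01, measurement_var=4.0))
```

A larger `measurement_var` makes the filter trust individual readings less, which damps spikes at the cost of reacting more slowly to real movement.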
  8. Four nines
     • Utilize Apache Kafka
       • Partitioned topics
       • Replication
       • Retention
     • Kubernetes
       • Replicas
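As an illustration of why partitioned topics matter here, the sketch below shows how a message key maps to a partition. It is a simplified stand-in: Kafka's default partitioner hashes keyed records with murmur2, whereas this example uses CRC32 from the standard library, and using a vessel identifier as the key is an assumption, not something stated in the talk.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Stand-in for Kafka's key-based partitioning (which uses murmur2);
    CRC32 is used here only to keep the example dependency-free.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    # Hypothetical vessel identifiers used as message keys.
    for vessel_id in ["244660000", "244660001", "244660000"]:
        print(vessel_id, "->", partition_for(vessel_id, 12))
```

Because the same key always lands on the same partition, all events for one vessel stay in order relative to each other, while different vessels can be processed in parallel.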
  9. Two mirrored servers
     • Multiple servers
     • Deduplication
     • Failover executed by Kubernetes
       • MaxSurge
       • MaxUnavailable
       • Replicas
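Deduplication of two mirrored feeds could look roughly like the sketch below. It assumes that duplicates carry the same vessel identifier and timestamp, and it bounds memory by remembering only the most recently seen keys; the field names `vessel_id` and `timestamp` are hypothetical, and the talk does not describe the actual mechanism.

```python
from collections import OrderedDict
from typing import Dict, Iterable, Iterator


def deduplicate(events: Iterable[Dict], max_keys: int = 100_000) -> Iterator[Dict]:
    """Yield each event once, dropping repeats seen from the mirror server.

    Keeps at most max_keys recent (vessel_id, timestamp) pairs in memory,
    evicting the least recently seen key first.
    """
    seen: "OrderedDict[tuple, None]" = OrderedDict()

    for event in events:
        key = (event["vessel_id"], event["timestamp"])  # hypothetical fields

        if key in seen:
            seen.move_to_end(key)  # refresh recency, drop the duplicate
            continue

        seen[key] = None
        if len(seen) > max_keys:
            seen.popitem(last=False)  # evict the oldest key

        yield event


if __name__ == "__main__":
    merged_feed = [
        {"vessel_id": "A", "timestamp": 1, "source": "server-1"},
        {"vessel_id": "A", "timestamp": 1, "source": "server-2"},  # mirror copy
        {"vessel_id": "B", "timestamp": 1, "source": "server-1"},
    ]
    for event in deduplicate(merged_feed):
        print(event)
```

The bounded window is a trade-off: a duplicate that arrives after its key has been evicted would pass through, so `max_keys` should cover the worst expected lag between the two servers.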
  10. Existing system
     • Ended up being fairly easy
     • Convert new format to old
       • Transform units
       • Select fields
     • Explain differences
       • Some small differences
       • Lat/Long to Rijksdriehoek
  11. Store for 5 years
     • Using a time series database
       • Riak TS
       • Custom keys for queries
       • Scalable
     • Allows for replay
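One common way to build custom keys for time series queries is to group rows by an identifier plus a fixed time bucket, so that a time-range query only touches a known set of keys. The sketch below illustrates that idea in plain Python; the key layout, the 15-minute bucket and the function names are assumptions for illustration and are not the schema used in the talk.

```python
from typing import List, Tuple

Key = Tuple[str, int]


def bucket_key(vessel_id: str, epoch_seconds: int, quantum_seconds: int = 900) -> Key:
    """Return (vessel_id, bucket_start) for the bucket containing the timestamp."""
    bucket_start = epoch_seconds - (epoch_seconds % quantum_seconds)
    return (vessel_id, bucket_start)


def keys_for_range(
    vessel_id: str, start: int, end: int, quantum_seconds: int = 900
) -> List[Key]:
    """List every bucket key that a query over [start, end] has to read."""
    if end < start:
        return []
    first_bucket = start - (start % quantum_seconds)
    return [
        (vessel_id, bucket_start)
        for bucket_start in range(first_bucket, end + 1, quantum_seconds)
    ]


if __name__ == "__main__":
    print(bucket_key("vessel-1", 1000))            # ('vessel-1', 900)
    print(keys_for_range("vessel-1", 1000, 2000))  # two buckets: 900 and 1800
```

The same key function can be reused for replay: iterating the bucket keys in order returns the stored events in time order per vessel.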
  12. Enrich data
     • Main source: Harbour Masters
     • Details on:
       • Depth
       • Berths visited
       • Cargo
  13. Preference for real time
     • Source to time series DB: less than 1 second
     • Tracks: estimated average delay of 10-20 seconds
     • Aggregations: max. 10 minutes
     • Emissions: max. 10 minutes
     In contrast, the current system runs on a weekly basis.
  14. Aggregation / API
     • Combine data
     • Attach additional data to cleaned track
     • Allow for query on:
       • Location, bounding box
       • Ship name
     • Continuously updated
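A bounding-box query of the kind listed above can be sketched as a simple filter over positions. This is an in-memory illustration only: the record fields `lat` and `lon`, the `BoundingBox` type and the coordinates are hypothetical, and the box is assumed not to cross the antimeridian, which holds for a harbour area.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned latitude/longitude box (degrees)."""

    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        """True if the point lies inside the box, edges included."""
        return (
            self.min_lat <= lat <= self.max_lat
            and self.min_lon <= lon <= self.max_lon
        )


def query_positions(positions: Iterable[Dict], box: BoundingBox) -> List[Dict]:
    """Return the positions that fall inside the bounding box."""
    return [p for p in positions if box.contains(p["lat"], p["lon"])]


if __name__ == "__main__":
    # Illustrative box around the Rotterdam harbour area.
    harbour = BoundingBox(min_lat=51.85, min_lon=3.90, max_lat=52.05, max_lon=4.60)
    positions = [
        {"ship": "inside", "lat": 51.95, "lon": 4.14},
        {"ship": "outside", "lat": 52.37, "lon": 4.90},
    ]
    print(query_positions(positions, harbour))
```

A real API would push this predicate down to the data store rather than filter in application code, but the containment test itself stays the same.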
  15. The challenge
     Build a system that:
     • Approx. 450 unique position events per second
     • Mixed Radar/AIS data
     • GPS jitter
     • Four nines uptime (99.99%)
     • Provided by two mirrored servers
     • Existing system
     • Enrich data with details from other systems
     • Preference for real time
     • Provide common API