analysis
- Preventative maintenance
- Supply chain logistics

Capital Markets
- Investing with alternative data
- Net asset valuation
- Portfolio performance risk

Defense & Intelligence
- Geospatial intelligence (GEOINT)
- Pattern of life analysis
- Battlespace information dominance

Telecommunications
- Network reliability analysis
- Location-enabled services
- Field service tracking

Utilities
- Smart meter analysis
- Grid reliability analysis
- Preventative maintenance

Other
- Oil & gas well log analysis
- Pharmaceutical clinical trial analysis
- Fleet telematics analysis
- Logistics telematics analysis
Considerations:
- Hundreds of API feeds conforming to the GBFS specification: https://github.com/NABSA/gbfs
- Each feed provides a relatively small amount of info as JSON; it needs pre-processing before loading into OmniSci
- Feeds have different TTL values; we want to be respectful when polling API endpoints
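The pre-processing and TTL considerations can be sketched as follows. This is a minimal illustration, assuming a GBFS `station_status` payload with the top-level `last_updated`, `ttl`, and `data` fields defined in the specification. The helper names `flatten_station_status` and `next_poll_time`, the `system_id` label, and the 10-second minimum interval are hypothetical choices for this example, not part of the original pipeline.

```python
from typing import Any, Dict, List

MIN_POLL_SECONDS = 10  # assumed floor so a ttl of 0 never causes a tight polling loop


def flatten_station_status(system_id: str, feed: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one nested GBFS station_status payload into flat, table-like rows."""
    rows = []
    for station in feed.get("data", {}).get("stations", []):
        rows.append({
            "system_id": system_id,  # hypothetical label identifying the source feed
            "station_id": station.get("station_id"),
            "num_bikes_available": station.get("num_bikes_available"),
            "num_docks_available": station.get("num_docks_available"),
            "last_reported": station.get("last_reported"),
            "feed_last_updated": feed.get("last_updated"),
        })
    return rows


def next_poll_time(feed: Dict[str, Any], fetched_at: float) -> float:
    """Earliest time (epoch seconds) at which this feed should be requested again."""
    ttl = feed.get("ttl") or 0
    return fetched_at + max(int(ttl), MIN_POLL_SECONDS)


# Made-up sample payload for illustration only
sample = {
    "last_updated": 1560000000,
    "ttl": 60,
    "data": {"stations": [{
        "station_id": "72",
        "num_bikes_available": 5,
        "num_docks_available": 10,
        "last_reported": 1559999990,
    }]},
}
rows = flatten_station_status("demo_system", sample)
```

Scheduling each feed by its own `ttl` means fast-changing feeds are polled often while slow ones are left alone, which keeps request volume per endpoint in line with what the publisher advertises.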
Using Azure HDInsight, we can set up a managed Apache Kafka cluster
- Kafka serves several purposes: aggregating feeds into a single stream and buffering for more consistent throughput
- StreamSets is a data pipeline tool, provided as an option during HDInsight setup
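To illustrate the aggregation step, here is a rough sketch of publishing rows from many feeds to one shared topic. The topic name `gbfs_station_status`, the function `publish_rows`, and the `system_id` field are hypothetical. The `send` argument is injected so the sketch runs without a cluster; with a client such as kafka-python, it could plausibly be a `KafkaProducer.send` bound method, though that wiring is an assumption and is not shown in the original.

```python
import json
from typing import Any, Callable, Dict, Iterable

TOPIC = "gbfs_station_status"  # hypothetical topic name


def publish_rows(send: Callable[..., Any], rows: Iterable[Dict[str, Any]]) -> int:
    """Publish rows from any feed to one shared topic, keyed by bike share system."""
    count = 0
    for row in rows:
        # Keying by system keeps each system's records on one partition, in order
        key = str(row["system_id"]).encode("utf-8")
        value = json.dumps(row, sort_keys=True).encode("utf-8")
        send(TOPIC, key=key, value=value)
        count += 1
    return count


# Stand-in for a real producer: collect messages in a list
sent = []
published = publish_rows(
    lambda topic, key, value: sent.append((topic, key, value)),
    [
        {"system_id": "system_a", "station_id": "1"},
        {"system_id": "system_b", "station_id": "2"},
    ],
)
```

Because every feed lands in the same topic, downstream consumers see one stream regardless of how many endpoints are polled, and the broker absorbs bursts when many feeds refresh at once.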
With feeds aggregated into a single Kafka topic, ingest to OmniSci via JDBC
- OmniSci supports data streaming directly from Kafka, but using StreamSets allows for additional transformation
- Using JDBC along with StreamSets also lets StreamSets manage retries and Kafka offsets
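The value of having one tool manage both retries and offsets can be shown with a conceptual sketch of the underlying pattern: write a batch, retry on failure, and commit the Kafka offset only after the write succeeds. This is not StreamSets' implementation. The function `deliver_batch` and the injected `write_rows` and `commit_offset` callables are hypothetical stand-ins for a JDBC write and an offset commit.

```python
import time
from typing import Any, Callable, List, Sequence, Tuple

Record = Tuple[int, Any]  # (Kafka offset, row payload)


def deliver_batch(
    batch: Sequence[Record],
    write_rows: Callable[[List[Any]], None],
    commit_offset: Callable[[int], None],
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> bool:
    """Write a batch to the database, committing the offset only on success."""
    if not batch:
        return True
    rows = [payload for _, payload in batch]
    for attempt in range(1, max_attempts + 1):
        try:
            write_rows(rows)
        except Exception:
            if attempt == max_attempts:
                # Offset stays uncommitted, so the batch can be re-read later
                return False
            time.sleep(backoff_seconds * attempt)
        else:
            # Kafka convention: the committed offset is the next one to consume
            commit_offset(batch[-1][0] + 1)
            return True
    return False


# Usage with a writer that fails once, then succeeds
attempts = {"n": 0}
written: List[Any] = []
committed: List[int] = []


def flaky_write(rows: List[Any]) -> None:
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("transient database error")
    written.extend(rows)


ok = deliver_batch([(5, "row-a"), (6, "row-b")], flaky_write, committed.append)
```

Committing after the write gives at-least-once delivery: a crash between the write and the commit can replay a batch, but it never silently drops one.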
to hour-width.
- Moving the slider crossfilters the entire dashboard
- Clicking on a bike share location crossfilters the entire dashboard
- With the massive compute power of GPUs, data is available to the dashboard as soon as it is ingested; no indexing or other backend operations are needed
Hardware Reference Guide: https://www.omnisci.com/docs/latest/4_hardware_configuration_guide.html
- For best performance, OmniSci recommends attaching additional Premium SSD(s) to the GPU-enabled VM solely for OmniSci read/write
- The number of GPUs required depends on how much data needs to be kept “hot” for a given workload
- More GPUs = More GPU RAM = More data caching
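As a back-of-the-envelope illustration of the GPU sizing point, the arithmetic might look like the sketch below. Every number here is hypothetical: the row count, the 24 bytes per row of hot columns, the 16 GiB of RAM per GPU, and the 80% usable fraction are placeholders rather than figures from the hardware guide, and `gpus_needed` is an illustrative helper, not an OmniSci tool.

```python
import math


def gpus_needed(
    row_count: int,
    hot_bytes_per_row: int,
    gpu_ram_gib: float,
    usable_fraction: float = 0.8,  # assumed share of GPU RAM available for cached data
) -> int:
    """Estimate how many GPUs are needed to hold the hot data in GPU RAM."""
    hot_bytes = row_count * hot_bytes_per_row
    usable_bytes_per_gpu = gpu_ram_gib * 1024**3 * usable_fraction
    return max(1, math.ceil(hot_bytes / usable_bytes_per_gpu))


# Hypothetical workload: 1 billion rows, 24 bytes of hot columns per row, 16 GiB GPUs
estimate = gpus_needed(1_000_000_000, 24, 16)
```

With these placeholder inputs the hot data is about 24 GB against roughly 13.7 GB usable per GPU, so the estimate comes out to 2 GPUs. Real sizing should follow the hardware guide and measured column widths.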