Upgrade to Pro — share decks privately, control downloads, hide ads and more …

SPARQL-to-SQL on Internet of Things Databases a...

Eugene Siow
September 28, 2017

SPARQL-to-SQL on Internet of Things Databases and Streams

To realise a semantic Web of Things, the challenge of achieving efficient Resource Description Format (RDF) storage and SPARQL query performance on Internet of Things (IoT) devices with limited resources has to be addressed. State-of-the-art SPARQL-to-SQL engines have been shown to outperform RDF stores on some benchmarks. In this paper, we describe an optimisation to the SPARQL-to-SQL approach, based on a study of time-series IoT data structures, that employs metadata abstraction and efficient translation by reusing existing SPARQL engines to produce Linked Data `just-in-time'. We evaluate our approach against RDF stores, state-of-the-art SPARQL-to-SQL engines and streaming SPARQL engines, in the context of IoT data and scenarios. We show that storage efficiency, with succinct row storage, and query performance can be improved from 2 times to 3 orders of magnitude.

Presentation at International Semantic Web Conference 2016

Eugene Siow

September 28, 2017
Tweet

More Decks by Eugene Siow

Other Decks in Programming

Transcript

  1. Sensor DATA FROM MULTIPLE SOURCES “The Internet of Things is

    currently beset by product silos.” W3C Web of Things Interest Group CURRENT STATE OF THE INTERNET OF THINGS PRODUCT & DATA SILOS DEPENDENCY ON THE CLOUD LIMITED INTEROPERABILITY OF APPS
  2. INTRODUCING LINKED DATA FOR INTEROPERABILITY URI and ontologies Establish common

    data structures & References ENABLES RICH METADATA what, where, WHEN, HOW of DATA Integrates with Linked open data cyber, social, physical LOD on web Barnaghi, P., Wang, W.: Semantics for the Internet of Things: early progress and back to the future. International Journal on Semantic Web and Information Systems (2012) http://thing.io/1 is a http://ont/weather_sensor CLASS produces http://thing.io/obs/1 http://ont/temp_observation is a 13.0 has value CLASS ℃ unit located at http://thing.io/loc/1 latitude longitude -1.41 50.9
  3. LINKED DATA ON DISTRIBUTED LIGHTWEIGHT COMPUTERS Tiered levels of applications

    between the “Ground” and “Cloud” for Meteorological Data Irrigation Application Query on Rainfall Meteorological Station on Lightweight Computer Data Stream Environmental Sensors GROUND Weather Data State Inclement Weather Planning Application SRBench: National Weather and Disaster Monitoring CLOUD Broadcast Queries
  4. PERFORMANCE STORES DON’T SCALE & PERFORM WELL ON WEB YET

    Buil-Aranda, C., Hogan, A.: SPARQL Web-Querying Infrastructure: Ready for Action? ISWC 2013 CHALLENGES RESOURCE CONSTRAINED DEVICES ~500mhz CPU, 512mb ram, SD CARD historical and streaming data Need storage and query efficiency ON BOTH
  5. THE SHAPE OF IOT TIME-SERIES DATA { timestamp : 1467673132,

    temperature : { max: 22.0, min: 15.0, current: 17.0, error: { percentage: 5.0 } } } FLAT { timestamp : 1467673132, temperature : 32.0, wind_speed : 10.5, pressure : 1016 } COMPLEX 20k UNIQUE DEVICES dweet.io 99.5% FLAT SCHEMATA 0.5% COMPLEX SCHEMATA 1 2,3 4 5 6+ Width { timestamp : 1467673132, temperature : 32.0, humidity : 10.5, pressure : 1016, light: 120.0, } 1 2 3 4
  6. EFFICIENT QUERIES WITH TIME-SERIES DATA THING TEMPERATURE OBS HUMIDITY OBS

    WIND SPEED OBS 13.0 2016-01-01 06:00:00 CELCIUS 93.0 2016-01-01 06:00:00 PERCENT 10.5 2016-01-01 06:00:00 MPH LOCATION produces produces located produces has value unit time RDF GRAPH
  7. OBSERVATION DATA OBSERVATION METADATA SENSOR METADATA THING TEMPERATURE OBS HUMIDITY

    OBS WIND SPEED OBS 13.0 LOCATION produces produces located produces has value THING THING THING TEMPERATURE OBS time TEMPERATURE OBS 2016-01-01 06:00:00 unit TEMPERATURE OBS celcius 93.0 has value HUMIDITY OBS time HUMIDITY OBS 2016-01-01 06:00:00 unit HUMIDITY OBS PERCENT 10.5 has value WIND SPEED OBS time WIND SPEED OBS 2016-01-01 06:00:00 unit WIND SPEED OBS MPH EFFICIENT QUERIES WITH TIME-SERIES DATA RDF TRIPLES
  8. OUR APPROACH EFFICIENT QUERIES WITH TIME-SERIES DATA THING TEMPERATURE OBS

    WIND SPEED OBS CELCIUS PERCENT MPH LOCATION produces located HUMIDITY OBS unit TEMPERATURE HUMIDITY WIND SPEED 13.0 93.0 10.5 TIME 2016-01-01 06:00:00
  9. SENSOR METADATA OBSERVATION DATA OUR APPROACH EFFICIENT QUERIES WITH TIME-SERIES

    DATA THING TEMPERATURE OBS WIND SPEED OBS CELCIUS PERCENT MPH LOCATION produces located HUMIDITY OBS unit TEMPERATURE HUMIDITY WIND SPEED 13.0 93.0 10.5 TIME 2016-01-01 06:00:00 OBSERVATION METADATA
  10. DESIGNING OUR ENGINE THING TEMPERATURE OBS WIND SPEED OBS CELCIUS

    PERCENT MPH LOCATION produces located HUMIDITY OBS unit TEMPERATURE HUMIDITY WINDSPEED 13.0 93.0 10.5 TIME 2016-01-01 06:00:00 Table1 TABLE1.TEMPERATURE has value has value TABLE1.HUMIDITY has value TABLE1.WINDSPEED
  11. DESIGNING OUR ENGINE THING TEMPERATURE OBS WIND SPEED OBS CELCIUS

    PERCENT MPH LOCATION produces located HUMIDITY OBS unit TEMPERATURE HUMIDITY WINDSPEED 13.0 93.0 10.5 TIME 2016-01-01 06:00:00 Table1 TABLE1.TEMPERATURE has value has value TABLE1.HUMIDITY has value TABLE1.WINDSPEED
  12. DESIGNING OUR ENGINE THING TEMPERATURE OBS CELCIUS PERCENT produces loc

    HUMIDITY OBS unit TEMPERATURE HUMID 13.0 93.0 TIME 2016-01-01 06:00:00 TABLE1.TEMPERATURE has value has va TABLE1.H MAX( ) ?TEMPERATURE SELECT ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom { } (max ( )) ?TEMPERATURE ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom BGP
  13. DESIGNING OUR ENGINE TEMPERATURE OBS CELCIUS TEMPERATURE 13.0 TABLE1.TEMPERATURE has

    value MAX( ) ?TEMPERATURE SELECT ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom { } (max ( )) ?TEMPERATURE ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom BGP
  14. DESIGNING OUR ENGINE MAX( ) ?TEMPERATURE SELECT ?OBS TEMPERATURE OBS

    a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom { } MAX( ) ?TEMPERATURE ?OBS ?TEMPERATURE ?uom TABLE1.TEMPERATURE CELCIUS NODE_TEMP (max ( )) ?TEMPERATURE ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom BGP
  15. DESIGNING OUR ENGINE MAX( ) ?TEMPERATURE SELECT ?OBS TEMPERATURE OBS

    a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom { } MAX( ) TEMPERATURE (max ( )) ?TEMPERATURE ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom BGP FROM TABLE1
  16. SPARQL DESIGNING OUR ENGINE MAX( ) ?TEMPERATURE SELECT ?OBS TEMPERATURE

    OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom { } SQL SELECT MAX( ) TEMPERATURE FROM TABLE1
  17. LOW VISIBILITY STATIONS EXAMPLE ?OBS VISIBILITY OBS a has value

    ?OBS ?VISIBILITY has unit ?OBS ?uom } { FILTER( ,<10 ) ?VISIBILITY UNION ?OBS RAINFALL OBS a has value ?OBS ?RAINFALL has unit ?OBS ?uom } { FILTER( ,>30 ) ?RAINFALL ?SENSOR SELECT { } UNION ?OBS SNOWFALL OBS a has value ?OBS ?SNOWFALL has unit ?OBS ?uom } { ?OBS SNOWFALL OBS a has value ?OBS ?SNOWFALL has unit ?OBS ?uom BGP UNION UNION FILTER( ,<10 ) ?VISIBILITY FILTER( ,>30 ) ?RAINFALL ?OBS VISIBILITY OBS a has value ?OBS ?VISIBILITY has unit ?OBS ?uom BGP ?OBS RAINFALL OBS a has value ?OBS ?RAINFALL has unit ?OBS ?uom BGP FILTER( ,TRUE) ?SNOWFALL
  18. LOW VISIBILITY STATIONS EXAMPLE ?OBS VISIBILITY OBS a has value

    ?OBS ?VISIBILITY has unit ?OBS ?uom } { FILTER( ,<10 ) ?VISIBILITY UNION ?OBS RAINFALL OBS a has value ?OBS ?RAINFALL has unit ?OBS ?uom } { FILTER( ,>30 ) ?RAINFALL ?SENSOR SELECT } UNION ?OBS SNOWFALL OBS a has value ?OBS ?SNOWFALL has unit ?OBS ?uom } { UNION UNION FILTER( ,<10 ) ?VISIBILITY FILTER( ,>30 ) ?RAINFALL ?OBS ?snowfall ?uom TABLE1.snowfall BOOL NODE_SNOW ?OBS ?VISIBILITY ?uom TABLE1.VISIBILITY cm NODE_vis ?OBS ?rainfall ?uom TABLE1.rainfall cm NODE_RAIN { FILTER( ,TRUE) ?SNOWFALL
  19. LOW VISIBILITY STATIONS EXAMPLE ?OBS VISIBILITY OBS a has value

    ?OBS ?VISIBILITY has unit ?OBS ?uom } { FILTER( ,<10 ) ?VISIBILITY UNION ?OBS RAINFALL OBS a has value ?OBS ?RAINFALL has unit ?OBS ?uom } { FILTER( ,>30 ) ?RAINFALL ?SENSOR SELECT { } UNION ?OBS SNOWFALL OBS a has value ?OBS ?SNOWFALL has unit ?OBS ?uom } { UNION UNION FILTER( ,<10 ) VISIBILITY FILTER( ,>30 ) RAINFALL FROM TABLE1 FROM TABLE1 FROM TABLE1 FILTER( ,TRUE) SNOWFALL
  20. LOW VISIBILITY STATIONS EXAMPLE ?OBS VISIBILITY OBS a has value

    ?OBS ?VISIBILITY has unit ?OBS ?uom } { FILTER( ,<10 ) ?VISIBILITY UNION ?OBS RAINFALL OBS a has value ?OBS ?RAINFALL has unit ?OBS ?uom } { FILTER( ,>30 ) ?RAINFALL ?SENSOR SELECT { } UNION ?OBS SNOWFALL OBS a has value ?OBS ?SNOWFALL has unit ?OBS ?uom } { UNION FROM TABLE1 WHERE RAINFALL>30 SELECT SENSOR FROM TABLE1 WHERE VISIBILITY<10 UNION SELECT SENSOR FROM TABLE1 FILTER( ,TRUE) SNOWFALL
  21. SQL SPARQL LOW VISIBILITY STATIONS EXAMPLE ?OBS VISIBILITY OBS a

    has value ?OBS ?VISIBILITY has unit ?OBS ?uom } { FILTER( ,<10 ) ?VISIBILITY UNION ?OBS RAINFALL OBS a has value ?OBS ?RAINFALL has unit ?OBS ?uom } { FILTER( ,>30 ) ?RAINFALL ?SENSOR SELECT { } UNION ?OBS SNOWFALL OBS a has value ?OBS ?SNOWFALL has unit ?OBS ?uom } { SELECT UNION WHERE RAINFALL>30 WHERE VISIBILITY<10 UNION SENSOR FROM SELECT SENSOR FROM TABLE1 WHERE SNOWFALL==TRUE ( ) SELECT SENSOR FROM TABLE1 SELECT SENSOR FROM TABLE1
  22. HOW MAPPINGS WORK SPARQL2SQL Mapping S2SML MAPPING TABLE1.TEMPERATURE has value

    THING TEMPERATURE OBS WIND SPEED OBS CELCIUS PERCENT MPH LOCATION produces located HUMIDITY OBS unit TABLE1.humidity has value TABLE1.WINDSPEED has value ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom BGP SPARQL ?OBS TEMPERATURE OBS a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom SELECT * WHERE { } JENA SESAME ?? INTERFACE
  23. S2SML MAPPING WRITTEN IN RDF REFLECTS THE RDF IT REPRESENTS

    DIRECTLY SUPPORTS COLLAPSING NODES EFFICIENT QUERIES WITH IOT TIME-SERIES DATA R2rML COMPATIBLE CAN BE TRANSLATED TO AND FROM R2RML SUBJECT object predicate IRImap IRI BNODE IRI FNODE IRImap IRI BNODE FNODE LITERAL LITERALMAP
  24. S2SML MAPPING Faux NODE HUMIDITY OBS TABLE1.COL1 hasValue LITERALMAP <http://knoesis.wright.edu/ssw/obs/{readings.uuid}>

    if (HUMIDITY OBS is projected): if(readings.uuid !exists): create_col(readings.uuid) return IRImap(readings.uuid)
  25. RDF STREAM PROCESSING sparql2stream Same engine and mappings but translates

    to EPL instead of SQL TRANSLATE QUERY 2 Stream Window SPARQL query specifying stream window size REGISTER QUERY 1 Stream Sockets Supports multiple platforms and streams with ZeroMQ STREAM DATA 3 Real-time analytics PIOTRe & Smart Home Freeboard demo github.com/eugenesiow/iotwo RECEIVE PUSH RESULTS 4
  26. SPARQL2stream RSP QUERY MAX( ) ?TEMPERATURE SELECT ?OBS TEMPERATURE OBS

    a has value ?OBS ?TEMPERATURE has unit ?OBS ?uom { } FROM Named Stream Obsstream Range 1h EPL SELECT MAX( ) TEMPERATURE FROM Weather.win:time(1 hour) Event Processing Language
  27. EVALUATION WITH BENCHMARKS SRBENCH ~20,000 Stations 100 – 300k triples

    Wind, Rainfall, etc. 10 SRBench Queries Zhang, Y, et al. (2012) "SRBench: a streaming RDF/SPARQL benchmark.”The 11th International Semantic Web Conference. SMART HOME BENCH Siow, E., Tiropanis, T., Hall, W. (2016). "Interoperable and Efficient: Linked Data for the Internet of Things." The 3rd International Conference on Internet Science. 3 months, 1 home ~30k triples Motion, energy, environment 4 Analytics Queries GraphDB (OWLIM) Ontop Our Approach (S2S) TDB G Morph O S M T Raspberry Pi 2 Model B+ 1GB RAM, 900MHz Quad Core ARM Cortex A7, Class 10 SD Cards
  28. STORAGE SIZE 3ook Hurricane Ike 1ook NEVADA BLIZZARD 3ok SMART

    HOME OUR APPROACH (s2S) TDB x15 x68 x112 GraphDB x9 x1352 x453
  29. Get the rainfall observed in a particular hour from all

    stations 01 02 SRBENCH QUERY RESULTS Q01 with an optional clause on unit of measure x5 S2S S TDB GraphDB Ontop Morph x3 x13 x4k x2 x4 x4 x5k
  30. 03 04 05 Detect if a hurricane has been observed

    Get the average wind speed at the stations where the air temperature is >32 Join between wind observation and temperature observation subtrees time-consuming in low resource environment (Raspberry Pi) Detect if a station is observing a blizzard x3 x6 x6 x88 x3 x3
  31. 06 07 08 Get the stations with extremely low visibility

    Detect stations that are recently broken Get the daily minimal and maximal air temperature observed by the sensor at a given location x2 x14 x4 x6 x6 x5 x2
  32. 09 10 Get the daily average wind force and direction

    observed by the sensor at a given location Get the locations where a heavy snowfall has been observed Our Approach (s2s) is shown to be faster on all queries in the Distributed Meteorological System with SRBench Join between wind force and wind direction observation subtrees is time-consuming in low resource environment (Raspberry Pi) x3 x3k x2 x7
  33. Temperature aggregated by hour on a specified day 01 02

    SMART HOME RESULTS Minimum and maximum temperature each day for a particular month S2S TDB GraphDB x7 x29 x3 x9
  34. 03 04 Energy Usage Per Room By Day Diagnose unattended

    appliances consuming energy with no motion in room Our Approach (s2s) is shown, once again, to be faster on all queries for Smart Home Analytics Involves motion and meter data (much larger set), with space-time aggregations and joins between motion and meter tables/subgraphs. Involves meter data (larger set), with space-time aggregations. x69 x13 x4
  35. STREAM PROCESSING EFFICIENCY 1 2 3 4 5 7 8

    9 10 SRBENCH 294 261 306 277k 3243k 5245 426 280k 98 Le-Phuoc, D., et al. (2011) "A native and adaptive approach for unified processing of linked streams and linked data.” The 10th International Semantic Web Conference. CQELS Performance Improvement For IoT Data Over SMART HOME 196 2 1 167 xImprovement Query
  36. STREAM PROCESSING SCALABILITY VELOCITY >99% <1ms latency increasing from 1

    to 1000 rows/ms VOLUME 33.5million rows, projected ~2.5 billion triples! <1ms 10-100ms 1 2 5 10 100 1000 99% 100% Rate in rows/ms Percentage Latency in ms Bands
  37. PERSONAL IOT REPOSITORY Siow, E., Tiropanis, T. and Hall, W.

    (2016) PIOTRe: Personal Internet of Things Repository: The 15th International Semantic Web Conference P&D github.com/eugenesiow/piotre sparql2stream sparql2sql github.com/eugenesiow/sparql2sql PIOTRE Apps sparql2stream sparql2sql Metadata
  38. “It's a long road, it's a long and narrow way.

    If I can't work up to you, you'll surely have to work down to me someday.” Narrow Way by Bob Dylan eugene_siow Travel Supported By: LINKED DATA FOR INTEROPERABILITY A rich model to describe things and integrate connected thing’s data OPTIMISED PERFORMANCE FOR STREAMS AND HISTORICAL DATA FROM 2 to 3 orders of magnitude performance improvement