Slide 1

Slide 1 text

INTEROPERABLE & EFFICIENT LINKED DATA FOR THE INTERNET OF THINGS
EUGENE SIOW, THANASSIS TIROPANIS, WENDY HALL

Slide 2

Slide 2 text

“NOTHING IS WORKING.” “UNPLUG WHAT?” “EVERYTHING IS INSIDE THE WALLS.”

Slide 3

Slide 3 text

CURRENT STATE OF THE INTERNET OF THINGS
“The Internet of Things is currently beset by product silos.” W3C Web of Things Interest Group
PRODUCT & DATA SILOS
DEPENDENCY ON THE CLOUD
PERFORMANCE OF APPLICATIONS & ANALYTICS
DEVICES & DATA: SECURITY, PRIVACY, LOCALITY?

Slide 4

Slide 4 text

DATA OWNERSHIP & PRIVACY WITH LIGHTWEIGHT COMPUTERS A Smart Home Scenario implementing a Personal IoT Repository Smart Home Dashboard Personal IoT Repository Environmental Sensors Energy Meters Data Stream Energy Saving Analytics Stream & Historical Queries Motion Sensors Data ownership Own and store your data at home Less Cloud ENCRYPTION BETTER PERFORMANCE SPECIFIC POLICIES/CONTROL ONLINE/OFFLINE, TRUST, ACCESS CONTROL

Slide 5

Slide 5 text

DATA LOCALITY WITH LIGHTWEIGHT COMPUTERS A Distributed Meteorological Scenario, minimising cloud dependency for Storage and Processing Irrigation Application Soil Moisture Analytics Environmental Sensors Lightweight Computer Hub Data Stream Weather Data State Inclement Weather Planning Application National Disaster Monitoring Application Cloud

Slide 6

Slide 6 text

INTRODUCING LINKED DATA FOR INTEROPERABILITY
URI and ontologies: establish common data structures & references
http://thing.io/1 is a http://ont/weather_sensor (CLASS)
http://thing.io/1 produces http://thing.io/obs/1
http://thing.io/obs/1 is a http://ont/temp_observation (CLASS)
http://thing.io/obs/1 has value 13.0
http://thing.io/obs/1 unit ℃
ENABLES RICH METADATA: what, where, when, how of data
http://thing.io/1 located at http://thing.io/loc/1
http://thing.io/loc/1 latitude 50.9
http://thing.io/loc/1 longitude -1.41
PERFORMANCE CHALLENGES: STORES DON’T SCALE & PERFORM WELL ON WEB YET
Buil-Aranda, C., Hogan, A.: SPARQL Web-Querying Infrastructure: Ready for Action? ISWC 2013

Slide 7

Slide 7 text

THE SHAPE OF IOT TIME-SERIES DATA
FLAT: { timestamp : 1467673132, temperature : 32.0, wind_speed : 10.5, pressure : 1016 }
COMPLEX: { timestamp : 1467673132, temperature : { max: 22.0, min: 15.0, current: 17.0, error: { percentage: 5.0 } } }
WIDTH (fields numbered 1 2 3 4 5): { timestamp : 1467673132, temperature : 32.0, wind_speed : 10.5, pressure : 1016, precipitation: 0, humidity: 93.0 }
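
The flat/complex/width distinction on this slide can be sketched in a few lines of Python. This is only an illustration: the helper names `is_flat` and `width` are hypothetical, and it assumes that width counts every field except the timestamp, which matches the 1 to 5 numbering on the slide.

```python
def is_flat(schema: dict) -> bool:
    """A schema is flat when no field holds a nested object or array."""
    return not any(isinstance(value, (dict, list)) for value in schema.values())


def width(schema: dict) -> int:
    """Assumed definition: number of top-level fields other than the timestamp."""
    return sum(1 for key in schema if key != "timestamp")


flat_reading = {
    "timestamp": 1467673132, "temperature": 32.0,
    "wind_speed": 10.5, "pressure": 1016,
}
complex_reading = {
    "timestamp": 1467673132,
    "temperature": {"max": 22.0, "min": 15.0, "current": 17.0,
                    "error": {"percentage": 5.0}},
}
wide_reading = {
    "timestamp": 1467673132, "temperature": 32.0, "wind_speed": 10.5,
    "pressure": 1016, "precipitation": 0, "humidity": 93.0,
}

print(is_flat(flat_reading), is_flat(complex_reading), width(wide_reading))
```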

Slide 8

Slide 8 text

THE SHAPE OF IOT TIME-SERIES DATA
dweet.io: 20k UNIQUE DEVICES
18.5k NON-EMPTY SCHEMATA (92.3%)
18k FLAT SCHEMATA (99.5%)
92 COMPLEX SCHEMATA (0.5%)
[Chart: schemata by width, with bins 1; 2,3; 4; 5; 6+]

Slide 9

Slide 9 text

OPTIMISING FOR TIME-SERIES DATA: RDF GRAPH
THING produces TEMPERATURE OBS (has value 13.0, unit CELSIUS, time 2016-01-01 06:00:00)
THING produces HUMIDITY OBS (has value 93.0, unit PERCENT, time 2016-01-01 06:00:00)
THING produces WIND SPEED OBS (has value 10.5, unit MPH, time 2016-01-01 06:00:00)
THING located LOCATION

Slide 10

Slide 10 text

OPTIMISING FOR TIME-SERIES DATA: RDF TRIPLES
THING produces TEMPERATURE OBS
THING produces HUMIDITY OBS
THING produces WIND SPEED OBS
THING located LOCATION
TEMPERATURE OBS has value 13.0
TEMPERATURE OBS time 2016-01-01 06:00:00
TEMPERATURE OBS unit CELSIUS
HUMIDITY OBS has value 93.0
HUMIDITY OBS time 2016-01-01 06:00:00
HUMIDITY OBS unit PERCENT
WIND SPEED OBS has value 10.5
WIND SPEED OBS time 2016-01-01 06:00:00
WIND SPEED OBS unit MPH
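
To see why this representation is verbose for time-series data, the following sketch expands a single reading into the triples listed on this slide. The function name `row_to_triples` and the plain-string node labels are assumptions made for illustration; a real store would use URIs.

```python
def row_to_triples(thing, location, row, units):
    """Expand one timestamped reading into (subject, predicate, object) triples."""
    triples = [(thing, "located", location)]
    for field, unit in units.items():
        obs = f"{field.upper()} OBS"  # one observation node per measured field
        triples.append((thing, "produces", obs))
        triples.append((obs, "has value", row[field]))
        triples.append((obs, "time", row["time"]))
        triples.append((obs, "unit", unit))
    return triples


row = {"time": "2016-01-01 06:00:00",
       "temperature": 13.0, "humidity": 93.0, "wind speed": 10.5}
units = {"temperature": "CELSIUS", "humidity": "PERCENT", "wind speed": "MPH"}

triples = row_to_triples("THING", "LOCATION", row, units)
print(len(triples))  # one location triple plus four per observation
```

Under these assumptions a single row with three measurements becomes thirteen triples, and the timestamp and unit are repeated for every observation.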

Slide 11

Slide 11 text

OPTIMISING FOR TIME-SERIES DATA: OUR APPROACH
SHARE COLUMN HEADERS
NO JOINS WITHIN ROWS
‘JUST IN TIME’ METADATA
Columns: TIME | TEMPERATURE | HUMIDITY | WIND SPEED
Row: 2016-01-01 06:00:00 | 13.0 | 93.0 | 10.5
Metadata: THING produces TEMPERATURE OBS (unit CELSIUS), HUMIDITY OBS (unit PERCENT), WIND SPEED OBS (unit MPH); THING located LOCATION
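
A minimal sketch of this layout, assuming SQLite as a stand-in for the relational store: readings go into one flat row, and the metadata is kept once per column and attached only when a value is requested. The table name `table1`, the `COLUMN_METADATA` dictionary, and the `describe` helper are hypothetical names, not taken from the authors' engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (time TEXT, temperature REAL, humidity REAL, wind_speed REAL)"
)
conn.execute(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    ("2016-01-01 06:00:00", 13.0, 93.0, 10.5),
)

# Metadata is stored once per column, not once per reading.
COLUMN_METADATA = {
    "temperature": {"observation": "TEMPERATURE OBS", "unit": "CELSIUS"},
    "humidity": {"observation": "HUMIDITY OBS", "unit": "PERCENT"},
    "wind_speed": {"observation": "WIND SPEED OBS", "unit": "MPH"},
}


def describe(column: str, time: str) -> dict:
    """Attach the shared column metadata to one stored value on request."""
    if column not in COLUMN_METADATA:  # also guards the column name used in the SQL
        raise KeyError(column)
    row = conn.execute(
        f"SELECT {column} FROM table1 WHERE time = ?", (time,)
    ).fetchone()
    return {"value": row[0], "time": time, **COLUMN_METADATA[column]}


reading = describe("humidity", "2016-01-01 06:00:00")
print(reading)
```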

Slide 12

Slide 12 text

DESIGNING OUR ENGINE
Table1 columns: TIME | TEMPERATURE | HUMIDITY | WINDSPEED
Table1 row: 2016-01-01 06:00:00 | 13.0 | 93.0 | 10.5
Mapping: THING produces TEMPERATURE OBS, HUMIDITY OBS, WIND SPEED OBS; THING located LOCATION
TEMPERATURE OBS has value TABLE1.TEMPERATURE, unit CELSIUS
HUMIDITY OBS has value TABLE1.HUMIDITY, unit PERCENT
WIND SPEED OBS has value TABLE1.WINDSPEED, unit MPH

Slide 14

Slide 14 text

DESIGNING OUR ENGINE
[Diagram, cropped on the slide: THING produces TEMPERATURE OBS (has value TABLE1.TEMPERATURE, unit CELSIUS) and HUMIDITY OBS (unit PERCENT); row TIME 2016-01-01 06:00:00 with values 13.0 and 93.0]
SELECT MAX(?TEMPERATURE) { ?OBS a TEMPERATURE OBS . ?OBS has value ?TEMPERATURE . ?OBS has unit ?uom }

Slide 15

Slide 15 text

DESIGNING OUR ENGINE
[Diagram: TEMPERATURE OBS has value TABLE1.TEMPERATURE (13.0), unit CELSIUS]
SELECT MAX(?TEMPERATURE) { ?OBS a TEMPERATURE OBS . ?OBS has value ?TEMPERATURE . ?OBS has unit ?uom }

Slide 16

Slide 16 text

DESIGNING OUR ENGINE: SPARQL TO SQL
SPARQL: SELECT MAX(?TEMPERATURE) { ?OBS a TEMPERATURE OBS . ?OBS has value ?TEMPERATURE . ?OBS has unit ?uom }
Bindings: ?OBS = NODE_TEMP, ?TEMPERATURE = TABLE1.TEMPERATURE, ?uom = CELSIUS
SQL: SELECT MAX(TEMPERATURE) FROM TABLE1
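
The translation step on this slide can be approximated with a toy example. This is a simplified sketch and not the sparql2sql implementation: the query is given as already-parsed triple patterns, matching is greedy with no backtracking, and `MAPPING`, `bind`, and `to_sql` are hypothetical names. SQLite stands in for the relational store.

```python
import sqlite3

# Mapping triples: a faux node (NODE_TEMP) whose value points at a table column.
MAPPING = [
    ("NODE_TEMP", "a", "TEMPERATURE OBS"),
    ("NODE_TEMP", "has value", "TABLE1.TEMPERATURE"),
    ("NODE_TEMP", "has unit", "CELSIUS"),
]

# Triple patterns of the query on the slide; terms starting with "?" are variables.
PATTERNS = [
    ("?OBS", "a", "TEMPERATURE OBS"),
    ("?OBS", "has value", "?TEMPERATURE"),
    ("?OBS", "has unit", "?uom"),
]


def unify(term, value, bindings):
    """Bind a variable to a value, or check that a constant matches it."""
    if term.startswith("?"):
        if bindings.get(term, value) != value:
            return False
        bindings[term] = value
        return True
    return term == value


def bind(patterns, mapping):
    """Greedily match each pattern against the mapping, collecting bindings."""
    bindings = {}
    for pattern in patterns:
        for triple in mapping:
            trial = dict(bindings)  # discard partial bindings if this triple fails
            if all(unify(t, v, trial) for t, v in zip(pattern, triple)):
                bindings = trial
                break
        else:
            raise ValueError(f"no mapping matches {pattern}")
    return bindings


def to_sql(aggregate, variable, bindings):
    """Emit SQL for an aggregate over a variable bound to TABLE.COLUMN."""
    table, column = bindings[variable].split(".")
    return f"SELECT {aggregate}({column}) FROM {table}"


bindings = bind(PATTERNS, MAPPING)
sql = to_sql("MAX", "?TEMPERATURE", bindings)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (TIME TEXT, TEMPERATURE REAL)")
conn.executemany(
    "INSERT INTO TABLE1 VALUES (?, ?)",
    [("2016-01-01 06:00:00", 13.0), ("2016-01-01 07:00:00", 17.5)],
)
max_temperature = conn.execute(sql).fetchone()[0]
print(sql, max_temperature)
```

Because the node and unit bindings resolve entirely inside the mapping, only the column-bound variable reaches the generated SQL.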

Slide 17

Slide 17 text

BENCHMARKS & IOT SCENARIOS
Meteorological System: ~20,000 stations; 100 – 300k triples; wind, rainfall, etc.; 10 SRBench queries
Smart Home Analytics: 3 months, 1 home; ~30k triples; motion, energy, env; 4 analytics queries
[Diagram labels: ANALYTICS HUB; STATION HUB; Weather SENSORS; PERSONAL STORE; DEVICES W/ SENSORS. Axes: Compute & Storage (DEVICE SENSOR, LIGHTWEIGHT COMPUTER, COMPUTER/SERVER CLUSTER) and Level of Distribution]
github.com/eugenesiow/sparql2sql

Slide 18

Slide 18 text

STORAGE SIZE
Datasets: 300k Hurricane Ike, 100k NEVADA BLIZZARD, 30k SMART HOME
Ratios shown for OUR APPROACH (S2S) and NATIVE STORE (TDB): x15, x68, x112

Slide 19

Slide 19 text

SRBENCH QUERY RESULTS
01 Get the rainfall observed in a particular hour from all stations
02 Q01 with an optional clause on unit of measure
Ratios shown for OUR APPROACH (S2S) and NATIVE STORE (TDB): x4.6, x4

Slide 20

Slide 20 text

03 Detect if a hurricane has been observed (x3.4)
04 Get the average wind speed at the stations where the air temperature is >32 (x88). The join between the wind observation and temperature observation subtrees is time-consuming in a low-resource environment (Raspberry Pi).
05 Detect if a station is observing a blizzard (x2.7)
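
As a rough illustration of why the wind speed query benefits from a flat layout: when wind speed and air temperature sit in the same row, the query becomes a filter plus an aggregate, with no join between observation subtrees. The table, column names, and sample values below are invented for the sketch, and SQLite stands in for the relational store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations "
    "(station TEXT, time TEXT, air_temperature REAL, wind_speed REAL)"
)
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?, ?)",
    [
        ("A", "2016-01-01 06:00:00", 33.0, 10.0),
        ("A", "2016-01-01 07:00:00", 35.0, 20.0),
        ("A", "2016-01-01 08:00:00", 30.0, 50.0),  # filtered out: not above 32
        ("B", "2016-01-01 06:00:00", 20.0, 5.0),   # filtered out: not above 32
    ],
)

# Both measurements are columns of one row, so no join is needed.
result = conn.execute(
    "SELECT station, AVG(wind_speed) FROM observations "
    "WHERE air_temperature > 32 GROUP BY station"
).fetchall()
print(result)
```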

Slide 21

Slide 21 text

06 Get the stations with extremely low visibility (x6)
07 Detect stations that are recently broken (x14)
08 Get the daily minimal and maximal air temperature observed by the sensor at a given location (x5.6)

Slide 22

Slide 22 text

09 Get the daily average wind force and direction observed by the sensor at a given location
10 Get the locations where a heavy snowfall has been observed
Ratios shown: x305, x7
The join between the wind force and wind direction observation subtrees is time-consuming in a low-resource environment (Raspberry Pi).
Our Approach (s2s) is shown to be faster on all queries in the Distributed Meteorological System.

Slide 23

Slide 23 text

SMART HOME QUERY RESULTS
01 Temperature aggregated by hour on a specified day
02 Minimum and maximum temperature each day for a particular month
Ratios shown for OUR APPROACH (S2S) and NATIVE STORE (TDB): x29, x9

Slide 24

Slide 24 text

03 Energy Usage Per Room By Day. Involves meter data (larger set), with space-time aggregations.
04 Diagnose unattended appliances consuming energy with no motion in room. Involves motion and meter data (much larger set), with space-time aggregations and joins between motion and meter tables/subgraphs.
Ratios shown: x69, x3.6
Our Approach (s2s) is shown, once again, to be faster on all queries for Smart Home Analytics.

Slide 25

Slide 25 text

WHY IS OUR APPROACH FASTER THAN NATIVE RDF? FASTER AGGREGATIONS ON FEWER RESOURCES CAN SPECIFICALLY BUILD INDEXES FOR FAST RANGE QUERIES EFFICIENT SQL QUERIES OPTIMISE FLAT & WIDE DATA ACCESS REDUCE JOINS BETWEEN SUBGRAPHS ON THE SAME ROW COLLAPSE INTERMEDIATE NODES REDUCE JOINS W/ BLANK OR FAUX NODES IN MAPPINGS
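
The point about building indexes for range queries can be illustrated with a small sketch. It assumes SQLite as the relational store and uses invented table, column, and index names; the query plan is printed only to show whether the engine chose the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (time TEXT, temperature REAL)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?)",
    [(f"2016-01-01 {hour:02d}:00:00", float(hour)) for hour in range(24)],
)

# An index on the time column lets a time-range filter avoid a full table scan.
conn.execute("CREATE INDEX idx_observations_time ON observations (time)")

range_query = (
    "SELECT AVG(temperature) FROM observations WHERE time >= ? AND time < ?"
)
bounds = ("2016-01-01 06:00:00", "2016-01-01 12:00:00")

average = conn.execute(range_query, bounds).fetchone()[0]
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + range_query, bounds)]
print(average)
print(plan)
```

ISO-formatted timestamps sort lexicographically in time order, which is what makes a plain text index usable for the range filter here.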

Slide 26

Slide 26 text

RELATED WORK
Rodriguez-Muro, M., Rezk, M. (2014) Efficient SPARQL-to-SQL with R2RML mappings. Web Semantics: Science, Services and Agents on the World Wide Web 33, pp. 141–169
-ontop- morph sparql2stream
Priyatna, F., Corcho, O., Sequeda, J. (2014) Formalisation and Experiences of R2RML-based SPARQL to SQL Query Translation using Morph. Proceedings of the 23rd International Conference on World Wide Web, pp. 479–489
GENERAL ONTOLOGY BASED DATA ACCESS ENGINES
sparql2sql
Siow, Eugene, Tiropanis, Thanassis and Hall, Wendy (2016) SPARQL-to-SQL on internet of things databases and streams. Proceedings of the 15th International Semantic Web Conference (accepted, to be published)
github.com/eugenesiow/sparql2sql github.com/eugenesiow/piotre
Siow, Eugene, Tiropanis, Thanassis and Hall, Wendy (2016) PIOTRe: Personal IoT Repository. Proceedings of the 15th International Semantic Web Conference P&D (accepted, to be published)

Slide 27

Slide 27 text

“Until they become conscious they will never rebel and until after they have rebelled they cannot become conscious.” 1984 by George Orwell
DATA OWNERSHIP & DATA LOCALITY: DISTRIBUTED LIGHTWEIGHT COMPUTERS FOR STORAGE AND PROCESSING IN THE IOT
LINKED DATA FOR INTEROPERABILITY: A rich model to describe things and integrate connected things’ data
NOVEL TIERED LINKED DATA STORE: FROM 3 to 3 orders of magnitude performance improvement
@eugene_siow