Designed for Security - IoT Big Data End to End Best Practices

charles-cai
February 11, 2016

IoT Tech Expo - Europe
10-11 Feb 2016, Olympia, London

http://www.iottechexpo.com/europe/track/data-security/

- Challenges in IoT and security, privacy
- IoT = Big Data
- Best practices of IoT Big Data Modelling
- Latest developments in guaranteed privacy and secure MPC (multi-party computation)

Transcript

  1. Designed for Security - IoT Big Data End to End Best Practices

    • Challenges in IoT and security, privacy
    • IoT = Big Data
    • Best practices of IoT Big Data Modelling
    • Latest developments in guaranteed privacy and secure MPC (multi-party computation)
    10-11 February 2016, Olympia London #IoTTechExpo
    Charles Cai, Chief Architect | Senior Advisor, Finance Group | Deputy GM, IDC | Wanda
  2. Designed for Security – IoT Big Data Modelling

    • #FO #FICC: Investment Banking Front Office: FX/Commodities
    • #ETRM: Energy Trading & Risk Management
    • #entrepreneur #innovator #disruptor
    • Hacker / Maker: won major IoT / FinTech / Blockchain / Big Data hackathons; organizer of high-profile big data hackathons
    • Twitter: @caidong #big-data #data-science #IoT #cloud
    • LinkedIn: http://uk.linkedin.com/in/charlescai/en
    • Email - business: [email protected]
    • Email - IoT: [email protected]
  3. “There's now a growing sense of fatalism: It's no longer if or when you get hacked, but the assumption is that you've already been hacked, with a focus on minimizing the damage.”

    Source: Dark Reading / Security’s New Reality: Assume The Worst
  4. Breaches Happen in Hours… But Go Undetected for Months or Even Years

    Timespan of events by percent of breaches (Source: 2013 Data Breach Investigations Report)

    Event                                     Seconds  Minutes    Hours     Days    Weeks   Months    Years
    Initial Attack to Initial Compromise          10%      75%      12%       2%       0%       1%       1%
    Initial Compromise to Data Exfiltration        8%      38%      14%      25%       8%       8%       0%
    Initial Compromise to Discovery                0%       0%       2%      13%      29%      54%       2%
    Discovery to Containment/Restoration           0%       1%       9%      32%      38%      17%       4%

    • In 60% of breaches, data is stolen in hours
    • 54% of breaches are not discovered for months
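
The two headline figures on this slide can be recomputed from the exfiltration and discovery rows. The sketch below assumes that "stolen in hours" means the cumulative share for seconds, minutes and hours; the variable and function names are illustrative only.

```python
# Percent of breaches per timespan, copied from the slide's figures.
TIMESPANS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]
EXFILTRATION = [8, 38, 14, 25, 8, 8, 0]   # Initial Compromise to Data Exfiltration
DISCOVERY = [0, 0, 2, 13, 29, 54, 2]      # Initial Compromise to Discovery


def share_within(row, limit):
    """Cumulative percent of breaches whose event completes within `limit`."""
    end = TIMESPANS.index(limit) + 1
    return sum(row[:end])


def share_at(row, span):
    """Percent of breaches that fall exactly in the `span` bucket."""
    return row[TIMESPANS.index(span)]


stolen_within_hours = share_within(EXFILTRATION, "hours")    # 8 + 38 + 14 = 60
discovered_in_months = share_at(DISCOVERY, "months")         # 54
discovered_within_days = share_within(DISCOVERY, "days")     # 0 + 0 + 2 + 13 = 15

print(stolen_within_hours, discovered_in_months, discovered_within_days)
```

Read this way, 60% of breaches lose data within hours while only 15% are discovered within days, which is the gap the slide title points at.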
  5. Case Study: Continuous Patient Monitoring with Wearables + IMU + bio-sensors

    • There is still no cure for Parkinson’s disease
    • Trials of new medicines are extremely slow, and feedback from patients is infrequent
    • Wearables + IMU + BLE can help speed up the process
    • Sensor data can be collected from Microsoft Band in real time (up to 64 measurements per second):
      • Movements: gyroscope, accelerometer, digital compass, …
      • Bio-metrics: heartbeat, skin temperature, …
    • To analyze typical Parkinson’s symptoms: tremor and slowed movements, loss of automatic movements, impaired posture and balance
    • Along with clinical records of medicine intake
    • Technology used: Wearables with IMU + BLE -> IoT Gateway -> Cloud
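
As a rough illustration of how tremor might be picked out of accelerometer samples, the sketch below finds the dominant frequency of a signal with a naive DFT. It is not the method used in the case study. The 64 Hz rate comes from the slide; the synthetic signal and the 4 to 6 Hz band (a range commonly quoted for Parkinsonian rest tremor) are assumptions made for this example.

```python
import cmath
import math

SAMPLE_RATE_HZ = 64          # upper sampling rate quoted on the slide
TREMOR_BAND_HZ = (4.0, 6.0)  # assumption: commonly quoted rest-tremor range


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC component (naive DFT)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n


# Synthetic 4-second window: a 5 Hz oscillation plus a slow 0.5 Hz drift.
signal = [
    math.sin(2 * math.pi * 5.0 * t / SAMPLE_RATE_HZ)
    + 0.3 * math.sin(2 * math.pi * 0.5 * t / SAMPLE_RATE_HZ)
    for t in range(256)
]

peak_hz = dominant_frequency(signal, SAMPLE_RATE_HZ)
in_tremor_band = TREMOR_BAND_HZ[0] <= peak_hz <= TREMOR_BAND_HZ[1]
print(peak_hz, in_tremor_band)
```

A real pipeline would use an FFT library, windowing and sensor fusion across axes; the point here is only that 64 samples per second is enough to resolve oscillations of a few hertz.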
  6. Case Study: Advanced Healthcare Machine Learning / Predictive Platform in the Cloud

    • Collecting continuous bio-sensor data requires big storage + advanced machine learning models
    • Cloud-based scalable time series database + Streaming Analytics + Machine Learning Stack provides a sophisticated platform to tackle the big data challenge in Healthcare
    • 9-axis IMU (Inertial Measurement Unit) with Gyroscope + Accelerometer + Digital Compass
    • Advanced sensor fusion to be developed
    • Classification of wearer activities
      • sitting, standing, walking, running, sleeping…
    • Detect pattern of patient symptoms
      • E.g.: Parkinson’s Disease:
        • predicting deterioration speed
        • new trial medicine effectiveness
    • 20+ other environmental + biometric sensor data
    • Easily 200MB – 10GB uncompressed data a day!
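
A back-of-the-envelope check of the daily volume quoted on this slide. The 9 IMU axes, the 20 additional sensors and the 64 samples per second come from the slides; the 8 bytes per sample, the choice to sample every channel at the full rate, and the absence of timestamps or framing overhead are assumptions of this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(channels, sample_rate_hz, bytes_per_sample):
    """Uncompressed bytes produced in one day of continuous sampling."""
    return channels * sample_rate_hz * bytes_per_sample * SECONDS_PER_DAY


# Assumption: each sample is stored as an 8-byte float, no per-record overhead.
imu_only = bytes_per_day(channels=9, sample_rate_hz=64, bytes_per_sample=8)
all_sensors = bytes_per_day(channels=9 + 20, sample_rate_hz=64, bytes_per_sample=8)

print(f"IMU only:    {imu_only / 1e6:.0f} MB/day")
print(f"All sensors: {all_sensors / 1e9:.2f} GB/day")
```

Under these assumptions a single wearer produces roughly 0.4 GB per day from the IMU alone and about 1.3 GB per day with all channels, which sits inside the 200MB to 10GB range given on the slide.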
  7. Case Study: End-to-End Connected Health

    [Diagram; labels: Home Environmental Sensors; Bio-sensing Wearable Devices; Personal Bio-metrics; Health tracking; Speech, face and emotions recognition, interaction; Home Environmental Metrics; Predictive Home Care Analytics; dslogix Internet of Things; Reminders, tasks monitor; movement tracking; Doctors / Carers / MIners; Family; Other IOT; Other Patients; Intelligent Algorithms; Big Data; IoT Gateway w/ Connected e-Health Instruments; Smart Medicine Container; Smart Cane; e-Diagnostics; VR / AR; https://dsrobotix.io]
  8. Low Power = Low Security! The key is IoT Gateway

    • AES Computation
    • BLE 4.1
    • CoAP vs JSON
    • HTTPS / TLS
    • ARM Cortex-M0, *M3/M4…
    Some vendors:
    • Intel/Dell IoT Gateway
    • Cryptosoft IoT Security Solution
    • …
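
One way to read the "CoAP vs JSON" bullet is as compact binary payloads versus verbose text payloads on constrained devices (strictly, CoAP is a transfer protocol and JSON a data format). The sketch below compares the size of one reading encoded both ways. The field names, values and binary layout are hypothetical and chosen only for illustration.

```python
import json
import struct

# Hypothetical sensor reading; names and values are illustrative.
reading = {
    "device_id": 17,
    "timestamp": 1455184800,
    "heart_rate": 72,
    "skin_temp_c": 33.5,
}

# Text encoding: compact JSON without extra whitespace.
json_payload = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Binary encoding: little-endian uint16 id, uint32 timestamp,
# uint8 heart rate, float32 temperature (no padding with "<").
BINARY_LAYOUT = struct.Struct("<HIBf")
binary_payload = BINARY_LAYOUT.pack(
    reading["device_id"],
    reading["timestamp"],
    reading["heart_rate"],
    reading["skin_temp_c"],
)

decoded = BINARY_LAYOUT.unpack(binary_payload)

print(len(json_payload), "bytes as JSON")
print(len(binary_payload), "bytes as packed binary")
```

Fewer bytes on the air means less radio time and less data to encrypt on a Cortex-M class device, which is one reason the heavier HTTPS / TLS work is often placed on the gateway.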
  9. So IoT = Big Data - Where are we at with Big Data Analytics?

    By Thomas Davenport – Harvard Business Review
  10. Open Source Enterprise Big Data / Data Science Platform Attack Points

    [Diagram; labels: Open Source Data Science Toolbox; Hadoop / Mesos Distributed Storage + Scalable Computation; COTS Apps (Excel, Tableau, Qlik...); Statistical Time Series Analysis; Wider Big Data Analytics eco-systems; Shell/APIs: HDFS, Hive, Spark, HBase, Sqoop, JDBC/ODBC; Languages: Julia, Python, R, Scala; Developed on:; Operated by:; NLTK: Natural Language; Distributed Time Series / Geospatial / Graph Databases; GIT Repo; Data Products; WebSocket; Drag + Drop (CZML/GeoJSON); Web Browser (collaboration); Export to CSV/Excel; Geospatial data; Time Series data; Public Data; Market data; Real-time Streaming; Open Gov Data; JDBC via phoenix; HDFS; Hive/Pig w/ Geospatial]
  11. Introducing OpenSOC

    Intersection of Big Data and Security Analytics

    [Diagram; labels: Multi Petabyte Storage; Interactive Query; Real-Time Search; Scalable Stream Processing; Unstructured Data; Data Access Control; Scalable Compute; OpenSOC; Real-Time Alerts; Anomaly Detection; Data Correlation; Rules and Reports; Predictive Modeling; UI and Applications; Big Data Platform; Hadoop]
  12. Simple data masking is not enough!

    • New York “Anonymous” Taxi Data Privacy Leak
    • Consider:
      • K-anonymity
      • l-diversity
      • T-closeness
      • Q-disclosure
      • Q-presence
    PII (Personally Identifiable Information) needs an advanced de-identification / anonymization process
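
To make the first two criteria concrete, the sketch below measures k-anonymity (the size of the smallest group of records sharing the same quasi-identifiers) and distinct l-diversity (the smallest number of distinct sensitive values in any such group). The records, column names and choice of quasi-identifiers are made up for this example.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len(group) for group in groups.values())


def l_diversity(records, quasi_identifiers, sensitive):
    """Fewest distinct sensitive values found in any equivalence class."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in groups.values())


# Hypothetical, already-generalized patient records.
records = [
    {"age_band": "60-69", "postcode": "W14", "diagnosis": "Parkinson's"},
    {"age_band": "60-69", "postcode": "W14", "diagnosis": "Diabetes"},
    {"age_band": "60-69", "postcode": "W14", "diagnosis": "Parkinson's"},
    {"age_band": "70-79", "postcode": "SW6", "diagnosis": "Parkinson's"},
    {"age_band": "70-79", "postcode": "SW6", "diagnosis": "Parkinson's"},
]
QUASI_IDENTIFIERS = ("age_band", "postcode")

k = k_anonymity(records, QUASI_IDENTIFIERS)                # 2
l = l_diversity(records, QUASI_IDENTIFIERS, "diagnosis")   # 1
print(f"k = {k}, l = {l}")
```

The table is 2-anonymous, yet anyone known to be in the 70-79 / SW6 group is revealed to have the same diagnosis, which is why k-anonymity alone is listed alongside stronger criteria.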
  13. New Development: Secure Multi-party Computation

    Wikipedia
    Homomorphic Encryption -> Computation while Encrypted
    Multi-party Computation -> Privacy Preserved Computation
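
A minimal sketch of one multi-party computation building block, additive secret sharing, used here to add up private values without any party seeing another's input. It is a toy illustration, not the scheme proposed in the talk: the modulus, the three-party setup and the sample values are assumptions, and it offers no protection against parties that deviate from the protocol.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares (a Mersenne prime)


def share(value, n_parties):
    """Split `value` into n random-looking shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine shares into the original value."""
    return sum(shares) % PRIME


# Hypothetical private inputs, e.g. one heart-rate reading per patient.
private_inputs = [72, 65, 80]
n_parties = 3

# Each input owner sends one share to each computing party.
all_shares = [share(value, n_parties) for value in private_inputs]

# Each party adds the shares it received; it never sees a raw input.
partial_sums = [
    sum(owner_shares[j] for owner_shares in all_shares) % PRIME
    for j in range(n_parties)
]

total = reconstruct(partial_sums)  # 72 + 65 + 80 = 217
print(total)
```

Any single party holds only uniformly random shares, so the individual readings stay hidden while the aggregate is still computed exactly.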
  14. Benefits

    • Legal requirements: e.g.
    • Privacy-preserving
    • Auditable: who can access / compute my data
    • Full + pseudo-anonymous (k-anonymity)
    • Monetizing opportunity for data contributors (based on computation counters)
    • Kill-switch: control if he/she wants to participate in the computation
    • Meeting HIPAA:
    • Ethereum VM, Esri DB, Solidity (need extension in Smart Contract for private data, a.k.a. Private Contract) + Spark Analytics Computation Engine
  15. Summary

    1. Holistic security design from low powered wearables to On-Prem / Cloud based Big Data Analytics
    2. IoT = Big Data
    3. Big Data = Big Responsibility
    4. Privacy Preserved Computation +