Slide 1

Slide 1 text

TAMING THE DATA BEAST WITH OPEN SOURCE EXPLOITING DIGITAL CONFERENCE 17/07/2015

Slide 2

Slide 2 text

WHAT ARE WE GOING TO DISCOVER TODAY?

•  Who is standing in front of you?
•  What do we mean by data?
•  What challenges are presented by data?
•  What open source tools exist for data?
•  How do we handle data in systems?
•  How is data handled in the real world?

Slide 3

Slide 3 text

1 WHO AM I?

Slide 4

Slide 4 text

WHO IS STANDING BEFORE YOU?

Richard Freeman - Project Manager
•  Teacher, developer, project manager since 2000
•  Website, smartphone app and social media projects for Stella Artois, Eli Lilly and Encyclopedia Britannica

flowmoco - web and mobile development studio based in Newquay, Cornwall
•  Relatively new - formed in 2012
•  Relatively small – 13 staff
•  Relatively specialist - Enterprise and Open source

Slide 5

Slide 5 text

WHO DO WE WORK FOR?

Slide 6

Slide 6 text

2 WHAT DO WE MEAN BY DATA?

Slide 7

Slide 7 text

I know where you live, what you eat and who you are friends with – I KNOW EVERYTHING ABOUT EVERYONE!

Slide 8

Slide 8 text

WHAT DO WE MEAN BY DATA?

•  Currently, English Wikipedia includes 4,915,500 articles and it averages 750 new articles per day (just not edited by MPs of course…)

Slide 9

Slide 9 text

WHAT DO WE MEAN BY DATA?

•  300 hours of video being added to YouTube every minute, not all featuring Gangnam Style (which broke the play counter…)

Slide 10

Slide 10 text

WHAT DO WE MEAN BY DATA?

•  1.25 bn monthly active users (MAU) on Facebook, providing mobile location data, images, videos and more (once Moneypenny has access to it…)

Slide 11

Slide 11 text

WHAT DO WE MEAN BY DATA?

•  Format:
   •  Video
   •  Text
   •  Image
   •  Metadata
   •  Microdata
•  Type:
   •  Temporal
   •  Spatial
   •  Personal
   •  Relational
   •  Archive

Slide 12

Slide 12 text

3 WHAT CHALLENGES COME WITH DATA?

Slide 13

Slide 13 text

CHALLENGE 1: OWNERSHIP

•  User Generated Content
   •  Instagram changed Terms of Use = PROTEST!
   •  Flickr, Facebook, YouTube
   •  Advertising integration
   •  Creative Commons Licensing models
•  Commercially sourced
   •  Global / local usage rights
   •  CDN battles
   •  Copyright protection, including watermarking
   •  Caching of data

Slide 14

Slide 14 text

CHALLENGE 2: STORAGE

•  Local storage
   •  Processing and manipulation benefits
   •  Provides a cache for quicker access
   •  Reduced setup time
   •  Cost of provision
•  Cloud storage
   •  Global rights agreements
   •  CDN coverage
   •  Resilience and backup
   •  Cloud services e.g. video encoding, big data processing

Slide 15

Slide 15 text

CHALLENGE 3: LONGEVITY

•  Does the quality of your data meet current and future needs?
   •  Responsive images
   •  4K video
   •  Photospheres
•  Is your data flexible, movable and transferable?
   •  Data guide - structures
   •  Import / export
   •  Cleansing
   •  Fuzzy logic
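As an illustration of the cleansing and fuzzy logic points on this slide, here is a minimal sketch of tidying an imported value and matching it against a list of known values. It uses Python's standard `difflib` module; the function names, the 0.8 cutoff and the sample place names are assumptions made for the example, not something taken from the talk.

```python
import difflib
from typing import List, Optional


def cleanse(value: str) -> str:
    """Collapse whitespace and lower-case so near-identical entries compare equal."""
    return " ".join(value.split()).lower()


def fuzzy_match(value: str, known: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known value, or None if nothing is similar enough.

    The cutoff is an assumed threshold; tune it against real data.
    """
    cleaned = {cleanse(k): k for k in known}
    hits = difflib.get_close_matches(cleanse(value), list(cleaned), n=1, cutoff=cutoff)
    return cleaned[hits[0]] if hits else None


if __name__ == "__main__":
    towns = ["Newquay", "Truro"]  # hypothetical reference list
    print(fuzzy_match("  Newquey ", towns))
    print(fuzzy_match("London", towns))
```

Matching against a cleansed copy while returning the original spelling keeps the reference list untouched, which helps when the same data later has to be exported again.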

Slide 16

Slide 16 text

4 WHAT OPEN SOURCE TOOLS EXIST?

Slide 17

Slide 17 text

OPEN SOURCE TOOL O.O.O.!

•  GULP
•  CASPER
•  DOCKER
•  PHANTOM
•  HOMEBREW
•  PUPPET

•  Can you guess the Odd One Out, which isn’t an open source tool?
•  Endless amounts
•  Constantly developed
•  Community focus

Slide 18

Slide 18 text

DRUPAL AS A PLATFORM

•  Flexible platform, easy to customise
•  Modular, easy to develop further
•  Integrations possible with other systems
•  No vendor lock-in: Drupal is open source
•  Create open APIs for data sharing
•  Provides web, app and future platforms

Slide 19

Slide 19 text

DRUPAL AS A HAMMER

•  Powerful - it’s not a CMS - it’s a framework
•  Flexible, headless, local
•  1,000s of modules already exist
•  If there isn’t a module – you can build one!
•  Represents the combined effort of thousands of developers

Slide 20

Slide 20 text

DRUPAL AS A PADLOCK

•  Open Source - it’s more secure (well, US Dept of Defence think so)!
   “The continuous and broad peer-review enabled by publicly available source code supports software reliability and security efforts through the identification and elimination of defects that might otherwise go unrecognized by a more limited core development team.”
•  “Security through obscurity just doesn’t work”

Slide 21

Slide 21 text

5 WHAT DO WE DO WITH DATA?

Slide 22

Slide 22 text

DATA: VISUALISATIONS

•  Complex technical system, producing data
•  Requirement to raise visibility
•  Interact with existing systems and data

•  EBRI @ Aston University
•  NASA Buzzroom
•  US Department of Information Security
•  Vodafone UK HQ

Slide 23

Slide 23 text

DATA: VISUALISATIONS

•  Combined Heat and Power Plant
•  SCADA system from Siemens
•  Heating, chilling, power generation from bio oil

Slide 24

Slide 24 text

DATA: HEADLESS

•  Create a backend system without a frontend
•  Decoupling data from display
•  Management, storage and manipulation are key

•  US Pharmaceutical iPad app for sales data
•  US sales tool to control sales demo tool
•  UK charity funding site, with data integrations
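To make "decoupling data from display" concrete, the sketch below shows one way a headless backend might hand the same JSON payload to two unrelated frontends. It is a simplified illustration in plain Python, not Drupal code; the sample records and all function names are hypothetical.

```python
import json

# Hypothetical records; in a headless setup these would come from the backend store.
SALES = [
    {"region": "North", "units": 120},
    {"region": "South", "units": 95},
]


def api_response(records):
    """Backend: serialise the data with no knowledge of how it will be displayed."""
    return json.dumps({"data": records})


def render_text(payload):
    """One possible frontend: a plain-text listing, one record per line."""
    rows = json.loads(payload)["data"]
    return "\n".join(f"{row['region']}: {row['units']}" for row in rows)


def render_total(payload):
    """Another frontend consuming the same payload: a single summary figure."""
    rows = json.loads(payload)["data"]
    return sum(row["units"] for row in rows)


if __name__ == "__main__":
    payload = api_response(SALES)
    print(render_text(payload))
    print(render_total(payload))
```

Because both renderers depend only on the payload shape, either one can be replaced (for example by a tablet app) without touching the backend.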

Slide 25

Slide 25 text

DATA: MOBILE

•  Enabling web services to share data far and wide
•  Integrating with APIs and hardware to send / receive data
•  OS frameworks such as Apache Cordova

•  Plotting national trends from beehives equipped with Arduino sensor kits
•  Skullcandy Headphones, follow the skate tour around the EU
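As a rough sketch of how readings from many sensor kits could be reduced to a trend line, the example below averages temperature readings per day across hives. The reading format (date, hive id, temperature in Celsius) and the sample values are assumptions for illustration; the talk does not describe the actual data model.

```python
from collections import defaultdict
from statistics import mean


def daily_means(readings):
    """Group (date, hive_id, temperature_c) readings by date and average them.

    Returns a dict ordered by date, ready to feed into a plotting library.
    """
    by_day = defaultdict(list)
    for day, _hive_id, temperature_c in readings:
        by_day[day].append(temperature_c)
    return {day: mean(temps) for day, temps in sorted(by_day.items())}


if __name__ == "__main__":
    # Hypothetical sample readings from two hives.
    sample = [
        ("2015-07-02", "hive-a", 35.0),
        ("2015-07-01", "hive-a", 34.0),
        ("2015-07-01", "hive-b", 36.0),
    ]
    for day, value in daily_means(sample).items():
        print(day, value)
```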

Slide 26

Slide 26 text

6 WHAT IS EVERYONE ELSE DOING?

Slide 27

Slide 27 text

OPEN SOURCE CYCLING

•  Hackney Council smartphone app project
•  GitHub-sourced
•  Cloud-backend
•  iOS and Android
•  OpenStreetMap

Slide 28

Slide 28 text

OPEN SOURCE FLYING

•  IWM Duxford community history project
•  Image archive
•  User community
•  Metadata, tagging, interpretation
•  Workflow and management

Slide 29

Slide 29 text

OPEN SOURCE READING

•  reCAPTCHA
•  Spam prevention service
•  Streetmap scanning
•  Book digitization
•  Licensing

Slide 30

Slide 30 text

7 DOES ANYONE HAVE ANY QUESTIONS?

Slide 31

Slide 31 text

☺ THANKS!
@flowmoco
richard@flowmo.co