MISC.conf 2016 - Bootstrapping a Security Research Project

Bootstrapping A Security Research Project Andrew Hay, CISO DataGravity, Inc.
[email protected] @andrewsmhay

2 About Andrew Hay
• Andrew Hay – Chief Information Security Officer (CISO) @ DataGravity
• Former:
  – Director of Research @ OpenDNS
  – Chief Evangelist & Director of Research @ CloudPassage
  – Senior Security Analyst @ 451 Research
  – Sr. Security Analyst in higher education and a bank in Bermuda
  – Product, Program, and Engineering Manager @ Q1 Labs
• Wrote some books, blog, spend more time on planes than I care to mention…

1. Introduction
2. Bootstrapping & How To Start
3. Core Research Ideas
4. Knowing When To Stop
5. Presenting & Defending Your Research
6. Summary

4 Why Me?
• Managed software engineers, researchers, and security analysts at various companies
• Self-taught data scientist and R, Ruby, and Python “hacker”

6 What’s Research?

7 Pasteur’s Quadrant: Applied vs. Basic Research
Quest for fundamental understanding? / Considerations of use?
• YES / NO: Pure basic research (e.g. Niels Bohr, Josh Corman)
• YES / YES: Use-inspired basic research (e.g. Louis Pasteur, Charlie Miller and Chris Valasek)
• NO / NO: -
• NO / YES: Pure applied research (e.g. Thomas Edison, Deviant Ollam)
Source: https://en.m.wikipedia.org/wiki/Pasteur%27s_quadrant

8 Pasteur’s Quadrant: Applied vs. Basic Research
Axes: KNOWLEDGE and UTILITY
• Pure basic research (e.g. Niels Bohr, Josh Corman)
• Use-inspired basic research (e.g. Louis Pasteur, Charlie Miller and Chris Valasek)
• Pure applied research (e.g. Thomas Edison, Deviant Ollam)
Source: https://en.m.wikipedia.org/wiki/Pasteur%27s_quadrant

9 Do You Look Like a Researcher?
• “Profile” for a Data Scientist
• Most traits overlap with a security researcher
• What’s missing, however, is:
  – Curiosity
  – Event/action-driven
  – GTM time sensitivity
  – Need to do good (or bad)
Source: http://www.marketingdistillery.com/2014/11/29/is-data-science-a-buzzword-modern-data-scientist-defined/

10 Research Pros and Cons
New Research
• Pros
  – Full control of experiment
  – Potential to be first
• Cons
  – Time consuming
  – Time sensitivity
Continued Research
• Pros
  – Opportunity to find new results
  – Data exists / familiar
• Cons
  – Limited control over experiment
  – Breaking findings already known
Challenge Research
• Pros
  – Little time sensitivity
  – Opportunity to challenge or disprove existing findings
• Cons
  – Potentially time consuming
  – Could result in no new/different results

11 Should I Research Something?
1. What is the question you’re trying to answer?
2. Do you have the data to answer the question?
3. If you could answer the question, could you use the answer?

12 Types of Questions
• Descriptive: seeks to summarize a characteristic of a set of data
• Exploratory: analyzes the data to see if there are patterns, trends, or relationships between variables
• Inferential: restates a proposed hypothesis as a question that would be answered by analyzing a different set of data
• Predictive: less interested in what causes an outcome, just in what predicts whether an outcome will occur
• Causal: asks whether changing one factor will change another factor, on average, in a population
• Mechanistic: asks about the fundamental processes involved in or responsible for an action, reaction, or other natural phenomenon
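
The first two question types can be sketched in a few lines of Python. This is only an illustration: the device names and numbers below are invented, not measured data, and the hand-written `pearson` helper is one of several ways to look for a relationship between two variables.

```python
from statistics import mean, median

# Hypothetical observations: (device, DNS queries per hour, distinct domains contacted).
# These values are invented for illustration only.
observations = [
    ("cam-a", 10, 2),
    ("cam-b", 20, 4),
    ("cam-c", 30, 6),
    ("cam-d", 40, 8),
]

queries = [q for _, q, _ in observations]
domains = [d for _, _, d in observations]


def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand to avoid extra dependencies."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Descriptive: summarize a characteristic of the data set.
descriptive = {"mean_queries": mean(queries), "median_queries": median(queries)}

# Exploratory: is there a relationship between the two variables?
exploratory = pearson(queries, domains)

print(descriptive)
print(round(exploratory, 3))
```

A strong correlation here would only suggest a pattern worth a follow-up question; it would not answer an inferential or causal question by itself.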

13 Epicycle of Data Analysis
1. Setting expectations
2. Collecting information (data)
  • comparing the data to your expectations
3. Revising your expectations
  • if the expectations don’t match the data
  • or fixing the data so your data and your expectations match
• Continuously refine your question, exploratory data analysis, formal models, interpretation, and communication

14 Bootstrapping Research
• All you need to get started on research is…
  – A Question
  – Curiosity
  – Time
  – Ability to Execute
  – Asset/Device

15 A Word on Asset/Device Acquisition
• Buy
  – Electronics
  – Software
  – Toys
• Borrow
  – From work
  – From friend
  – From neighbor
• Rent
  – Electronics store
  – Kiosk
  – Rent-A-Center
• Download
  – Trial
  – VM
  – SaaS/PaaS

16 My Favorite Places…
• Full refund on anything
  – Items must be returned within 90 days of purchase for a full refund: televisions, projectors, computers, cameras, camcorders, tablets, iPod / MP3 players, cellular phones
• Target brand items: within one year for a refund or exchange
• Unopened and in new condition: 90 days for a refund or exchange
• Electronics such as no-contract cellphones, computers, cameras and gaming systems: within 30 days
• Music, movies, video games and software that have been opened: cannot be returned but may be exchanged

17 My Favorite Places…
• Most products: 15 days
• Cell phones and devices with a carrier contract: 14 days
• Wedding registry items: 60 days
• Most products may be returned within 30 days of original purchase date
• Caveats
  – Original receipt must accompany any product to be returned / exchanged
  – Must be in original box with original accessories, packaging, manuals, and registration card in undamaged, clean, and brand new condition
  – Prepaid wireless phones are returnable ONLY if unopened

19 Core Research Ideas
Levels (arranged along a DIFFICULTY axis):
• Network: the network communications involved between the device and its control/management software
• Software: the UI/UX and backend software required to make the device operable
• Platform: the combined operating system and compute platform of the device and its hosting provider
• Hardware: the physical hardware of the device
• Circuitry: how the schematics and hardware code make it work

20 Example Questions By Research Level
• Circuitry (machine code, components, etc.)
  – Known (or unknown) issues with machine code or base components?
• Hardware (docs, schematics, etc.)
  – Usable information provided by direct hardware I/O?
• Platform (cloud, VM, mobile, OS, etc.)
  – Does the way it’s hosted make it vulnerable?
• Software (web app, mobile app, etc.)
  – Does the software have any flaws? Plaintext data available?
• Network (packets/flows, DNS)
  – Where does it call out to? Is it encrypted?

21 The Research Team Hardware Expertise Network Expertise Software Expertise

23 The Research “Team”? Hardware Expertise Network Expertise Software Expertise

24 What If I’m Missing $TYPE Expertise?
• Just because you’re missing hardware, software, or network knowledge doesn’t mean you’re stuck
• Make do with what you have
  – “My/our research focuses on the network and software capabilities of ACME webcams”
• Invite other researchers to contribute to your body of work with hardware/software/network knowledge

25 Sample IoT Device Research Workflow Firmware Software download site
IoT device Mobile app device Analysis of firmware, app code, etc. Software Analysis station

26 Sample IoT Device Research Workflow: Networking
• Hub or switch with span port
• External router
• Internal router
• Storage for pcaps, logs, etc.
• Analysis station

27 Sample IoT Device Research Workflow IoT camera Networking

28 Sample IoT Device Research Workflow: Networking
• Cloud hosting infrastructure
• Bidirectional device communications
• Capture all network traffic
  – IP, domains, URLs
  – Ports, protocols
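
Once traffic has been captured, the IPs, domains, ports, and protocols can be summarized to start answering "Where does it call out to?" and "Is it encrypted?". The sketch below assumes the flows were already exported from a capture into simple tuples. The record layout, the sample values, and the `LIKELY_PLAINTEXT_PORTS` set are all hypothetical, and a port number is only a hint about encryption, never proof.

```python
from collections import Counter

# Hypothetical flow records, as might be exported from a packet capture:
# (destination IP, destination domain, destination port, transport protocol).
# Addresses use documentation ranges; none of this is real capture data.
flows = [
    ("192.0.2.10", "api.example.com", 443, "tcp"),
    ("192.0.2.10", "api.example.com", 443, "tcp"),
    ("198.51.100.7", "time.example.net", 123, "udp"),
    ("203.0.113.5", "upload.example.org", 80, "tcp"),
]

# Ports that conventionally carry unencrypted traffic. This is a heuristic:
# a port number alone never proves whether a session is encrypted.
LIKELY_PLAINTEXT_PORTS = {21, 23, 80}


def summarize(records):
    """Count destinations and port/protocol pairs, and flag likely-plaintext flows."""
    domains = Counter(domain for _, domain, _, _ in records)
    services = Counter((port, proto) for _, _, port, proto in records)
    plaintext = sorted({domain for _, domain, port, _ in records
                        if port in LIKELY_PLAINTEXT_PORTS})
    return domains, services, plaintext


domains, services, plaintext = summarize(flows)
print(domains.most_common())   # which domains the device talks to most
print(services.most_common())  # which port/protocol pairs it uses
print(plaintext)               # destinations worth inspecting for cleartext data
```

Anything flagged as likely plaintext is a candidate for a closer look in the capture itself, for example by following the stream in Wireshark.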

29 Sample IoT Device Research Workflow: Software
• Analysis station
• Cloud infrastructure: UI, API, open/vulnerable ports, etc.
• Device: UI, API, open/vulnerable ports, etc.

30 Sample IoT Device Research Workflow Device firmware Mobile app
device Hardware Analysis station

31 Suggested Lab Equipment and Software
Networking
• Hub, tap, or switch with span port
• 2 x WiFi routers
• RJ45 cables
• Packet capture (tcpdump, Wireshark, etc.)
Software
• Debugger (GDB, WinDbg, IDA Pro, OllyDbg, Immunity)
• Fuzzers (w3af, Wfuzz, Peach Fuzzer, etc.)
• Vuln and port scanners (Nessus, nmap, ssl_connect, telnet, etc.)
Hardware
• Oscilloscope, logic analyzer, USB-to-serial adapter, JTAGulator
• Software defined radio, SmartRF protocol packet sniffer
• Other HW/SW mentioned here: https://www.blackhat.com/docs/webcast/04232014-tools-of-the-hardware-hacking-trade.pdf

33 Knowing When To Stop
• There are four main reasons to stop your research
  – Artificial or organic time constraints
    • e.g. conference deadline or marketing campaign launch
  – Diminished relevancy of research
    • e.g. hypothesis proven/disproved by someone else
  – Success
    • You have proven your hypothesis
  – Failure
    • You have disproved your hypothesis

34 What Does Success Look Like?
• Defining success is a crucial part of any research project
• Examples of successful research outcomes include:
  – New knowledge is created
  – Decisions or policies are made based on the outcome of the experiment
  – A report, presentation, or application/script with impact is created
  – It is learned that the data can't answer the question being asked of it

35 What Does Failure Look Like?
• Not every research project is a success
• Don’t look at a negative outcome as failure
  – It’s just validation that your experiment resulted in a negative result
• Some negative outcomes include:
  – Decisions being made that disregard clear evidence from the data
  – Equivocal results that do not shed light in one direction or another
  – Uncertainty prevents new knowledge from being created

37 Research Pipeline Model
Source: “Computational and Policy Tools for Reproducible Research” - Roger D. Peng, PhD

38 Presenting Your Research: Print
• Blogs
  – Effective way of sharing research with the masses in bite-sized chunks
    • Tip: create a series of posts to keep readers coming back
• Articles
  – Publish your findings in a popular print or online publication
    • Tip: offer an exclusive to a trusted reporter ahead of announcing via a blog post

39 Presenting Your Research: Print
• Academic paper
  – Effective way of sharing research with academically inclined peers
  – Required for some conference submissions
• Whitepaper
  – Effective way of sharing research with customers
  – Also a consumable medium for those who want more details

40 Presenting Your Research: Print
• Hierarchy of information: Academic paper
  – Title / Author list
  – Abstract
  – Body / Results
  – Supplementary Materials / the gory details
  – Code / Data / really gory details
Source: https://github.com/DataScienceSpecialization/courses/blob/master/05_ReproducibleResearch/LevelsOfDetail/index.md

41 Presenting Your Research: Visualizations
• Charts
  – Excel, Google Sheets, Numbers for Mac
  – RAW - http://app.raw.densitydesign.org/
• Interactive
  – Tableau Public - https://public.tableau.com/
  – OpenGraphiti - http://www.opengraphiti.com/
• Graphics
  – Pictures
    • e.g. device tested, concepts, you working on the research, etc.
  – Infographics are good for communicating big picture concepts

42 Presenting Your Research: In Person
• Conference talks
  – Very common (and important) in the security community
  – Allows the audience to question your research and methods
    • Could be stressful
    • Be open, but always point back to the data
• Press interviews
  – Know your audience
    • Tech journalist vs. “civilian” journalist
  – Remember: #SBAC
    • Sound bites and charts

43 Presenting Your Research: Code
• Code
  – Python, R, Ruby…whatever you want
  – Jupyter Notebook - http://jupyter.org/
• Revision control
  – GitHub - https://github.com/
• Publishing code allows others to reproduce and challenge your findings
• If possible, publish your source data along with the code
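
When source data is published alongside the code, a checksum manifest lets others confirm they are working from the same files. The following is a minimal sketch, not a prescribed workflow: the `build_manifest` helper and the `capture_summary.csv` file name are hypothetical, and the demo runs against a throwaway directory.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file under data_dir (by relative path) to its SHA-256 digest."""
    root = Path(data_dir)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


# Demonstration on a throwaway directory with a hypothetical data file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "capture_summary.csv").write_bytes(b"domain,count\nexample.com,2\n")
    manifest = build_manifest(tmp)
    print(json.dumps(manifest, indent=2))
```

Committing the manifest next to the code gives a challenger a quick way to check that a differing result is not simply caused by differing input data.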

44 Presenting Your Research: Code

45 Defending Your Research
• Often, your findings get challenged
• Challengers may include
  – disturbers
  – heads
  – ty grandstanders
  – Friends, colleagues, and peers
• Tips:
  – Argue with your data and your findings
  – Publish your code, process, and data (if allowed)
  – Ask for alternative findings based on the presented data
• Some people are jealous – happens

46 Defending Your Research
• Roger Peng provides a fantastic checklist that can help you identify if your research is both reproducible and defensible
• Ask yourself the following questions
  – Are we doing good science?
  – Have we documented our software environment?
  – Was any part of this analysis done by hand?
    • If so, are those parts precisely documented?
    • Does the documentation match reality?
  – How far back in the analysis pipeline can we go before our results are no longer (automatically) reproducible?
  – Have we taught a computer to do as much as possible (i.e. coded)?
  – Have we saved any output that we cannot reconstruct from original data + code?
  – Are we using a version control system?
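
One checklist item, "Have we documented our software environment?", lends itself to automation. As a rough sketch using only the Python standard library, the snippet below records the interpreter, operating system, and installed package versions. The function names and the output layout are assumptions for illustration, not part of Peng's checklist.

```python
import json
import platform
import sys
from importlib import metadata


def installed_packages():
    """Return {package name: version} for every distribution Python can see."""
    packages = {}
    for dist in metadata.distributions():
        try:
            name, version = dist.metadata["Name"], dist.version
        except Exception:  # skip distributions with missing or broken metadata
            continue
        if name:
            packages[name] = version
    return dict(sorted(packages.items()))


def describe_environment():
    """Collect interpreter, OS, and package versions as a plain dict."""
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": installed_packages(),
    }


environment = describe_environment()
# Print as JSON; in practice this could be saved and committed with the analysis code.
print(json.dumps(environment, indent=2))
```

Keeping this output under version control next to the analysis makes it easier to answer how far back the pipeline remains automatically reproducible.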

47 A Word About Responsible Disclosure
• Responsible disclosure of findings should be a higher priority than publishing your findings
• Contact the company that has their brand attached to the device
  – Email, Twitter, Phone
• Do not get frustrated if you do not get a response
  – Or if you get a negative response
  – Document the effort in your research
• Use your judgement on time-to-publish
  – Consult with legal representation if you’re not sure

49 Summary
• Anyone can research something
  – It’s whether or not you do it well that makes you a researcher
• Establish your own process and cadence for research
  – This will help you reproduce your methodology for the future
• Disclose responsibly
• Haters gonna hate
  – You will have to defend your research so be prepared to stand by your data

50 Further Reading: Courses
• Executive Data Science, Johns Hopkins University - https://www.coursera.org/specializations/executive-data-science
• Data Science, Johns Hopkins University - https://www.coursera.org/specializations/jhu-data-science
  – Course materials for the Data Science Specialization - https://github.com/DataScienceSpecialization/courses
• Codecademy - https://www.codecademy.com/

51 Further Reading: Books

52 Further Reading: Blogs
• Data Driven Security – http://datadrivensecurity.info/blog/
• Bob Rudis – https://rud.is/b/
• My blog – http://www.andrewhay.ca
• OpenDNS Security Labs – https://labs.opendns.com/blog/

Questions? Bootstrapping A Security Research Project Andrew Hay, CISO DataGravity,
Inc. [email protected] @andrewsmhay
