Detecting Insider Threats with Elastic

Adam Reeve, Principal Architect
Russell Snyder, Principal Engineer
RedOwl
October 19, 2016

Introduction Adam Reeve, Principal Architect

“Treating insiders as a technology problem ignores the human aspects of motivation and behavior.”
Joseph Blankenship, Forrester Research

The Insider Threat Kill Chain

▸  Actor: Employee, Contractor, Manager
▸  Motivation: Money, Power, Revenge, Blackmail
▸  Tool: Account Privilege, Physical Access, Admin
▸  Vulnerability: Culture, Policy, Flaws
▸  Target: PII, IP, Keys, Money, Plans
▸  Recon: Scan, Research, Probe
▸  Action: Read, Copy, Destroy, Modify, Steal
▸  Result: Reputational Damage, Financial Loss, Outage

Insiders aren’t Metadata, they’re People

▸  Entitled Insider | IP Thief: “Recruited by a competitor. Took client lists, product ideas, internal working documents - everything he’d ever been a part of.”
▸  Blackmailed Developer | PII Thief: “Social media posts about financial troubles led a ‘recruiter’ to contact her. Simple requests quickly escalated into blackmail.”
▸  Disgruntled Employee | Saboteur: “Huge fight with boss. Quit and deployed time-bomb corrupting our HR system, inserted false transactions in a client back-end system.”

RedOwl Dashboard

This is a Hard Problem

Unstructured data
▸  Many formats, many sources
▸  Some common, some unique
▸  Hundreds of thousands of events per day, inconsistent ordering
Complex analytics
▸  Create valuable, actionable insight
Diverse deployment model
▸  Public Cloud
▸  Client Private Cloud
▸  Client Bare Metal

The RedOwl Stack

Data Flow diagram labels: ETL, Analytics, Ingest, Data Store, Application

Identifying Risky Behavior Russell Snyder, Principal Engineer

Early Days of RedOwl

Reveal Analytics Service (c. 2012)
▸  Custom Hadoop workflow
▸  Heavy-weight inferential statistics engine
▸  Such cool, many novel
So many issues…
▸  Resource intensive
▸  Difficult to configure/manage
▸  Interpretability of results

Enter Elastic

What questions are we really trying to answer?
▸  What is “risky” behavior?
▸  Which entities are being “risky”?
Example: IP Theft
▸  What are the behaviors one might exhibit?
   •  Late night activity
   •  Use of internal keywords
   •  Use of attachments
   •  Data export
▸  How do we model these?

Anomaly Detection with Elastic

Features
▸  Scoring events on ingest
▸  E.g. - “late night”, “project keywords”, etc.
Models
▸  Define what “risk” means
▸  Aggregate features by entity at runtime
▸  Pre-canned models capture most common definitions of “risk”
Key Differentiators
▸  Lighter, faster, more configurable
▸  Makes customization possible
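The "scoring events on ingest" idea can be illustrated with a minimal sketch. The slides do not show how RedOwl computes feature scores, so everything here is an assumption made for illustration: the `timestamp` field, the 22:00 to 05:00 late-night window, the keyword count standing in for whatever text scoring is really used, and the percentile computed against a sorted history of past scores.

```python
from bisect import bisect_right
from datetime import datetime


def late_night_score(timestamp, start_hour=22, end_hour=5):
    """Return 1 if the event falls in the (assumed) late-night window, else 0."""
    hour = timestamp.hour
    return 1 if (hour >= start_hour or hour < end_hour) else 0


def keyword_score(text, keywords):
    """Count case-insensitive occurrences of watched keywords (a simple stand-in)."""
    lowered = text.lower()
    return sum(lowered.count(keyword.lower()) for keyword in keywords)


def percentile_rank(score, sorted_history):
    """Fraction of historical scores that are <= score; history must be sorted."""
    if not sorted_history:
        return 0.0
    return bisect_right(sorted_history, score) / len(sorted_history)


def score_event(event, keywords, history):
    """Attach a 'features' block shaped like the one in the event document slide."""
    timestamp = datetime.fromisoformat(event["timestamp"])  # hypothetical field
    text = " ".join(a["content"] for a in event.get("attachments", []))
    raw_scores = {
        "late_night": late_night_score(timestamp),
        "client_names": keyword_score(text, keywords),
    }
    event["features"] = {
        name: {
            "score": value,
            "percentile": percentile_rank(value, history.get(name, [])),
        }
        for name, value in raw_scores.items()
    }
    return event
```

Because the features are written into the event before it is indexed, a model defined later only has to combine stored numbers rather than reprocess raw content.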

Printed Data Breach: Event Document

{
  "type": "printlog",
  "entities": [
    {"roleId": "user", "entity": "John Doe"},
    {"roleId": "device", "entity": "MX-1234K"}
  ],
  "attachments": [
    {"name": "client-list.xslx", "content": "Client,Addr,..."}
  ],
  "features": {
    "late_night": {"score": 1, "percentile": 0.999},
    "client_names": {"score": 4.782, "percentile": 0.971}
  }
}

Printed Data Breach: Query

Function Score Query over Features
▸  Each feature gets its own function
▸  Function scores combine to make model score
▸  Feeds into aggregation
Filtered Query for Subsetting
▸  Since the corpus is represented by the query, filter to score different groups of events
   •  e.g. - organizational unit, country, etc.
▸  Can subset by any event-level metadata
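A function score query of this kind can be sketched as a plain Python dictionary in Elasticsearch query DSL. This is a sketch under assumptions, not RedOwl's actual query: the helper name `build_model_query` is hypothetical, `boost_mode: replace` and `missing: 0` are choices made here so that the final score is just the sum of the feature values, and a `bool` query with a `filter` clause is used in place of the older `filtered` query named on the slide.

```python
def build_model_query(feature_fields, filters=None):
    """Build a function_score query with one field_value_factor per feature.

    feature_fields: document fields holding numeric feature values.
    filters: optional list of filter clauses that restrict the scored corpus.
    """
    functions = [
        {"field_value_factor": {"field": field, "missing": 0}}
        for field in feature_fields
    ]
    # Filter clauses narrow the corpus without contributing to the score.
    base_query = {"bool": {"filter": filters}} if filters else {"match_all": {}}
    return {
        "function_score": {
            "query": base_query,
            "functions": functions,
            "score_mode": "sum",      # combine the per-feature functions by adding them
            "boost_mode": "replace",  # assumption: ignore the base query's own score
        }
    }


# Example: score only print-log events using two stored feature percentiles.
ip_theft_query = build_model_query(
    ["features.late_night.percentile", "features.client_names.percentile"],
    filters=[{"term": {"type": "printlog"}}],
)
```

Swapping the `filters` list is what lets the same model score a different subset of events, for example one organizational unit or country, provided those values exist as event-level fields.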

Printed Data Breach: Aggregation

Aggregate model score by user...
▸  Bucket by user
   •  Nested (roles)
   •  Filter (role = user)
   •  Terms (role.entity)
   •  Reverse Nested
▸  Stats across “_score”
   •  We use sum or average
...or bucket by literally anything else.
▸  Create buckets across any event-level field
   •  e.g. - datacenter, domain, portfolio, etc.
▸  Great degree of flexibility
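To make the bucketing concrete, here is a small in-memory sketch of what the nested, filter, terms, reverse nested, and stats steps compute together. It does not call Elasticsearch; it is only an illustration of the intended result. The function name `aggregate_by_role` is hypothetical, the input is assumed to be pairs of (event, model score), and the `entities` / `roleId` / `entity` keys follow the event document slide.

```python
from collections import defaultdict


def aggregate_by_role(scored_events, role_id="user"):
    """Group per-event model scores by the entity holding the given role.

    scored_events: iterable of (event, score) pairs, where score plays the
    part of the query-time "_score" for that event.
    """
    buckets = defaultdict(list)
    for event, score in scored_events:
        # "Nested" + "Filter": look inside the role entries and keep one role.
        # A set is used so an entity is counted once per event, mirroring the
        # reverse nested step that joins back to the parent event.
        entities = {
            entry["entity"]
            for entry in event["entities"]
            if entry["roleId"] == role_id
        }
        # "Terms": one bucket per distinct entity.
        for entity in entities:
            buckets[entity].append(score)
    # "Stats": summarize the event scores collected in each bucket.
    return {
        entity: {
            "count": len(scores),
            "sum": sum(scores),
            "avg": sum(scores) / len(scores),
        }
        for entity, scores in buckets.items()
    }
```

Bucketing by something other than the user only changes the key: passing `role_id="device"` groups the same scores by printer instead of by person.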

Printed Data Breach: Feature Model Query

IP Theft
Query: function_score( )
   Score Mode: sum
   Function: field_value_factor( )
      Field: features.late_night.percentile
   Function: field_value_factor( )
      Field: features.client_names.percentile
Aggregation: nested( )
   Path: roles
   Aggregation: filter( )
      Query: roles.roleId:user
      Aggregation: terms( )
         Field: roles.entity
         Aggregation: reverse_nested()
            Aggregation: stats( )
               Script: _score
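The query and aggregation tree on this slide could be written out as a single search request body along the following lines. This is a sketch, not the request RedOwl actually sends: the field names and the `roles` path come from the slide, while the aggregation names (`roles`, `users_only`, `by_entity`, `events`, `model_score`), the `top_n` parameter, `boost_mode`, and `missing` are assumptions. The exact script syntax for reading `_score` varies between Elasticsearch versions, so the bare string form used here may need adjusting. Note also that the event document slide names the array `entities` whereas this slide uses `roles`; the sketch follows this slide.

```python
import json

FEATURE_FIELDS = [
    "features.late_night.percentile",
    "features.client_names.percentile",
]


def build_ip_theft_request(top_n=10):
    """Assemble the function_score query plus the per-user aggregation tree."""
    query = {
        "function_score": {
            "functions": [
                {"field_value_factor": {"field": field, "missing": 0}}
                for field in FEATURE_FIELDS
            ],
            "score_mode": "sum",
            "boost_mode": "replace",  # assumption: score is the feature sum only
        }
    }
    aggs = {
        "roles": {
            "nested": {"path": "roles"},
            "aggs": {
                "users_only": {
                    "filter": {"term": {"roles.roleId": "user"}},
                    "aggs": {
                        "by_entity": {
                            "terms": {"field": "roles.entity", "size": top_n},
                            "aggs": {
                                "events": {
                                    # Join back to the parent event so that
                                    # its _score can be summarized.
                                    "reverse_nested": {},
                                    "aggs": {
                                        "model_score": {
                                            "stats": {"script": "_score"}
                                        }
                                    },
                                }
                            },
                        }
                    },
                }
            },
        }
    }
    # size 0: only the aggregation results are needed, not the hits.
    return {"size": 0, "query": query, "aggs": aggs}


request_body = json.dumps(build_ip_theft_request(), indent=2)
```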

Oh the Possibilities...

Models can be defined after ingest
▸  Allows “risk” to mean more or less anything
▸  Limited to features we can score
Elasticsearch powers exploration of context
▸  E.g. - “Show me all of the riskiest entity’s activity”
▸  Denormalization makes this possible
How do you have that much memory?
▸  We don’t, we use UX/UI/workflow
▸  Keeps users from doing dumb things
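The "show me all of the riskiest entity's activity" step might look like the sketch below: pick the top bucket from an aggregation response, then build a follow-up query for that entity's events. The helper names and the aggregation names in the response path (`roles`, `users_only`, `by_entity`, `events`, `model_score`) are hypothetical and would have to match whatever names the original request used; the `roles.roleId` and `roles.entity` fields follow the feature model query slide.

```python
def riskiest_entity(agg_response):
    """Return the entity whose events have the highest summed model score."""
    buckets = (
        agg_response["aggregations"]["roles"]["users_only"]["by_entity"]["buckets"]
    )
    if not buckets:
        return None
    top = max(buckets, key=lambda bucket: bucket["events"]["model_score"]["sum"])
    return top["key"]


def activity_query(entity, role_id="user"):
    """Build a search body matching every event where the entity holds the role."""
    return {
        "query": {
            "nested": {
                "path": "roles",
                "query": {
                    "bool": {
                        "filter": [
                            {"term": {"roles.roleId": role_id}},
                            {"term": {"roles.entity": entity}},
                        ]
                    }
                },
            }
        }
    }
```

Because each event carries its own entities and feature scores, the follow-up query needs no join against a separate user table, which is one way to read the slide's point about denormalization.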

RedOwl in Practice

2016 Case Study on RedOwl
▸  Global private equity firm
▸  Identified multiple cases of undesirable behavior
   •  Negligent sharing of information internally
   •  Malicious theft of private information by departing employees
▸  Differentiators
   •  Unstructured data analysis
   •  High-fidelity reporting
   •  Faster, comprehensive investigation

RedOwl + Elastic OEM = <3

Elastic Product Suite
▸  Marvel
   •  A true blessing for performance testing
▸  Shield
   •  User-based “entitlements”
▸  Watcher
   •  Alerts on cluster stability and/or risk scores
Elastic Support
▸  Quick Start w/ Architectural review
▸  Client confidence w/ Level 1
▸  Engineering confidence w/ Level 3

www.elastic.co

Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/
Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders.
