Slide 1

Slide 1 text

Splunk Spark Integration
Gang Tao

Slide 2

Slide 2 text

About Me
• Software engineer with 15+ years of experience
• Now an architect working on data acquisition and cloud apps
• Previously worked on BI, ERP, and other enterprise application development
• Likes data science and open source

Slide 3

Slide 3 text

Splunk Company Overview

Company
• Global HQs:
  • San Francisco
  • London
  • Hong Kong
• 1,800+ employees globally
• Annual revenue: $450.9M (YoY +49%)
• NASDAQ: SPLK

Products
• Free trial to massive scale
• Splunk products:
  • Splunk Enterprise
  • Splunk Cloud
  • Hunk
  • Splunk Light
  • Splunk MINT
  • Premium Solutions

Customers
• 10,000+ customers
• Across 100 countries
• Small to large organizations
• More than 80 of the Fortune 100
• Largest license:
  • 400+ terabytes/day

Slide 4

Slide 4 text

Splunk - a Machine Data Platform
Platform for Machine Data
Diagram labels: Mainframe Data, VMware, Exchange, PCI, Security, Relational Databases, Mobile, Forwarders, Syslog / TCP / Other, Sensors & Control Systems, Wire Data, Mobile Intel, Splunk Premium Apps, Rich Ecosystem of Apps, MINT

Slide 5

Slide 5 text

Demo

Slide 6

Slide 6 text

Splunk Technical Stack
• Presenting
• Processing
• Store
• Acquisition

Slide 7

Slide 7 text

Splunk Deployment Architecture
• Indexer: stores data, transforms raw data into events, and searches the indexed data in response to search requests.
• Search Head: directs search requests to a set of indexers, merges the results, and presents them to the user.
• Forwarder: gets data into the indexers.
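
The three roles on this slide follow a scatter-gather pattern, which can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Splunk's implementation: the `Indexer` class, the `search_head` function, and the `_time` / `_raw` event layout used here are hypothetical names chosen for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Indexer:
    """Toy indexer (hypothetical): stores events and answers keyword searches."""
    events: list = field(default_factory=list)

    def index(self, timestamp, raw):
        # Turn a raw line into an event; this is the call a forwarder would make.
        self.events.append({"_time": timestamp, "_raw": raw})

    def search(self, keyword):
        # Return the matching events, oldest first.
        hits = [e for e in self.events if keyword in e["_raw"]]
        return sorted(hits, key=lambda e: e["_time"])


def search_head(indexers, keyword):
    """Send the search to every indexer, then merge the sorted partial results."""
    partial = [ix.search(keyword) for ix in indexers]
    return list(heapq.merge(*partial, key=lambda e: e["_time"]))


a, b = Indexer(), Indexer()
a.index(1, "error disk full")
a.index(4, "info started")
b.index(2, "error timeout")
b.index(3, "error retry")

merged = search_head([a, b], "error")
print([e["_time"] for e in merged])
```

Each indexer filters its own data, so only matching events travel to the search head, which performs a cheap merge of lists that are already sorted.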

Slide 8

Slide 8 text

Splunk VS Open Source

Slide 9

Slide 9 text

Splunk VS Open Source

Slide 10

Slide 10 text

SQL of Machine Data - SPL
SPL - Splunk Processing Language
• SQL
• *nix Pipe
• Google Search
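
To make the three influences concrete, here is a small Python sketch that imitates them on an in-memory list of events. It is an analogy only and does not use any Splunk API: the helpers `search`, `stats_count_by`, and `pipeline`, as well as the sample events, are hypothetical.

```python
from collections import Counter
from functools import reduce

events = [
    {"host": "web1", "_raw": "error disk full"},
    {"host": "web2", "_raw": "info started"},
    {"host": "web1", "_raw": "error timeout"},
]


def search(keyword):
    # Google-search flavor: keep events whose raw text contains the keyword.
    return lambda evs: [e for e in evs if keyword in e["_raw"]]


def stats_count_by(field_name):
    # SQL flavor: comparable to SELECT field, COUNT(*) ... GROUP BY field.
    return lambda evs: dict(Counter(e[field_name] for e in evs))


def pipeline(*stages):
    # *nix-pipe flavor: feed each stage's output into the next stage.
    return lambda evs: reduce(lambda data, stage: stage(data), stages, evs)


# Roughly analogous to the SPL query: search error | stats count by host
query = pipeline(search("error"), stats_count_by("host"))
result = query(events)
print(result)
```

The point of the sketch is the shape of the query: a free-text filter first, then an aggregation, chained left to right.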

Slide 11

Slide 11 text

Extensibility - Splunk App
http://apps.splunk.com/
• Enterprise Security
• ITSI
• DB Connect
• Technology Add-ons

Slide 12

Slide 12 text

Why Integration?
• Splunk to Spark
  • Data ingestion
  • Unstructured/semi-structured data indexing
  • Data processing with Splunk search
  • Data presenting
• Spark to Splunk
  • Powerful computing capability
  • Machine learning
  • Open source community

Slide 13

Slide 13 text

Solution A

Slide 14

Slide 14 text

Solution B

Slide 15

Slide 15 text

Solution C
Diagram components:
• Indexer
• Virtual Indexer (Spark)
• SPL
• Enhanced Search Command
• Spark Driver (SPL Parser)
• Spark Worker (x3)
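
One way to picture the "Spark Driver (SPL Parser)" box is a driver that parses a query once, hands the parsed stages to each worker, and merges the small partial results. The sketch below is a guess at that flow, not the presenter's implementation. It uses plain Python lists as stand-ins for Spark partitions and makes no Spark or Splunk API calls. The functions `parse_spl`, `run_on_partition`, and `driver` are hypothetical, and only a toy subset of SPL (`search <terms> | stats count by <field>`) is handled.

```python
from collections import Counter


def parse_spl(query):
    """Split a pipe-delimited query into (command, args) stages.

    Toy subset only; real SPL has quoting, subsearches, and many commands.
    """
    stages = []
    for part in query.split("|"):
        tokens = part.split()
        if tokens:
            stages.append((tokens[0], tokens[1:]))
    return stages


def run_on_partition(stages, partition):
    """What one worker would do: apply the stages to its local events."""
    data = partition
    for command, args in stages:
        if command == "search":
            data = [e for e in data if all(t in e["_raw"] for t in args)]
        elif command == "stats" and len(args) == 3 and args[:2] == ["count", "by"]:
            # Return a partial count for this partition only.
            return Counter(e[args[2]] for e in data)
        else:
            raise ValueError(f"unsupported command: {command}")
    return data


def driver(query, partitions):
    """Parse once, run per partition, then merge the partial counts."""
    stages = parse_spl(query)
    if not stages or stages[-1][0] != "stats":
        raise ValueError("this sketch only merges queries ending in 'stats count by <field>'")
    total = Counter()
    for partition in partitions:
        total.update(run_on_partition(stages, partition))
    return dict(total)


partitions = [
    [{"host": "web1", "_raw": "error disk full"},
     {"host": "web2", "_raw": "info started"}],
    [{"host": "web1", "_raw": "error timeout"},
     {"host": "web3", "_raw": "error retry"}],
]
result = driver("search error | stats count by host", partitions)
print(result)
```

Because each partition returns only its counts, the raw events never leave the worker that holds them; the driver sees a handful of numbers per partition.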

Slide 16

Slide 16 text

Challenges
• Avoid moving large volumes of data
• Keep a good user experience
• Adapt to SPL concepts