Slide 1

Slide 1 text

Wiff: The Wayfair Sniffer
Dan Rowe
wayfair.com
May 6, 2014

Slide 2

Slide 2 text

Who am I?
Dan Rowe
Wayfair.com
@draco2002
http://github/draco2003
Lead the Infrastructure Tools Teams
- InternalTools : Customers are Employees
- DevTools : Customers are Engineers
Next Monitorama in New England? Boston 2015? I’ll bring reptiles :)

Slide 3

Slide 3 text

Even Cats like Tegus

Slide 4

Slide 4 text

Primary Engineers: Shawn Nichols and Nishan Subedi
http://github.com/shnichols
http://github.com/nishansubedi

Slide 5

Slide 5 text

Who is Wayfair?
• Online retailer of home goods.
• Offers more than 7 million products.
• More than 16 million site visitors per month.
• In the past year the company grew 55%.
• 2013 sales reached $915 million.

Slide 6

Slide 6 text

Setting the Wayfair Environment Stage
High level pieces of the puzzle:
* Active / Active DC
* Primarily Load-balancer -> PHP WebFarm
* Everything else is a Heterogeneous Environment (PHP, Python, .NET, Java, Appliances running on Linux and Windows)

Slide 7

Slide 7 text

Logging Overview
Syslog Commits Network Traffic App Logs (gelf) Unique Request ID Customer ID Files Involved Traffic Involved

Slide 8

Slide 8 text

Monitoring / Alerting Overview
Syslog App Logs (gelf) Commits Network Traffic Ad Hoc Query Alerts HUD Dashboards

Slide 9

Slide 9 text

So how about Wiff already??

Slide 10

Slide 10 text

What is Wiff?
• Out of band network traffic sniffer and analyzer.
• Kind of like Wireshark as a service.
• Currently in production as Beta.
• Essentially it is a Packet processing pipeline.

Slide 11

Slide 11 text

Super High Level Overview
• Feed Packets in
• Process Packets
• Feed Data out
• Report/Analyze
• $$$ Profit $$$

Slide 12

Slide 12 text

Feed Packets in
• Packets can be fed into Wiff in multiple ways:
  • Network interface
  • pcap file (or ring buffer of tcpdump files)
  • RabbitMQ
• Egress or ingress traffic: if they are packets, we'll take 'em all.

Slide 13

Slide 13 text

Process Packets
• Based on protocol and processors enabled, it sifts through the packets.
• Currently HTTP, HTTPS* and basic TCP are supported.

Slide 14

Slide 14 text

HTTPS
• Requires keys to the kingdom
• Need to map IP to key file in config
• Not all SSL ciphers supported, but most are easy to add.
• We don't store request or response bodies, but you can…
• This is alpha as we improve performance at full volume.

Slide 15

Slide 15 text

Our typical HTTP Processing Flow
• Packets are fed in.
• Wiff keeps track of connections.
• Orders the packets by sequence number.
• Stitches the payloads.
• Decrypts if needed.
• The stream is then parsed into a request / response pair and sent to Elasticsearch.
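The ordering and stitching steps of this flow can be illustrated with a minimal Python sketch. It is not Wiff's implementation (Wiff ships as a jar); the `Segment` type and `stitch` function are hypothetical names, and the sketch assumes a single direction of one connection and ignores 32-bit sequence-number wraparound.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int        # TCP sequence number of the first payload byte
    payload: bytes  # payload carried by this packet


def stitch(segments, initial_seq):
    """Order segments by sequence number and join the contiguous payload.

    Stops at the first gap, since a dropped packet makes the rest of the
    stream un-stitchable in this simplified model.
    """
    stream = bytearray()
    next_seq = initial_seq
    for seg in sorted(segments, key=lambda s: s.seq):
        end = seg.seq + len(seg.payload)
        if end <= next_seq:
            continue  # retransmission: these bytes are already in the stream
        if seg.seq > next_seq:
            break  # gap: at least one packet is missing
        # Skip any bytes that overlap data we already have.
        stream += seg.payload[next_seq - seg.seq:]
        next_seq = end
    return bytes(stream)
```

Segments may arrive out of order or more than once; the function returns the longest contiguous prefix of the stream starting at `initial_seq`.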

Slide 16

Slide 16 text

Feed Data Out
• Reporters are used to send the processed data somewhere.
• Our primary use is sending to Elasticsearch (via RabbitMQ).
• Parses the stitched TCP stream into a JSON object of a request / response pair.
• Example reporter for sending to Elasticsearch, for Windows / low-volume usage.
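As a rough illustration of turning a stitched stream into a JSON request / response pair, here is a small Python sketch. The field names (`request`, `response`, `body_bytes`, and so on) and the function names are assumptions made for this example, not Wiff's actual document schema. In keeping with the HTTPS slide, it records only the body length rather than the body itself.

```python
import json


def parse_message(raw: bytes) -> dict:
    """Split a raw HTTP/1.x message into start line, headers, and body size."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Simplification: a repeated header overwrites the earlier value.
        headers[name.strip().lower()] = value.strip()
    return {"start_line": lines[0], "headers": headers, "body_bytes": len(body)}


def to_document(request_raw: bytes, response_raw: bytes) -> str:
    """Build one JSON document describing a request / response pair."""
    req = parse_message(request_raw)
    resp = parse_message(response_raw)
    method, path, _version = req["start_line"].split(" ", 2)
    status = int(resp["start_line"].split(" ", 2)[1])
    doc = {
        "request": {"method": method, "path": path, "headers": req["headers"]},
        "response": {
            "status": status,
            "headers": resp["headers"],
            "body_bytes": resp["body_bytes"],
        },
    }
    return json.dumps(doc, sort_keys=True)
```

A reporter would then hand the resulting string to whatever transport it uses, such as a message queue or an Elasticsearch client.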

Slide 17

Slide 17 text

Reporting / Analyzing / Alerting
• Wiff is only the beginning of the pipeline.
• Kibana friendly data format
• Example/Pre-configured dashboards coming soon.
• It’s in Elasticsearch, analyze to your heart's content.
• Alert:
  • Tattle for Elasticsearch? (that's another talk ;) )
  • Whatever you use now for alerting from ES queries.

Slide 18

Slide 18 text

Yeah great, whatever…
• Webserver X can log this data
• Application Y can log this data
• Wiff is a companion tool
• Not a replacement for logging at lower levels.

Slide 19

Slide 19 text

Where does it go?
• You tell me
• Fits where you need it.
• Different configuration scenarios.
• Choose your own adventure.

Slide 20

Slide 20 text

Configuration: In front of the Load Balancer

Slide 21

Slide 21 text

Benefit: See what others can't
• Who sees the errors or logs if the load balancer is misconfigured or erroring? (Other than the customer)
• Web servers can only log the requests they see.
• Web servers can only log the requests they complete.
• Apache / Nginx don't write a log line on segfault, etc.
• Applications can only log requests they complete.
• Logging not up high enough when needed? Set-Cookie, anyone?

Slide 22

Slide 22 text

Benefit: Realtime traffic monitoring
• Gives realtime visibility into all traffic.
• Without slowing anything down
• Without the need to change any other systems

Slide 23

Slide 23 text

Benefit: Out of band MOAWSL
• Some environments have
  • a farm of web servers handling requests.
  • multiple types of web servers handling requests.
  • appliances handling some portion of requests.
  • lots of different log formats.
• Single Pane of glass/Single format of data.

Slide 24

Slide 24 text

Configuration: Outbound traffic watcher

Slide 25

Slide 25 text

Benefits: On the box reporting / Monitoring
• Runs on Windows boxes to watch proprietary software.
• Third Party Appliance / External API call latency
• Packet RTT
• Frequency of requests
• Tracking / Investigating desktop traffic.

Slide 26

Slide 26 text

Demo (screenshots)

Slide 27

Slide 27 text

Demo (screenshots)

Slide 28

Slide 28 text

Demo (screenshots)

Slide 29

Slide 29 text

Demo (screenshots)

Slide 30

Slide 30 text

ToDos:
• Improve SSL Decryption Performance.
• Roll out and test distributed processing for scaling.
• Add additional Protocol Parsers (SMTP, FTP, DNS, etc…)
• Add additional Reporters

Slide 31

Slide 31 text

Notes:
• Monitor dropped packets to reduce un-stitchable requests.
• HTTP Parser does not currently support SPDY or WebSockets.
• Your mileage may vary, pull requests for your environment welcome.
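One way to see why dropped packets matter is to look for holes in the captured sequence space. The Python sketch below is an illustration only, not part of Wiff: `missing_ranges` is a hypothetical helper that takes `(sequence number, payload length)` tuples for one direction of a connection, ignores sequence-number wraparound, and cannot detect packets lost after the last captured segment.

```python
def missing_ranges(segments, initial_seq):
    """Return (start, end) sequence ranges not covered by any captured segment.

    `segments` is an iterable of (seq, length) tuples; `end` is exclusive.
    """
    gaps = []
    next_seq = initial_seq
    for seq, length in sorted(segments):
        if seq > next_seq:
            gaps.append((next_seq, seq))  # bytes we never captured
        next_seq = max(next_seq, seq + length)
    return gaps
```

A non-empty result for a connection means the request or response on that stream cannot be fully stitched, so counting such connections gives a rough measure of capture loss.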

Slide 32

Slide 32 text

Thanks to all the creators of the images used in the presentation:
https://www.flickr.com/photos/intvgene/370973576
http://commons.wikimedia.org/wiki/File:Sausage_making-H-4.JPG
https://www.flickr.com/photos/mevs/4607680584/
https://www.etsy.com/listing/175624772/
http://commons.wikimedia.org/wiki/File:Master_lock.JPG
http://en.wikipedia.org/wiki/File:Colombo.Express.wmt.jpg
http://en.wikipedia.org/wiki/File:Report-edit.svg
http://malc50.blogspot.com/2011/12/whatever.html
https://www.flickr.com/photos/tt2times/2568645910/
http://en.wikipedia.org/wiki/File:I-80_Eastshore_Fwy.jpg
http://www.thecatsite.com/t/195403/bastian-and-the-tegu
http://www.flickr.com/photos/streamishmc/4793978336/
http://hikethegiant.blogspot.com/2010/08/round-top-mountain-kennebec-highlands.html

Slide 33

Slide 33 text

Yes, we are hiring: http://wayfair.com/careers
Tell them DRowe sent you!
Check out the repo at: https://github.com/wayfair/wiff
If you don’t want to build it yourself, we’ve tagged a release so you can grab the jar: https://github.com/wayfair/wiff/releases/