
Splunk Chalk Talk

Chalk talk on Using Splunk for Vertica

Tim Hartmann

March 10, 2015


Transcript

  1. What is Splunk? • Hugs and Unicorns • For those that wish to play the Splunk home game: https://github.com/tfhartmann/splunk101
  2. What Is Splunk? • An engine that indexes time-series ASCII events, providing strong searching, reporting, and analytical tools. • A Sonic Screwdriver • What Splunk *isn’t*: a database
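As a sketch of the pipelined searching the slide refers to, a minimal SPL query might look like the following (the index, sourcetype, and search term are illustrative, not from the deck):

```
index=main sourcetype=syslog "error"
| stats count BY host
| sort -count
| head 10
```

Each stage pipes its results into the next, UNIX-style, which is the property later slides call out as the key to usability.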
  3. What is Splunk? [Architecture diagram: network devices (routers, switches, firewalls) and network appliances (TACACS, VPN, SourceFire, StealthWatch) send events over TCP/UDP; Splunk forwarders on Linux, Solaris, and Windows servers feed splunkindex1–4; user searches run through splunksearch1–2; filtered, selectively unfiltered, and unfiltered copies go to various destinations (Qradar, LCE, Nessus, Opsware), with pages and emails as outputs. Phase 4, version 2.] • Events are sent into indexers via agents, files, or by listening on open ports • Users interface with Splunk through the GUI, CLI, or API on “Search Heads”
  4. Track -A- MAC • As a SysAdmin I want to automate tracking and notification of stolen hardware so that I can alert the Police if it shows up on campus. • When a laptop is stolen it will either stay on the campus network or join a managed network somewhere in town; when this happens we want to notify the campus PD so that they can respond. • Originally this process was manual, or a cron job that polled network devices grep’ing for a match. This then started a chain of manual procedures to notify the appropriate officer.
  5. Track -A- MAC v2.0 • Search was critical to automating the process. • Notifications: Splunk allows for built-in notifications through emails and custom scripts • Sub-searches, which allowed us to push a CSV from another app (essential) • Lookups allowed us to integrate multiple tools within a single search query • Campus Map API • Custom “Who Is” app which looked up switch information
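A hypothetical version of the watch-list search this slide describes might combine lookups into one alert-ready query (the lookup tables and field names here are invented for illustration, not the team's actual search):

```
index=network sourcetype=dhcp
| lookup stolen_hardware_watchlist mac AS src_mac OUTPUT owner status
| where status="stolen"
| lookup switch_portmap port AS src_port OUTPUT building room
| table _time src_mac owner building room
```

An alert configured on this search returning results could then fire an email or run a custom notification script, per the notification mechanisms the slide lists.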
  6. Track -A- MAC - Impact • By automating the process we were able to: • Quickly alert the appropriate authorities and recover stolen hardware. • Provide a self-service portal for the PD and Security and streamline the process of adding stolen hardware to the watch list.
  7. Spammers! • As an email admin I want to locate compromised accounts so that mail performance doesn’t suffer when spammers compromise a user account to send spam. • Our UNIX team ran into an issue where they were alerted that mail performance was suffering, most often caused by a compromised user account being used to relay spam from the mail servers.
  8. Spammers! • Using the standard deviation of login events was essential to catching spammers early. • The UNIX pipeline style of the search language allowed administrators without a statistics background to use existing tools and data. • The ability to trigger an external account lockout event through a script was essential
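A standard-deviation search of the kind described above could be sketched like this in SPL (the index, sourcetype, field names, and the 3-sigma threshold are assumptions for illustration):

```
index=mail sourcetype=auth action=success
| bin _time span=1h
| stats count AS logins BY user, _time
| eventstats avg(logins) AS avg_logins, stdev(logins) AS stdev_logins BY user
| where logins > avg_logins + 3 * stdev_logins
| table _time user logins avg_logins stdev_logins
```

Because `stats`, `eventstats`, and `where` read like pipeline filters, an admin can build this without knowing the statistics underneath; an alert action on the result could then run the external lockout script.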
  9. Spammers! - Impact • By analyzing users’ actual usage patterns and acting when that pattern changed erratically we were able to address a problem before it became service-impacting. • We used role-based access controls to ensure that sensitive data was only accessible to the appropriate teams. • Our SysAdmins were happier; hugs and Unicorns for all!
  10. DMCA Policy Violations • As an IT security officer I want to respond to DMCA (Digital Millennium Copyright Act) violation notices so that I can comply with the organization’s legal obligations • Receiving and responding to DMCA notifications was a long and arduous process. Emails were parsed by the mail servers and forwarded to the Cyber Security team, who had to track back the offending IP address to a particular person. • A string of IT Security people, a dedicated lawyer, and multiple operations people were required to track back the relevant information • Parts of the process were automated but fragile and prone to breaking
  11. DMCA Policy Violations v2.0 • To address this we used Splunk to map back *who* had *what* IP address when. • Searching across multiple data sources - DNS, DHCP, PAT/NAT logs, wireless access points, and routers and switches - was critical • Lookups were critical in mapping MAC, IP address, and ports back to a registered user. • Macros were essential and allowed us to create custom functions that we could present through the API to the security team. • The REST API was essential in providing a consistent interface for external tools to integrate with
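The who-had-what-IP-when correlation described above might be sketched as a single search; the source types, field names, the `user_registration` lookup, and the TEST-NET example address are all assumptions for illustration:

```
index=network sourcetype=nat translated_ip="203.0.113.25" earliest=-7d
| fields _time src_ip
| join src_ip
    [ search index=network sourcetype=dhcp earliest=-7d
      | rename assigned_ip AS src_ip
      | fields src_ip mac ]
| lookup user_registration mac OUTPUT username
| table _time src_ip mac username
```

Wrapping a search like this in a macro is what lets it be exposed as a stable, parameterized function to external tools via the REST API.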
  12. DMCA Policy Violations - Impact • By completely automating the process we were able to speed the process up, freeing up staff time. • Increase the hit rate to 100% • Remove false positives • Provide self-service interfaces to our Security team through the API and RBAC
  13. Visualizing Network Data • As a Network Manager I want to know who is using eduroam so that I can plan for the future.
  14. VPN Usage • As the NOC I want dashboards on the status of our VPN service so that I can publish our service status to the IT community.
  15. NetFlow / Binary Data • As a Network/Security Engineer I want to perform analysis on NetFlow data so that I can determine patterns in network traffic and their effects • NetFlow is a Cisco binary protocol used to analyze network flows: a sequence of packets from a source computer to a destination • Although NetFlow can be translated into text output in much the same way as syslog, Splunk cannot natively ingest NetFlow streams.
  16. NetFlow / Binary Data • The volume of data produced by most networks is large enough to require larger Splunk infrastructures and more storage. • Splunk is licensed by volume of data ingested per day, which often makes NetFlow data prohibitively expensive. • Although we wanted to use Splunk, the obstacles of ingesting a binary data source and the license cost pushed us toward dedicated flow analyzers like StealthWatch
  17. Summary (Pros) • Splunk is a great tool for discovering and making use of chaotic time-based event streams • Case 1: Track -A- MAC - we were able to correlate disparate data sources, perform external lookups to enrich the data, and then trigger actions in an automated way to help recover stolen property. • Case 2: Compromised email accounts - Splunk’s presentation of common statistical tools allowed non-specialists to perform more in-depth analysis of the event stream in order to increase service availability. • Easy to install and configure - a single RPM, and configuration is done through INI files. • The UNIX-style pipeline, the ease with which data is ingested, and the ability to enrich the data allowed for a lot of flexibility. • Role-Based Access Control allowed us to comply with policies and secure our data.
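The "single RPM, INI files" point can be illustrated with a minimal forwarder configuration; the monitored path, index name, and indexer hostnames below are placeholders, not the deck's actual setup:

```ini
# inputs.conf - monitor a log file on a forwarder (example path)
[monitor:///var/log/secure]
sourcetype = linux_secure
index = os

# outputs.conf - forward events to the indexing tier
[tcpout:primary_indexers]
server = splunkindex1.example.edu:9997, splunkindex2.example.edu:9997
```

Dropping stanzas like these into place is essentially the whole agent-side setup, which is what makes the ingestion side of the pipeline easy.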
  18. Wait… what about… • When we initially started using Splunk, we were also considering the ArcSight Logger. • ArcSight proved difficult to configure even with professional services • The Logger application was Slooooow • Connectors proved fragile and didn’t match unless events were standardized in a particular way • Initially in our environment there was a strongly configured syslog-ng server whose output ArcSight was never able to consume • We weren’t easily able to answer many ‘open ended’ questions
  19. Summary (Cons) • Splunk is not a great tool for consuming binary data. • Splunk licensing is a large deterrent to adoption, especially when data volumes are high and the use cases are open-ended, for example DNS query logs or NetFlow data • Splunk is not a great tool when data sets aren’t marked with a time stamp. • While flat text files can be indexed and searched, Splunk treats each event as time-based.