TPCiC 2021: ElasticSearch Exploration in Your Terminal

You've seen the pretty graphs. Visuals are great for signaling there is a problem somewhere in your system. How do you, a command line guru, go from pretty graphs to root cause analysis? Most likely you'll be reaching for paradigms from the command line: composability and a flexible, compact syntax to ask your questions. I'd like to talk more about integrating ElasticSearch-based dashboards back to the command line workflows I love.

This talk is an overview of a tool I developed while working at Booking.com to drastically reduce the time and complexity of performing incident response against rich, structured data in ElasticSearch. It was developed with the help of the security and fraud teams to perform the ad hoc queries critical for incident response. The tool served the team well and has been under active development ever since. It continues to grow in capabilities aimed at making ad hoc analysis simple, easy, and accessible to hardened command line jockeys and command line newbies alike.

Join me to learn how to bring the logging data you love back to your terminal!

Brad Lhotsky

June 09, 2021

Transcript

  1. ELASTICSEARCH DATA EXPLORATION IN YOUR TERMINAL Things you never knew

    you needed until it was too late. With your host, Brad Lhotsky
  2. WHY? @reyjrar

  3. WE HAVE KIBANA It’s an Elastic Product too! @reyjrar

  4. WE HAVE GRAFANA Such pretty, much datasources! @reyjrar

  5. AND NOW WE HAVE LOKI OR WHATEVER Flip between logs

    and graphs, oh my. @reyjrar
  6. WHY DRAG THE TERMINAL INTO THIS? @reyjrar

  7. MICE SLOW ME DOWN If you prefer a browser, cool.

    @reyjrar
  8. MY BROWSER IS NOT AN IDE ➤ Which browser? ➤

    Are you using privacy extensions? ➤ What happens when I hit “Backspace” ➤ Don’t get me started on “gestures” ➤ Bloaty and slow ➤ Prone to distraction @reyjrar
  9. THE CLI IS A WORKSPACE ➤ Which shell? ➤ Tab

    Autocomplete ➤ ReadLine ➤ dotfiles ➤ Access to a plethora of interoperable tools ➤ OK, I could MUD from my terminal ➤ Otherwise, fairly purpose built @reyjrar
  10. MY TERMINAL IS WHERE I WORK @reyjrar

  11. GUIDING LIGHTS Things that mean something to me

  12. “There are a finite number of keystrokes before you die, use them wisely.” - A Wise Programmer
  13. UNIX PHILOSOPHY ➤ Do One Thing Well ➤ Assume output

    will be used as input and vice versa ➤ Favor the creation of tools or scripts, even for seemingly one-off jobs
  14. PERL ➤ Easy things easy, hard things possible ➤ Sloppy

    and unpredictable uses just like natural languages ➤ Grow with you ➤ DWIM ➤ TIMTOWTDI ➤ The CPAN
  15. EXPLORATION being places you weren't intended to be

  16. I SAW THE POWER OF ELASTICSEARCH EARLY @reyjrar

  17. BUT IT WASN’T VERY CLI OR OPS FRIENDLY @reyjrar

  18. App::ElasticSearch::Utilities https://github.com/reyjrar/es-utils https://metacpan.org/pod/App::ElasticSearch::Utilities SO I DID A PERL @reyjrar

  19. ES-UTILS ➤ Monitoring ➤ Maintenance ➤ Status and Informational Tools

    ➤ Built as a Reusable Perl functional library ➤ ES Version Agnostic ➤ Assumes index-%Y.%m.%d index names ➤ And then came... ➤ es-search.pl @reyjrar
  20. OPTIONAL: INSTALL PERLBREW

    # Install perlbrew
    curl -L https://install.perlbrew.pl \
      | bash

    # Setup perlbrew
    perlbrew install -j8 -n 5.34.0
    perlbrew switch 5.34.0
    perlbrew install-cpanm
    @reyjrar
  21. INSTALLATION cpanm App::ElasticSearch::Utilities @reyjrar

  22. ES-SEARCH.PL Bringing ElasticSearch to your terminal since 2012 @reyjrar

  23. GETTING STARTED: RTFM es-search.pl --help es-search.pl --manual @reyjrar

  24. GETTING STARTED: CONNECTING

    # Defaults
    es-search.pl --host localhost --port 9200

    # Connect to es-node01
    es-search.pl --host es-node01
    @reyjrar
  25. GETTING STARTED: CONNECT PREFERENCES

    cat ~/.es-utils.yml
    ---
    host: es-gateway.corp.company.com
    port: 443
    proto: https
    http-username: bob
    password-exec: ~/bin/get-es-password.sh
    @reyjrar
  26. SOME HELPFUL NOTES ➤ Searches are constrained by the calendar

    date in the index name ➤ Searches use “index base names” via --base logstash ➤ Use --days 7 for opening scope to 7 days ➤ Searches will stop once they receive --size 20 results ➤ Use --all to get all results across full timespan ➤ Sort order is descending, override with --asc ➤ Target a specific index or alias with --index logstash-2019.10.21 @reyjrar
  27. GETTING STARTED: INDEX SELECTION # List index basenames $ es-search.pl

    --bases Bases available for search: access security syslog # Bases: 3 from a combined 61 indices. @reyjrar
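The earlier slides say the tools assume index-%Y.%m.%d index names and that --base and --days bound a search by the calendar date in the index name. A minimal Python sketch of that naming arithmetic follows. The function name `index_names` is hypothetical, and counting today as the first of the `days` is an assumption, not something the slides state; this is not the tool's own code.

```python
from datetime import date, timedelta


def index_names(base, days, today):
    """Return daily index names, newest first, for `days` calendar days ending at `today`."""
    return [
        f"{base}-{(today - timedelta(days=offset)).strftime('%Y.%m.%d')}"
        for offset in range(days)
    ]


if __name__ == "__main__":
    # Three days of "logstash" indices ending on the date used in the slides.
    for name in index_names("logstash", 3, date(2019, 10, 21)):
        print(name)
```

Under these assumptions a search with --base logstash --days 3 on 2019-10-21 would only need to touch logstash-2019.10.21, logstash-2019.10.20, and logstash-2019.10.19.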
  28. GETTING STARTED: SHOW ME MONEY DATA es-search.pl --base syslog --fields

    @reyjrar
  30. GETTING STARTED: SHOW ME MONEY DATA es-search.pl --base syslog --size

    1 @reyjrar
  32. GETTING STARTED: SET A DEFAULT "BASE"

    cat ~/.es-utils.yml
    ---
    host: es-gateway.corp.company.com
    base: log
    days: 1
    @reyjrar
  33. GETTING STARTED: TIMESTAMP DETECTION

    # Specify timestamp
    es-search.pl --base log \
      --timestamp timestamp

    cat ~/.es-utils.yml
    ---
    base: log
    timestamp: timestamp
    @reyjrar
  34. GETTING STARTED: TIMESTAMP PREFERENCES cat ~/.es-utils.yml --- base: log #

    Global default timestamp field timestamp: timestamp # Per base settings meta: logstash: timestamp: '@timestamp' @reyjrar
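The config on this slide has a global timestamp field plus per-base settings under meta. A small Python sketch of one plausible lookup order is shown here. It assumes the per-base value wins over the global default, which the slide comments suggest but do not spell out; `resolve_timestamp` is a hypothetical helper and the config is written as a plain dict rather than parsed from YAML.

```python
def resolve_timestamp(config, base):
    """Prefer meta.<base>.timestamp, then the global timestamp, else None."""
    per_base = config.get("meta", {}).get(base, {})
    return per_base.get("timestamp", config.get("timestamp"))


# The same settings as the slide's ~/.es-utils.yml, as a Python dict.
config = {
    "base": "log",
    "timestamp": "timestamp",
    "meta": {"logstash": {"timestamp": "@timestamp"}},
}

if __name__ == "__main__":
    print(resolve_timestamp(config, "logstash"))
    print(resolve_timestamp(config, "log"))
```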
  36. GETTING STARTED: SHOW ME MONEY DATA MORE LIKE LOGS #

    Show just selected fields es-search.pl --show hostname,program,message @reyjrar
  38. GETTING STARTED: SHOW ME MATCHING DOCS # Show sshd logs

    es-search.pl program:sshd \ --show hostname,program,message @reyjrar
  40. GETTING STARTED: MULTIPLE SEARCH PARAMETERS # Multiple parameters, AND'd es-search.pl

    program:sshd \ src_ip:181.206.20.11 \ --show hostname,program,message @reyjrar
  42. OR MULTIPLE SEARCH PARAMETERS

    # Join dangling params with OR
    es-search.pl --or program:sshd \
      src_ip:181.206.20.11 \
      --show hostname,program,message

    # Join with OR explicitly
    es-search.pl program:sshd OR \
      src_ip:181.206.20.11 \
      --show hostname,program,message
    @reyjrar
  44. GETTING STARTED: I WANT TO USE JQ # Make output

    pipe friendly to jq es-search.pl program:sshd --exists src_ip \ --jq \ | jq -r .src_ip | sort | uniq -c @reyjrar
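The pipeline on this slide counts how often each src_ip appears. For readers who want the same tally inside a script, here is a rough Python equivalent of the `jq -r .src_ip | sort | uniq -c` stage. It assumes the input is one JSON document per line; the sample documents are made up for illustration, and `count_field` is a hypothetical helper. Unlike jq, it skips documents that lack the field instead of printing null.

```python
import json
from collections import Counter


def count_field(ndjson_lines, field):
    """Count occurrences of `field` across newline-delimited JSON documents."""
    counts = Counter()
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue
        doc = json.loads(line)
        if field in doc:
            counts[doc[field]] += 1
    return counts


# Made-up sample documents, one JSON object per line.
sample = [
    '{"program": "sshd", "src_ip": "181.206.20.11"}',
    '{"program": "sshd", "src_ip": "102.0.0.7"}',
    '{"program": "sshd", "src_ip": "181.206.20.11"}',
]

if __name__ == "__main__":
    for value, count in sorted(count_field(sample, "src_ip").items()):
        print(f"{count:7d} {value}")
```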
  46. EVEN MORE OPTIMIZATIONS Short cuts to save key strokes

  47. QUERY STRING EXTENSIONS: BARE WORDS # and, or, not uppercased

    es-search.pl not program:sshd @reyjrar
  49. QUERY STRING EXTENSIONS: IP # Use CIDR Notation for IPs

    es-search.pl src_ip:102.0.0.0/8 \ --show hostname,program,src_ip,src_geoip \ --size 1 @reyjrar
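CIDR notation names a whole block of addresses, so src_ip:102.0.0.0/8 matches any address whose first octet is 102. The membership test can be sketched with Python's standard ipaddress module. This only illustrates what the notation means; it says nothing about how es-search.pl or ElasticSearch evaluate it, and `in_cidr` is a hypothetical helper.

```python
import ipaddress


def in_cidr(ip, cidr):
    """Return True when the address `ip` falls inside the network `cidr`."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    print(in_cidr("102.14.7.9", "102.0.0.0/8"))
    print(in_cidr("181.206.20.11", "102.0.0.0/8"))
```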
  51. QUERY STRING EXTENSIONS: RANGE # Range and range combos es-search.pl

    dst_port:'<1024' es-search.pl status:'<500,>=400' es-search.pl crit:'>5' \ --show hostname,program,crit,name,src_ip @reyjrar
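A range shorthand such as status:'<500,>=400' maps naturally onto an ElasticSearch range query with lt/lte/gt/gte bounds. The sketch below shows one way such a translation could look in Python. It is an illustration only: `range_clause` is a hypothetical helper, and the query that es-search.pl actually generates may differ.

```python
import re

# Comparison operators and the matching ElasticSearch range-query keys.
_OPS = {"<": "lt", "<=": "lte", ">": "gt", ">=": "gte"}


def range_clause(field, expr):
    """Turn an expression like '<500,>=400' into a range query clause."""
    bounds = {}
    for part in expr.split(","):
        match = re.fullmatch(r"\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*", part)
        if match is None:
            raise ValueError(f"not a range expression: {part!r}")
        op, number = match.groups()
        bounds[_OPS[op]] = float(number) if "." in number else int(number)
    return {"range": {field: bounds}}


if __name__ == "__main__":
    print(range_clause("dst_port", "<1024"))
    print(range_clause("status", "<500,>=400"))
```

Note that the slide quotes each expression. Unquoted, the shell would treat < and > as redirections before es-search.pl ever saw them.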
  53. QUERY STRING EXTENSIONS: TERMS PROMOTION # Don't stress the Lucene

    escapes es-search.pl =exe:'/usr/bin/yum update -y' @reyjrar
  55. QUERY STRING EXTENSIONS: PREFIX # String prefixes es-search.pl _prefix_:exe:/usr --show

    exe @reyjrar
  57. QUERY STRING EXTENSIONS: TERMS IN A FILE # Build terms

    from a TSV file, last column es-search.pl src_ip:badguys.dat # Build terms from a TSV file, first column es-search.pl src_ip:badguys.dat[0] # Build terms from a CSV file, last column es-search.pl src_ip:badguys.csv @reyjrar
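The idea on this slide is to pull one column out of a delimited file and search for all of its values at once. A hedged Python sketch of that idea is below. It builds an ElasticSearch terms query from a chosen column, defaulting to the last one as the slide describes. The helper name `terms_from_delimited` and the sample data are made up, and the tool's real parsing rules (headers, comments, de-duplication) are not covered by the slides.

```python
import csv
import io


def terms_from_delimited(text, field, column=-1, delimiter="\t"):
    """Build a terms query for `field` from one column of delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    values = [row[column] for row in reader if row]
    return {"terms": {field: values}}


# Made-up stand-in for the contents of badguys.dat (tab separated).
sample_tsv = "2021-06-01\t181.206.20.11\n2021-06-02\t102.0.0.7\n"

if __name__ == "__main__":
    print(terms_from_delimited(sample_tsv, "src_ip"))            # last column
    print(terms_from_delimited(sample_tsv, "src_ip", column=0))  # first column
```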
  59. QUERY STRING EXTENSIONS: TERMS IN A JSON DATA SET #

    Build terms from an NDJSON file .ip es-search.pl src_ip:threatfeed.json[ip] # Build terms from an NDJSON file nested field es-search.pl src_ip:threatfeed.json[actor.ip] @reyjrar
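The bracket suffix on this slide names a field, possibly nested with dots, to pull out of each NDJSON document. The following Python sketch shows how a dotted path like actor.ip could be walked to collect values for a terms query. `dig` and `terms_from_ndjson` are hypothetical helpers and the sample feed is invented; the tool's own handling of missing or non-scalar values is not described in the slides.

```python
import json


def dig(doc, dotted_path):
    """Follow a dotted path such as 'actor.ip' through nested dicts."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def terms_from_ndjson(lines, field, path):
    """Build a terms query for `field` from `path` in each NDJSON document."""
    values = []
    for line in lines:
        if line.strip():
            value = dig(json.loads(line), path)
            if value is not None:
                values.append(value)
    return {"terms": {field: values}}


# Invented stand-in for threatfeed.json, one document per line.
feed = [
    '{"ip": "181.206.20.11", "actor": {"ip": "102.0.0.7"}}',
    '{"ip": "102.0.0.9", "actor": {"name": "unknown"}}',
]

if __name__ == "__main__":
    print(terms_from_ndjson(feed, "src_ip", "ip"))
    print(terms_from_ndjson(feed, "src_ip", "actor.ip"))
```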
  60. CAN HAZ AGGREGATIONS i thought you'd never ask! @reyjrar

  61. AGGREGATION CAVEATS ➤ Supported during "facets" and ES 0.17 ➤

    Early versions of ES, up to v2.x were splodey ➤ Some limitations which I'm slowly rolling back ➤ per day ➤ Top aggregation must be a bucket ➤ Limited to 2 levels deep ➤ Well, 3 in a certain instance @reyjrar
  62. AGGREGATIONS: TOP THING

    # Top 20 programs
    es-search.pl --top program

    # Top 50 programs
    es-search.pl --top program --size 50
    es-search.pl --top program --limit 50
    es-search.pl --top program -n 50
    @reyjrar
  64. AGGREGATIONS: TOP THING PER HOUR

    # Top programs with a src_ip every 8 hours
    es-search.pl --top program _exists_:src_ip \
      --interval 8h
    @reyjrar
  66. AGGREGATIONS: TOP THING WITH ANOTHER THING

    # Top action with the top 3 countries
    es-search.pl --top action \
      --with src_geoip.country

    # Top action with the top 10 countries
    es-search.pl --top action \
      --with src_geoip.country:10
    @reyjrar
  69. AGGREGATIONS: TOP THING BY SOMETHING OTHER THAN DOC COUNT #

    Top src_ip by distinct dst countries es-search.pl --top src_ip \ --by cardinality:dst_geoip.country # Top dst_ip by the total traffic es-search.pl --top dst_ip \ --by sum:out_bytes @reyjrar
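In ElasticSearch terms, "top thing by something other than doc count" is a terms aggregation ordered by a metric sub-aggregation. The Python sketch below builds such a request body for arguments shaped like --top dst_ip --by sum:out_bytes. It is a guess at the general shape, not the request es-search.pl sends: `top_by` and the aggregation names "top" and "by" are hypothetical, and the default size of 20 is taken from the "Top 20 programs" slide.

```python
def top_by(field, by=None, size=20):
    """Build a search body for the top `size` values of `field`.

    `by` is an optional 'metric:field' string such as 'sum:out_bytes';
    when given, buckets are ordered by that metric instead of doc count.
    """
    terms = {"field": field, "size": size}
    aggs = {"top": {"terms": terms}}
    if by is not None:
        metric, metric_field = by.split(":", 1)
        terms["order"] = {"by": "desc"}
        aggs["top"]["aggs"] = {"by": {metric: {"field": metric_field}}}
    # size 0 skips returning hits; only the aggregation results are wanted.
    return {"size": 0, "aggs": aggs}


if __name__ == "__main__":
    import json

    print(json.dumps(top_by("dst_ip", by="sum:out_bytes"), indent=2))
```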
  72. AGGREGATIONS: WHERE'S MY DATA GOING

    # Top src_ip by the total traffic
    # With top dst_ip
    es-search.pl --top src_ip \
      --by sum:out_bytes \
      --with dst_ip:1
    @reyjrar
  74. AGGREGATIONS: STATISTICS ANYONE? # Top program by average parse time

    # with a statistical summary es-search.pl --top program \ --by avg:total_time \ --with stats:total_time @reyjrar
  76. AGGREGATIONS: PERCENTILES, TOO # Top programs by average parse time

    # With median, 90, and 99th percentile es-search.pl --top program \ --by avg:total_time \ --with percentiles:total_time:50,90,99 @reyjrar
  78. AGGREGATIONS: I GOT YOUR HISTOGRAMS # Top 20 uri by

    average render time # with histogram of 100ms es-search.pl --top program \ --by avg:total_time \ --with histogram:total_time:0.01 @reyjrar
  80. AGGREGATIONS: I'M ALL ABOUT SIGNIFICANCE

    # Top 20 significant uri for search
    es-search.pl --top significant_terms:uri \
      render_ms:'>1000' src_country:US

    # Top 20 significant uri for search,
    # Background is only US
    es-search.pl --top significant_terms:uri \
      render_ms:'>1000' src_country:US \
      --bg-filter src_country:US
    @reyjrar
  81. ONE MORE THING Well, maybe more than one more thing..

    @reyjrar
  82. BUILT WITH CLI::HELPERS ➤ General purpose, functional library for developing

    command line utilities in Perl ➤ Handles input ➤ Provides output customization including color support, --color ➤ Allow output tagged as data to be redirected into a file, --data-file=output.dat ➤ NoPaste support via App::NoPaste and --no-paste @reyjrar
  83. NOTES ON APP::NOPASTE ➤ CLI::Helpers will only paste to a

    service flagged as "public" if you specify --no-paste-public ➤ Subclass an App::NoPaste::Service object for your internal paste service, it's pretty simple ➤ Easily share things with colleagues directly from the command line @reyjrar
  84. PUTTING SOME THINGS TOGETHER

    # Let's say we have a list of bad ip
    es-search.pl --top src_ip \
      _prefix_:path:\/admin \
      status:'<400' \
      src_ip:threatfeed.json[ip] \
      --data-file=insidethehouse.dat
    @reyjrar
  85. PUTTING SOME THINGS TOGETHER

    # Dump a full log of what they've done
    es-search.pl src_ip:insidethehouse.dat --show src_ip,src_user,uri,out_bytes \
      --all

    # Share with your colleagues
    es-search.pl src_ip:insidethehouse.dat --show src_ip,src_user,uri,out_bytes \
      --all --no-paste
    @reyjrar
  86. PERL LIBRARY USAGE For those times when you want a

    custom fit.. @reyjrar
  87. STEAL THIS CODE @reyjrar ➤ Modules make it easy for

    you to interact with ES ➤ App::ElasticSearch::Utilities::QueryString provides all the fun query extensions ➤ App::ElasticSearch::Query provides a simple interface to execute queries ➤ All of these draw on the config file and command line switches of App::ElasticSearch::Utilities
  88. ILL FATED ATTEMPT AT LIVE DEMO IN 3. 2. 1.

  89. FUTURE PLANS ➤ Arbitrary levels of nested aggregations ➤ JSON

    output for aggregations ➤ Better support for nested documents ➤ Arbitrary data joins at query time: rdns, whois, db lookups, etc. ➤ <your idea here> @reyjrar
  90. Thank you! brad.lhotsky@gmail.com https://twitter.com/reyjrar https://github.com/reyjrar https://speakerdeck.com/reyjrar https://www.craigslist.org/about/craigslist_app @reyjrar