Slide 1

Slide 1 text

ELASTICSEARCH DATA EXPLORATION IN YOUR TERMINAL
Things you never knew you needed until it was too late
@reyjrar

Slide 2

Slide 2 text

TRIGGER WARNING
4 - 7 - 8
(Inhale) - (Hold) - (Exhale)

Slide 3

Slide 3 text

WHY?

Slide 4

Slide 4 text

WE HAVE KIBANA
It’s an Elastic Product too!

Slide 5

Slide 5 text

WE HAVE GRAFANA
Such pretty, much datasources!

Slide 6

Slide 6 text

AND NOW WE HAVE LOKI OR WHATEVER
Flip between logs and graphs, oh my.

Slide 7

Slide 7 text

WHY DRAG THE TERMINAL INTO THIS?

Slide 8

Slide 8 text

MICE SLOW ME DOWN
If you prefer a browser, cool.

Slide 9

Slide 9 text

MY BROWSER IS NOT AN IDE
➤ Which browser?
➤ Are you using privacy extensions?
➤ What happens when I hit “Backspace”?
➤ Don’t get me started on “gestures”
➤ Bloaty and slow
➤ Prone to distraction

Slide 10

Slide 10 text

THE CLI IS A WORKSPACE
➤ Which shell?
➤ Tab Autocomplete
➤ ReadLine
➤ dotfiles
➤ Access to a plethora of interoperable tools
➤ OK, I could MUD from my terminal
➤ Otherwise, fairly purpose built

Slide 11

Slide 11 text

MY TERMINAL IS WHERE I WORK

Slide 12

Slide 12 text

GUIDING LIGHTS
Things that mean something to me

Slide 13

Slide 13 text

“There are a finite number of keystrokes before you die; use them wisely.”
- A Wise Programmer

Slide 14

Slide 14 text

UNIX PHILOSOPHY
➤ Do One Thing Well
➤ Assume output will be used as input and vice versa
➤ Favor the creation of tools or scripts, even for seemingly one-off jobs

Slide 15

Slide 15 text

PERL
➤ Easy things easy, hard things possible
➤ Sloppy and unpredictable uses, just like natural languages
➤ Grows with you
➤ DWIM
➤ TIMTOWTDI
➤ The CPAN

Slide 16

Slide 16 text

EXPLORATION
being places you weren't intended to be

Slide 17

Slide 17 text

I SAW THE POWER OF ELASTICSEARCH EARLY

Slide 18

Slide 18 text

BUT IT WASN’T VERY CLI OR OPS FRIENDLY

Slide 19

Slide 19 text

SO I DID A PERL
App::ElasticSearch::Utilities
https://github.com/reyjrar/es-utils
https://metacpan.org/pod/App::ElasticSearch::Utilities

Slide 20

Slide 20 text

ES-UTILS
➤ Monitoring
➤ Maintenance
➤ Status and Informational Tools
➤ Built as a Reusable Perl functional library
➤ ES Version Agnostic
➤ Assumes index-%Y.%m.%d index names
➤ And then came...
➤ es-search.pl

Slide 21

Slide 21 text

OPTIONAL: INSTALL PERLBREW

    # Install perlbrew
    curl -L https://install.perlbrew.pl \
        | bash

    # Setup perlbrew
    perlbrew install -j8 -n 5.30.0
    perlbrew switch 5.30.0
    perlbrew install-cpanm

Slide 22

Slide 22 text

INSTALLATION

    cpanm App::ElasticSearch::Utilities

Slide 23

Slide 23 text

ES-SEARCH.PL
Bringing ElasticSearch to your terminal since 2012

Slide 24

Slide 24 text

GETTING STARTED: RTFM

    es-search.pl --help
    es-search.pl --manual

Slide 25

Slide 25 text

ILL ADVISED LIVE DEMO

Slide 26

Slide 26 text

GETTING STARTED: CONNECTING

    # Defaults
    es-search.pl --host localhost --port 9200

    # Connect to es-node01
    es-search.pl --host es-node01

Slide 27

Slide 27 text

GETTING STARTED: CONNECT PREFERENCES

    cat ~/.es-utils.yml
    ---
    host: es-gateway.corp.company.com
    port: 443
    proto: https
    http-username: bob
    password-exec: ~/bin/get-es-password.sh

Slide 28

Slide 28 text

SOME HELPFUL NOTES
➤ Searches are constrained by the calendar date in the index name
➤ Use --days 7 for opening scope to 7 days
➤ Searches will stop once they receive --size 20 results
➤ Use --all to get all results across full timespan
➤ Sort order is descending, override with --asc
➤ Target a specific index with --index logstash-2019.10.21
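
To make the first two notes concrete, here is a minimal sketch of how a --days window could map onto daily index names. It is not taken from es-search.pl itself: index_names is a hypothetical helper, it assumes the index-%Y.%m.%d naming mentioned earlier, and it relies on GNU date for the -d option.

```shell
# Hypothetical helper (not part of es-utils): list the daily index names
# that cover the last N days, assuming indices are named <base>-%Y.%m.%d.
# Requires GNU date for -d; the end date defaults to today (UTC).
index_names() {
    base=$1
    days=$2
    end=${3:-$(date -u +%Y-%m-%d)}
    i=0
    while [ "$i" -lt "$days" ]; do
        date -u -d "$end $i days ago" +"$base-%Y.%m.%d"
        i=$((i + 1))
    done
}

# Three days of logstash indices, ending on a fixed date
index_names logstash 3 2019-10-21
```

A wider --days value means more indices in that list, which is why narrowing the window keeps searches quick.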

Slide 29

Slide 29 text

BOUND TO FAIL LIVE DEMO

Slide 30

Slide 30 text

GETTING STARTED: INDEX SELECTION

    # List index basenames
    $ es-search.pl --bases
    Bases available for search:
        access
        security
        syslog
    # Bases: 1 from a combined 61 indices.

Slide 31

Slide 31 text

GETTING STARTED: SHOW ME MONEY DATA

    # Show all the fields in the base
    es-search.pl --base log --fields

    # Show most recently indexed doc
    es-search.pl --base log --size 1

Slide 32

Slide 32 text

GETTING STARTED: SET A DEFAULT "BASE"

    cat ~/.es-utils.yml
    ---
    host: es-gateway.corp.company.com
    base: log
    days: 1

Slide 33

Slide 33 text

GETTING STARTED: TIMESTAMP DETECTION

    # Specify timestamp
    es-search.pl --base log \
        --timestamp timestamp

    cat ~/.es-utils.yml
    ---
    base: log
    timestamp: timestamp

Slide 34

Slide 34 text

GETTING STARTED: TIMESTAMP PREFERENCES

    cat ~/.es-utils.yml
    ---
    base: log
    # Global default timestamp field
    timestamp: timestamp
    # Per base settings
    meta:
      logstash:
        timestamp: '@timestamp'

Slide 35

Slide 35 text

GETTING STARTED: SHOW ME MONEY DATA QUICKER

    # Show most recently indexed doc
    es-search.pl --size 1

Slide 36

Slide 36 text

GETTING STARTED: SHOW ME MONEY DATA MORE LIKE LOGS

    # Show just selected fields
    es-search.pl --show hostname,program,message

Slide 37

Slide 37 text

GETTING STARTED: SEARCH FOR MATCHING DOCS

    # Search for program:sshd
    es-search.pl program:sshd

Slide 38

Slide 38 text

GETTING STARTED: COMPLEX QUERIES

    # Search for sshd and ip 1.2.3.4
    es-search.pl program:sshd AND src_ip:1.2.3.4

Slide 39

Slide 39 text

GETTING STARTED: SEARCH OPTIMIZATIONS

    # Search for sshd and ip 1.2.3.4
    es-search.pl program:sshd src_ip:1.2.3.4

App::ElasticSearch::Utilities::QueryString joins dangling search terms with 'AND' by default.

    # Search for sshd or ip 1.2.3.4
    es-search.pl --or program:sshd src_ip:1.2.3.4

Slide 40

Slide 40 text

GETTING STARTED: I WANT TO USE JQ

    # Make output pipe friendly to jq
    es-search.pl program:sshd --exists src_ip \
        --jq | jq .src_ip | sort | uniq -c

Slide 41

Slide 41 text

EVEN MORE OPTIMIZATIONS
Shortcuts to save keystrokes

Slide 42

Slide 42 text

WHY WOULD YOU TRY A LIVE DEMO

Slide 43

Slide 43 text

QUERY STRING EXTENSIONS: BARE WORDS

    # Bare words and, or, not are uppercased for you
    es-search.pl not program:sshd

Slide 44

Slide 44 text

QUERY STRING EXTENSIONS: IP

    # Use CIDR Notation for IPs
    es-search.pl src_ip:10.0.0.0/8
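
For anyone rusty on CIDR, the sketch below shows what "matches 10.0.0.0/8" means in plain shell arithmetic. It is purely illustrative: ip_to_int and ip_in_cidr are hypothetical helpers, IPv4 only, and this is not how es-search.pl or Elasticsearch evaluates the query.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Runs in a subshell when called via $(...), so the IFS change stays local.
ip_to_int() {
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit 0) when address $1 falls inside CIDR block $2.
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Keep the top $bits bits, clear the rest
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

ip_in_cidr 10.1.2.3 10.0.0.0/8 && echo "10.1.2.3 is inside 10.0.0.0/8"
ip_in_cidr 192.168.1.1 10.0.0.0/8 || echo "192.168.1.1 is outside 10.0.0.0/8"
```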

Slide 45

Slide 45 text

QUERY STRING EXTENSIONS: RANGE

    # Range and range combos
    es-search.pl dst_port:'<1024'
    es-search.pl status:'<500,>=400'
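
One way to picture the range shorthand is as a translation into an Elasticsearch range query, whose operators are gt, gte, lt and lte. The sketch below does that translation in shell. range_to_json is a hypothetical name and this is an assumption about the general shape of the result, not the code es-search.pl actually runs.

```shell
# Hypothetical sketch: turn a shorthand like '<500,>=400' into the body
# of an Elasticsearch range query for the given field.
range_to_json() {
    field=$1
    spec=$2
    out=
    old_ifs=$IFS
    IFS=,
    for part in $spec; do
        case $part in
            '>='*) op=gte; val=${part#">="} ;;
            '<='*) op=lte; val=${part#"<="} ;;
            '>'*)  op=gt;  val=${part#">"} ;;
            '<'*)  op=lt;  val=${part#"<"} ;;
            *) IFS=$old_ifs; echo "unsupported range: $part" >&2; return 1 ;;
        esac
        # Append "op":value, comma-separated after the first entry
        out="${out:+$out,}\"$op\":$val"
    done
    IFS=$old_ifs
    printf '{"range":{"%s":{%s}}}\n' "$field" "$out"
}

range_to_json status '<500,>=400'
# prints {"range":{"status":{"lt":500,"gte":400}}}
```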

Slide 46

Slide 46 text

QUERY STRING EXTENSIONS: TERMS PROMOTION

    # Don't stress the Lucene escapes
    es-search.pl '=exec:/bin/bash'

Slide 47

Slide 47 text

QUERY STRING EXTENSIONS: PREFIX

    # String prefixes
    es-search.pl _prefix_:user_agent:"Go"

Slide 48

Slide 48 text

QUERY STRING EXTENSIONS: TERMS IN A FILE

    # Build terms from a TSV file, last column
    es-search.pl src_ip:badguys.dat

    # Build terms from a TSV file, first column
    es-search.pl 'src_ip:badguys.dat[0]'

    # Build terms from a CSV file, last column
    es-search.pl src_ip:badguys.csv
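
To see which values a file reference would contribute, you can pull the same column out by hand. This is a rough awk equivalent, assuming a tab-separated file and a zero-based column index as in the [0] example; column_terms is a hypothetical helper, not part of es-utils.

```shell
# Hypothetical helper: print one column of a tab-separated file.
# $2 is a zero-based column index; leave it empty for the last column.
column_terms() {
    awk -F '\t' -v col="$2" '{ print (col == "" ? $NF : $(col + 1)) }' "$1"
}

# Sample data standing in for badguys.dat
sample=$(mktemp)
printf 'scanner\t198.51.100.7\nbruteforce\t203.0.113.9\n' > "$sample"

column_terms "$sample"      # last column: the two addresses
column_terms "$sample" 0    # first column: the two labels
```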

Slide 49

Slide 49 text

QUERY STRING EXTENSIONS: TERMS IN A JSON DATA SET

    # Build terms from an NDJSON file, field .ip
    es-search.pl 'src_ip:threatfeed.json[ip]'

    # Build terms from an NDJSON file, nested field
    es-search.pl 'src_ip:threatfeed.json[actor.ip]'

Slide 50

Slide 50 text

CAN HAZ AGGREGATIONS
I thought you'd never ask!

Slide 51

Slide 51 text

AGGREGATION CAVEATS
➤ Supported since the days of "facets" and ES 0.17
➤ Early versions of ES, up to v2.x, were splodey
➤ Some limitations which I'm slowly rolling back
➤ per day
➤ Top aggregation must be a bucket
➤ Limited to 2 levels deep
➤ Well, 3 in a certain instance

Slide 52

Slide 52 text

TIME TO BURN SAGE LIVE DEMO

Slide 53

Slide 53 text

AGGREGATIONS: TOP THING

    # Top 20 5xx-ing uri
    es-search.pl --top uri 'status:>=500'

    # Top 50 5xx-ing uri
    es-search.pl --top uri 'status:>=500' --size 50
    es-search.pl --top uri 'status:>=500' --limit 50
    es-search.pl --top uri 'status:>=500' -n 50
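
A word on quoting: > and < are shell redirection operators, so a range term needs quotes to reach es-search.pl in one piece. The sketch below shows the difference using a stand-in function, show_args (hypothetical, it only prints its arguments), inside a scratch directory.

```shell
# Stand-in for any command: print each argument it receives in brackets.
show_args() { printf '[%s]\n' "$@"; }

scratch=$(mktemp -d)
cd "$scratch"

# Unquoted: the shell passes only "status:" and redirects stdout
# into a file named "=500". Nothing appears on the terminal.
show_args status:>=500

# Quoted: the whole term arrives as one argument.
show_args 'status:>=500'

# The stray file left behind by the unquoted form
ls
```

The same applies to terms like status:<400, where the shell would try to read input from a file named 400.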

Slide 54

Slide 54 text

AGGREGATIONS: TOP THING PER HOUR

    # Top 20 uri per hour
    es-search.pl --top uri --interval 1h

Slide 55

Slide 55 text

AGGREGATIONS: TOP THING WITH ANOTHER THING

    # Top 20 uri with the top 3 countries
    es-search.pl --top uri --with src_country

    # Top 20 uri with the top 10 countries
    es-search.pl --top uri --with src_country:10

Slide 56

Slide 56 text

AGGREGATIONS: TOP THING BY SOMETHING OTHER THAN DOC COUNT

    # Top 20 uri by the cardinality of country
    es-search.pl --top uri \
        --by cardinality:src_country

    # Top 20 ip by the total traffic
    es-search.pl --top src_ip \
        --by sum:out_bytes

Slide 57

Slide 57 text

AGGREGATIONS: WHERE'S MY DATA GOING

    # Top 20 ip by the total traffic
    # With each ip's top uri
    es-search.pl --top src_ip \
        --by sum:out_bytes \
        --with uri:1

Slide 58

Slide 58 text

AGGREGATIONS: STATISTICS ANYONE?

    # Top 20 uri by average render time
    # with a statistical summary
    es-search.pl --top uri \
        --by avg:render_ms \
        --with stats:render_ms

Slide 59

Slide 59 text

AGGREGATIONS: PERCENTILES, TOO

    # Top 20 uri by average render time
    # With median, 90th, and 99th percentiles
    es-search.pl --top uri \
        --by avg:render_ms \
        --with percentiles:render_ms:50,90,99

Slide 60

Slide 60 text

AGGREGATIONS: I GOT YOUR HISTOGRAMS

    # Top 20 uri by average render time
    # with histogram of 100ms
    es-search.pl --top uri \
        --by avg:render_ms \
        --with histogram:render_ms:100
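
If the histogram interval is unfamiliar: each value lands in the bucket whose key is the value rounded down to a multiple of the interval. The sketch below reproduces that bucketing locally with awk on a handful of made-up render times. histogram is a hypothetical helper, and the sample numbers are invented for illustration.

```shell
# Hypothetical helper: bucket numbers from stdin into fixed-width bins.
# Bucket key = value rounded down to a multiple of the interval
# (int() truncates, which equals rounding down for non-negative input).
histogram() {
    awk -v interval="$1" '
        { bucket = int($1 / interval) * interval; count[bucket]++ }
        END { for (b in count) print b, count[b] }
    ' | sort -n
}

# Invented render times in ms, bucketed at 100ms
printf '%s\n' 12 95 130 180 420 | histogram 100
# prints:
# 0 2
# 100 2
# 400 1
```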

Slide 61

Slide 61 text

AGGREGATIONS: I'M ALL ABOUT SIGNIFICANCE

    # Top 20 significant uri for search
    es-search.pl --top significant_terms:uri \
        'render_ms:>1000' src_country:US

    # Top 20 significant uri for search,
    # Background is only US
    es-search.pl --top significant_terms:uri \
        'render_ms:>1000' src_country:US \
        --bg-filter src_country:US

Slide 62

Slide 62 text

ONE MORE THING
Well, maybe more than one more thing...

Slide 63

Slide 63 text

BUILT WITH CLI::HELPERS
➤ General purpose, functional library for developing command line utilities in Perl
➤ Handles input
➤ Provides output customization including color support, --color
➤ Allows output tagged as data to be redirected into a file, --data-file=output.dat
➤ NoPaste support via App::NoPaste and --no-paste

Slide 64

Slide 64 text

NOTES ON APP::NOPASTE
➤ CLI::Helpers will only paste to a service flagged as "public" if you specify --no-paste-public
➤ Subclass an App::NoPaste::Service object for your internal paste service, it's pretty simple
➤ Easily share things with colleagues directly from the command line

Slide 65

Slide 65 text

PUTTING SOME THINGS TOGETHER

    # Let's say we have a list of bad IPs
    es-search.pl --top src_ip \
        _prefix_:path:\/admin \
        'status:<400' \
        'src_ip:threatfeed.json[ip]' \
        --data-file=insidethehouse.dat

Slide 66

Slide 66 text

PUTTING SOME THINGS TOGETHER

    # Dump a full log of what they've done
    es-search.pl src_ip:insidethehouse.dat \
        --show src_ip,src_user,uri,out_bytes \
        --all

    # Share with your colleagues
    es-search.pl src_ip:insidethehouse.dat \
        --show src_ip,src_user,uri,out_bytes \
        --all --no-paste

Slide 67

Slide 67 text

FUTURE PLANS
➤ Arbitrary levels of nested aggregations
➤ JSON output for aggregations
➤ Better support for nested documents
➤ Arbitrary data joins at query time: rdns, whois, db lookups, etc.

Slide 68

Slide 68 text

Thank you!
[email protected]
https://twitter.com/reyjrar
https://github.com/reyjrar
https://speakerdeck.com/reyjrar
https://www.craigslist.org/about/craigslist_is_hiring
https://www.craigslist.org/about/cl_app_beta