Slide 1

Slide 1 text

ELASTICSEARCH DATA EXPLORATION IN YOUR TERMINAL
Things you never knew you needed until it was too late.
With your host, Brad Lhotsky

Slide 2

Slide 2 text

WHY? @reyjrar

Slide 3

Slide 3 text

WE HAVE KIBANA
It’s an Elastic Product too!

Slide 4

Slide 4 text

WE HAVE GRAFANA
Such pretty, much datasources!

Slide 5

Slide 5 text

AND NOW WE HAVE LOKI OR WHATEVER
Flip between logs and graphs, oh my.

Slide 6

Slide 6 text

WHY DRAG THE TERMINAL INTO THIS?

Slide 7

Slide 7 text

MICE SLOW ME DOWN
If you prefer a browser, cool.

Slide 8

Slide 8 text

MY BROWSER IS NOT AN IDE
➤ Which browser?
➤ Are you using privacy extensions?
➤ What happens when I hit “Backspace”?
➤ Don’t get me started on “gestures”
➤ Bloaty and slow
➤ Prone to distraction

Slide 9

Slide 9 text

THE CLI IS A WORKSPACE
➤ Which shell?
➤ Tab Autocomplete
➤ ReadLine
➤ dotfiles
➤ Access to a plethora of interoperable tools
➤ OK, I could MUD from my terminal
➤ Otherwise, fairly purpose built

Slide 10

Slide 10 text

MY TERMINAL IS WHERE I WORK

Slide 11

Slide 11 text

GUIDING LIGHTS
Things that mean something to me

Slide 12

Slide 12 text

“There are a finite number of keystrokes before you die, use them wisely.”
- A Wise Programmer

Slide 13

Slide 13 text

UNIX PHILOSOPHY
➤ Do One Thing Well
➤ Assume output will be used as input and vice versa
➤ Favor the creation of tools or scripts, even for seemingly one-off jobs

Slide 14

Slide 14 text

PERL
➤ Easy things easy, hard things possible
➤ Sloppy and unpredictable uses, just like natural languages
➤ Grows with you
➤ DWIM
➤ TIMTOWTDI
➤ The CPAN

Slide 15

Slide 15 text

EXPLORATION
being places you weren't intended to be

Slide 16

Slide 16 text

I SAW THE POWER OF ELASTICSEARCH EARLY

Slide 17

Slide 17 text

BUT IT WASN’T VERY CLI OR OPS FRIENDLY

Slide 18

Slide 18 text

SO I DID A PERL
App::ElasticSearch::Utilities
https://github.com/reyjrar/es-utils
https://metacpan.org/pod/App::ElasticSearch::Utilities

Slide 19

Slide 19 text

ES-UTILS
➤ Monitoring
➤ Maintenance
➤ Status and Informational Tools
➤ Built as a Reusable Perl functional library
➤ ES Version Agnostic
➤ Assumes index-%Y.%m.%d index names
➤ And then came...
➤ es-search.pl

Slide 20

Slide 20 text

OPTIONAL: INSTALL PERLBREW
# Install perlbrew
curl -L https://install.perlbrew.pl \
  | bash
# Setup perlbrew
perlbrew install -j8 -n 5.34.0
perlbrew switch 5.34.0
perlbrew install-cpanm

Slide 21

Slide 21 text

INSTALLATION
cpanm App::ElasticSearch::Utilities

Slide 22

Slide 22 text

ES-SEARCH.PL
Bringing ElasticSearch to your terminal since 2012

Slide 23

Slide 23 text

GETTING STARTED: RTFM
es-search.pl --help
es-search.pl --manual

Slide 24

Slide 24 text

GETTING STARTED: CONNECTING
# Defaults
es-search.pl --host localhost --port 9200
# Connect to es-node01
es-search.pl --host es-node01

Slide 25

Slide 25 text

GETTING STARTED: CONNECT PREFERENCES
cat ~/.es-utils.yml
---
host: es-gateway.corp.company.com
port: 443
proto: https
http-username: bob
password-exec: ~/bin/get-es-password.sh

Slide 26

Slide 26 text

SOME HELPFUL NOTES
➤ Searches are constrained by the calendar date in the index name
➤ Searches use “index base names” via --base logstash
➤ Use --days 7 for opening scope to 7 days
➤ Searches will stop once they receive --size 20 results
➤ Use --all to get all results across full timespan
➤ Sort order is descending, override with --asc
➤ Target a specific index or alias with --index logstash-2019.10.21
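Since the date lives in the index name, widening the search window amounts to naming more daily indices. The sketch below only illustrates that naming convention (base name plus a %Y.%m.%d suffix). The function name is hypothetical, and treating --days 7 as "today plus the six previous days" is an assumption, not a statement about how es-search.pl resolves indices internally.

```python
from datetime import date, timedelta

def index_names(base: str, days: int, today: date) -> list[str]:
    """Return daily index names for the last `days` calendar days, newest first."""
    return [
        f"{base}-{(today - timedelta(days=offset)).strftime('%Y.%m.%d')}"
        for offset in range(days)
    ]

# Hypothetical example: a 7 day window ending on 2019-10-21
names = index_names("logstash", 7, date(2019, 10, 21))
print(names[0])   # logstash-2019.10.21
print(names[-1])  # logstash-2019.10.15
```

A narrower window means fewer indices to touch, which is why the date constraint matters for search cost.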

Slide 27

Slide 27 text

GETTING STARTED: INDEX SELECTION
# List index basenames
$ es-search.pl --bases
Bases available for search:
  access
  security
  syslog
# Bases: 3 from a combined 61 indices.

Slide 28

Slide 28 text

GETTING STARTED: SHOW ME MONEY DATA
es-search.pl --base syslog --fields

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

GETTING STARTED: SHOW ME MONEY DATA
es-search.pl --base syslog --size 1

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

GETTING STARTED: SET A DEFAULT "BASE"
cat ~/.es-utils.yml
---
host: es-gateway.corp.company.com
base: log
days: 1

Slide 33

Slide 33 text

GETTING STARTED: TIMESTAMP DETECTION
# Specify timestamp
es-search.pl --base log \
  --timestamp timestamp
cat ~/.es-utils.yml
---
base: log
timestamp: timestamp

Slide 34

Slide 34 text

GETTING STARTED: TIMESTAMP PREFERENCES
cat ~/.es-utils.yml
---
base: log
# Global default timestamp field
timestamp: timestamp
# Per base settings
meta:
  logstash:
    timestamp: '@timestamp'
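One way to read that config: a per-base entry under meta wins, otherwise the global timestamp applies. The sketch below mirrors the YAML as a Python dict and resolves the field name under that assumed precedence. The function name is hypothetical and this is not the tool's actual lookup code.

```python
# The YAML from the slide, written out as a plain dict
config = {
    "base": "log",
    "timestamp": "timestamp",                         # global default
    "meta": {"logstash": {"timestamp": "@timestamp"}},  # per base settings
}

def timestamp_field(config: dict, base: str):
    """Prefer meta.<base>.timestamp, fall back to the global setting (assumed order)."""
    per_base = config.get("meta", {}).get(base, {})
    return per_base.get("timestamp", config.get("timestamp"))

print(timestamp_field(config, "logstash"))  # @timestamp
print(timestamp_field(config, "log"))       # timestamp
```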

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

GETTING STARTED: SHOW ME MONEY DATA MORE LIKE LOGS
# Show just selected fields
es-search.pl --show hostname,program,message

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

GETTING STARTED: SHOW ME MATCHING DOCS
# Show sshd logs
es-search.pl program:sshd \
  --show hostname,program,message

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

GETTING STARTED: MULTIPLE SEARCH PARAMETERS
# Multiple parameters, AND'd
es-search.pl program:sshd \
  src_ip:181.206.20.11 \
  --show hostname,program,message

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

OR MULTIPLE SEARCH PARAMETERS
# Join dangling params with OR
es-search.pl --or program:sshd \
  src_ip:181.206.20.11 \
  --show hostname,program,message
# Join with OR explicitly
es-search.pl program:sshd OR \
  src_ip:181.206.20.11 \
  --show hostname,program,message

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

GETTING STARTED: I WANT TO USE JQ
# Make output pipe friendly to jq
es-search.pl program:sshd --exists src_ip \
  --jq \
  | jq -r .src_ip | sort | uniq -c
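For anyone who would rather post-process in a script, the same "count occurrences per src_ip" step can be sketched in Python. This assumes the --jq output is one JSON document per line, which is an assumption about the output shape. The sample records below are made up for illustration.

```python
import json
from collections import Counter

# Made-up sample of what line-delimited JSON output might look like
ndjson = """\
{"program": "sshd", "src_ip": "181.206.20.11"}
{"program": "sshd", "src_ip": "10.0.0.5"}
{"program": "sshd", "src_ip": "181.206.20.11"}
"""

# Equivalent of: jq -r .src_ip | sort | uniq -c
counts = Counter(
    json.loads(line)["src_ip"] for line in ndjson.splitlines() if line.strip()
)
for ip, n in counts.most_common():
    print(f"{n:7d} {ip}")
```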

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

EVEN MORE OPTIMIZATIONS
Shortcuts to save keystrokes

Slide 47

Slide 47 text

QUERY STRING EXTENSIONS: BARE WORDS
# and, or, not uppercased
es-search.pl not program:sshd
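The bare-word handling and the AND/OR joining from the earlier slides can be pictured as one small token pass. This is a minimal sketch of that idea, not the code in App::ElasticSearch::Utilities::QueryString. The function name and the default_op parameter are hypothetical.

```python
OPERATORS = {"AND", "OR", "NOT"}

def build_query(args, default_op="AND"):
    """Join command-line search terms into one query string."""
    parts = []
    for arg in args:
        # Bare and/or/not are treated as operators and uppercased
        token = arg.upper() if arg.upper() in OPERATORS else arg
        # Only insert the default operator between two adjacent terms
        if parts and parts[-1] not in OPERATORS and token not in OPERATORS:
            parts.append(default_op)
        parts.append(token)
    return " ".join(parts)

print(build_query(["program:sshd", "src_ip:181.206.20.11"]))
# program:sshd AND src_ip:181.206.20.11
print(build_query(["program:sshd", "src_ip:181.206.20.11"], default_op="OR"))
# program:sshd OR src_ip:181.206.20.11
print(build_query(["not", "program:sshd"]))
# NOT program:sshd
```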

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

QUERY STRING EXTENSIONS: IP
# Use CIDR Notation for IPs
es-search.pl src_ip:102.0.0.0/8 \
  --show hostname,program,src_ip,src_geoip \
  --size 1
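As a reminder of what a /8 covers, the standard library ipaddress module can spell out the block. This only illustrates CIDR arithmetic and says nothing about how the query is executed on the Elasticsearch side. The helper name is hypothetical.

```python
import ipaddress

network = ipaddress.ip_network("102.0.0.0/8")
print(network[0], network[-1])  # 102.0.0.0 102.255.255.255
print(network.num_addresses)    # 16777216

def in_network(ip: str) -> bool:
    """True when the address falls inside 102.0.0.0/8."""
    return ipaddress.ip_address(ip) in network

print(in_network("102.4.5.6"))      # True
print(in_network("181.206.20.11"))  # False
```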

Slide 50

Slide 50 text

No content

Slide 51

Slide 51 text

QUERY STRING EXTENSIONS: RANGE
# Range and range combos
es-search.pl dst_port:'<1024'
es-search.pl status:'<500,>=400'
es-search.pl crit:'>5' \
  --show hostname,program,crit,name,src_ip
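To make the combo syntax concrete, here is a small sketch that turns a spec like '<500,>=400' into a predicate. It assumes comma-separated bounds are AND'd together, which matches the intent of the status example but is not taken from the tool's source. The function name is hypothetical.

```python
import operator
import re

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def parse_range(spec: str):
    """Turn '<500,>=400' into a predicate; bounds are AND'd (assumed)."""
    bounds = []
    for part in spec.split(","):
        m = re.fullmatch(r"\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*", part)
        if m is None:
            raise ValueError(f"unrecognized bound: {part!r}")
        bounds.append((OPS[m.group(1)], float(m.group(2))))
    return lambda value: all(op(value, limit) for op, limit in bounds)

is_client_error = parse_range("<500,>=400")
print(is_client_error(404))  # True
print(is_client_error(500))  # False
```

The quotes in the shell commands matter: without them, < and > would be read as redirections before es-search.pl ever saw them.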

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

QUERY STRING EXTENSIONS: TERMS PROMOTION
# Don't stress the Lucene escapes
es-search.pl =exe:'/usr/bin/yum update -y'

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

QUERY STRING EXTENSIONS: PREFIX
# String prefixes
es-search.pl _prefix_:exe:/usr --show exe

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

QUERY STRING EXTENSIONS: TERMS IN A FILE
# Build terms from a TSV file, last column
es-search.pl src_ip:badguys.dat
# Build terms from a TSV file, first column
es-search.pl src_ip:badguys.dat[0]
# Build terms from a CSV file, last column
es-search.pl src_ip:badguys.csv
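The column selection can be sketched as "read a delimited file, pick one column, collect the values". The version below defaults to the last column and accepts an index, as the slide describes. The function name, the de-duplication, and the sample rows are assumptions for illustration.

```python
import csv
import io

def column_terms(text: str, delimiter: str = "\t", column: int = -1) -> list[str]:
    """Collect unique values from one column; the default is the last column."""
    terms = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:  # skip blank lines
            continue
        term = row[column].strip()
        if term and term not in terms:
            terms.append(term)
    return terms

# Made-up TSV contents: first-seen date, then address
sample = "2019-10-21\t181.206.20.11\n2019-10-20\t102.1.2.3\n2019-10-21\t181.206.20.11\n"
print(column_terms(sample))            # ['181.206.20.11', '102.1.2.3']
print(column_terms(sample, column=0))  # ['2019-10-21', '2019-10-20']
```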

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

QUERY STRING EXTENSIONS: TERMS IN A JSON DATA SET
# Build terms from an NDJSON file .ip
es-search.pl src_ip:threatfeed.json[ip]
# Build terms from an NDJSON file nested field
es-search.pl src_ip:threatfeed.json[actor.ip]
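The dotted path in the brackets reads as "walk into nested objects". A minimal sketch of that lookup over line-delimited JSON follows. The helper names and the sample feed are hypothetical, and skipping records without the field is an assumption.

```python
import json

def dig(doc, dotted: str):
    """Follow a dotted path such as 'actor.ip' through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def ndjson_terms(text: str, field: str) -> list[str]:
    """Collect the field's value from every line that has it."""
    terms = []
    for line in text.splitlines():
        if not line.strip():
            continue
        value = dig(json.loads(line), field)
        if value is not None:
            terms.append(str(value))
    return terms

# Made-up threat feed records
feed = '{"ip": "1.2.3.4", "actor": {"ip": "5.6.7.8"}}\n{"ip": "9.9.9.9"}\n'
print(ndjson_terms(feed, "ip"))        # ['1.2.3.4', '9.9.9.9']
print(ndjson_terms(feed, "actor.ip"))  # ['5.6.7.8']
```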

Slide 60

Slide 60 text

CAN HAZ AGGREGATIONS
i thought you'd never ask!

Slide 61

Slide 61 text

AGGREGATION CAVEATS
➤ Supported during "facets" and ES 0.17
➤ Early versions of ES, up to v2.x were splodey
➤ Some limitations which I'm slowly rolling back
➤ per day
➤ Top aggregation must be a bucket
➤ Limited to 2 levels deep
➤ Well, 3 in a certain instance

Slide 62

Slide 62 text

AGGREGATIONS: TOP THING
# Top 20 programs
es-search.pl --top program
# Top 50 programs
es-search.pl --top program --size 50
es-search.pl --top program --limit 50
es-search.pl --top program -n 50
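For comparison, this is roughly the request body you would otherwise write by hand for a "top N" question: a terms aggregation with no hits returned. It is a sketch of the standard Elasticsearch query DSL, not a dump of what es-search.pl actually sends. The function name and the aggregation name "top" are hypothetical.

```python
import json

def top_terms_body(field: str, size: int = 20, query: str = "*") -> dict:
    """Build a search body that returns only the top `size` values of `field`."""
    return {
        "size": 0,  # no documents, just the buckets
        "query": {"query_string": {"query": query}},
        "aggs": {"top": {"terms": {"field": field, "size": size}}},
    }

body = top_terms_body("program", size=50)
print(json.dumps(body, indent=2))
```

That is a lot of braces to type for a question that fits in three words on the command line.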

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

AGGREGATIONS: TOP THING PER HOUR
# Top programs with a src_ip every 8 hours
es-search.pl --top program _exists_:src_ip \
  --interval 8h

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

AGGREGATIONS: TOP THING WITH ANOTHER THING
# Top action with the top 3 countries
es-search.pl --top action \
  --with src_geoip.country
# Top action with the top 10 countries
es-search.pl --top action \
  --with src_geoip.country:10

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

AGGREGATIONS: TOP THING BY SOMETHING OTHER THAN DOC COUNT
# Top src_ip by distinct dst countries
es-search.pl --top src_ip \
  --by cardinality:dst_geoip.country
# Top dst_ip by the total traffic
es-search.pl --top dst_ip \
  --by sum:out_bytes
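In query DSL terms, ranking buckets by something other than document count means ordering a terms aggregation by a single-value metric sub-aggregation. The sketch below builds such a body from a "metric:field" spec like the ones on the slide. It shows the general Elasticsearch technique under assumed names ("top", "by") and is not presented as the tool's generated request.

```python
def top_by_metric_body(field: str, by_spec: str, size: int = 20) -> dict:
    """Order terms buckets by a metric given as 'metric:field', highest first."""
    metric, metric_field = by_spec.split(":", 1)
    return {
        "size": 0,
        "aggs": {
            "top": {
                # "order" refers to the sub-aggregation named "by" below
                "terms": {"field": field, "size": size, "order": {"by": "desc"}},
                "aggs": {"by": {metric: {"field": metric_field}}},
            }
        },
    }

body = top_by_metric_body("src_ip", "cardinality:dst_geoip.country")
print(body["aggs"]["top"]["aggs"])
# {'by': {'cardinality': {'field': 'dst_geoip.country'}}}
```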

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

AGGREGATIONS: WHERE'S MY DATA GOING
# Top src_ip by the total traffic
# With top dst_ip
es-search.pl --top src_ip \
  --by sum:out_bytes \
  --with dst_ip:1

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

AGGREGATIONS: STATISTICS ANYONE?
# Top program by average parse time
# with a statistical summary
es-search.pl --top program \
  --by avg:total_time \
  --with stats:total_time

Slide 75

Slide 75 text

No content

Slide 76

Slide 76 text

AGGREGATIONS: PERCENTILES, TOO
# Top programs by average parse time
# With median, 90, and 99th percentile
es-search.pl --top program \
  --by avg:total_time \
  --with percentiles:total_time:50,90,99

Slide 77

Slide 77 text

No content

Slide 78

Slide 78 text

AGGREGATIONS: I GOT YOUR HISTOGRAMS
# Top 20 uri by average render time
# with histogram of 100ms
es-search.pl --top program \
  --by avg:total_time \
  --with histogram:total_time:0.01
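A fixed-interval histogram just rounds each value down to the start of its bucket and counts. The sketch below shows that bucketing on made-up millisecond values with an interval of 100. It illustrates the general technique only, and the function name and sample data are hypothetical.

```python
import math
from collections import Counter

def histogram(values, interval):
    """Count values per bucket, keyed by each bucket's lower bound."""
    return Counter(math.floor(v / interval) * interval for v in values)

render_ms = [12, 95, 130, 180, 250]  # made-up sample values in milliseconds
buckets = histogram(render_ms, 100)
for lower in sorted(buckets):
    print(f"{lower:>5} {buckets[lower]}")
```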

Slide 79

Slide 79 text

No content

Slide 80

Slide 80 text

AGGREGATIONS: I'M ALL ABOUT SIGNIFICANCE
# Top 20 significant uri for search
es-search.pl --top significant_terms:uri \
  render_ms:'>1000' src_country:US
# Top 20 significant uri for search,
# Background is only US
es-search.pl --top significant_terms:uri \
  render_ms:'>1000' src_country:US \
  --bg-filter src_country:US

Slide 81

Slide 81 text

ONE MORE THING
Well, maybe more than one more thing..

Slide 82

Slide 82 text

BUILT WITH CLI::HELPERS
➤ General purpose, functional library for developing command line utilities in Perl
➤ Handles input
➤ Provides output customization including color support, --color
➤ Allow output tagged as data to be redirected into a file, --data-file=output.dat
➤ NoPaste support via App::NoPaste and --no-paste

Slide 83

Slide 83 text

NOTES ON APP::NOPASTE
➤ CLI::Helpers will only paste to a service flagged as "public" if you specify --no-paste-public
➤ Subclass an App::NoPaste::Service object for your internal paste service, it's pretty simple
➤ Easily share things with colleagues directly from the command line

Slide 84

Slide 84 text

PUTTING SOME THINGS TOGETHER
# Let's say we have a list of bad IPs
es-search.pl --top src_ip \
  _prefix_:path:\/admin \
  status:'<400' \
  src_ip:threatfeed.json[ip] \
  --data-file=insidethehouse.dat

Slide 85

Slide 85 text

PUTTING SOME THINGS TOGETHER
# Dump a full log of what they've done
es-search.pl src_ip:insidethehouse.dat --show src_ip,src_user,uri,out_bytes \
  --all
# Share with your colleagues
es-search.pl src_ip:insidethehouse.dat --show src_ip,src_user,uri,out_bytes \
  --all --no-paste

Slide 86

Slide 86 text

PERL LIBRARY USAGE
For those times when you want a custom fit..

Slide 87

Slide 87 text

STEAL THIS CODE
➤ Modules make it easy for you to interact with ES
➤ App::ElasticSearch::Utilities::QueryString provides all the fun query extensions
➤ App::ElasticSearch::Query provides a simple interface to execute queries
➤ All of these draw on the config file and command line switches of App::ElasticSearch::Utilities

Slide 88

Slide 88 text

ILL-FATED ATTEMPT AT LIVE DEMO IN 3. 2. 1.

Slide 89

Slide 89 text

FUTURE PLANS
➤ Arbitrary levels of nested aggregations
➤ JSON output for aggregations
➤ Better support for nested documents
➤ Arbitrary data joins at query time: rdns, whois, db lookups, etc.

Slide 90

Slide 90 text

Thank you!
[email protected]
https://twitter.com/reyjrar
https://github.com/reyjrar
https://speakerdeck.com/reyjrar
https://www.craigslist.org/about/craigslist_app