
DevOps Data Storage

Brad Lhotsky
February 21, 2017

What data do we need to store to be more devopsy? How can we use the data? What data stores are out there, and where and why might we use them?

Transcript

  1. DevOps Data Storage
    Presented by Brad Lhotsky


  2. Brad Lhotsky
    • Systems and Security at Craigslist
    • Infrastructure Monitoring & Security at Booking.com
    • Recovering
    • Perl Programmer
    • Linux/BSD Systems Admin
    • Network Security Specialist
    • PostgreSQL Administrator
    • ElasticSearch Janitor
    • DNS Voyeur
    • OSSEC Core Team Member
    https://github.com/reyjrar
    https://twitter.com/reyjrar


  3. Expectations
    ‣Common Data Types
    ‣Using your Data
    ‣Features of Data Stores
    ‣Popular Data Stores


  4. Types of DevOps-y Data


  5. Administrative & Meta-Data
    ‣ Inventory
    ‣ Hardware
    ‣ Software
    ‣ Builds or Roles
    ‣ Services
    ‣ Users and Groups
    ‣ Employee/Contractor
    ‣ Managers
    ‣ ACLs


  6. Monitoring
    ‣ State
    ‣ OK / NOT OK
    ‣ UP / DOWN
    ‣ Package Version
    ‣ Time Series
    ‣ Counter
    ‣ Rate
    ‣ Statistical Summaries
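
To make the Counter vs. Rate distinction concrete, here is a minimal sketch of turning raw counter samples into per-second rates. The function name, the sample layout, and the reset handling (treating a drop in value as a counter restart) are assumptions for illustration, not something the talk prescribes.

```python
from typing import List, Tuple

# (unix timestamp in seconds, counter value)
Sample = Tuple[float, float]


def counter_to_rate(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Convert ever-increasing counter samples into per-second rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        delta = v1 - v0
        if delta < 0:
            # Assumption: a decrease means the counter was reset
            # (e.g. a service restart), so count from zero again.
            delta = v1
        rates.append((t1, delta / elapsed))
    return rates


if __name__ == "__main__":
    samples = [(0, 100), (10, 150), (20, 250), (30, 20)]
    print(counter_to_rate(samples))
```

A counter is cheap to collect and survives missed polls; the rate is what usually ends up on the graph.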


  7. Events
    ‣ State Changes
    ‣ Package Updated
    ‣ Service Stopped
    ‣ System Events
    ‣ Syslog Message
    ‣ SNMP Traps
    ‣ Application Events
    ‣ Access
    ‣ Errors
    ‣ Traces


  8. Deving and Oping Your Data


  9. Monitoring and Metrics


  10. “Metrics are your culture.”
    - Nicole Forsgren, "How Metrics Shape Your Culture", Monitorama PDX 2016


  11. "How Metrics Shape Your Culture"
    • You can't improve what you don't measure
    • Always measure things that matter
    • Things measured are things managed
    • Metrics can be gamed
    • Metrics inform incentives
    • Not everything that can be counted counts
    • Hard to measure doesn't mean it isn't worth measuring


  12. All of your monitoring is probably wrong.
    ... and that's probably O.K.


  13. Alerting
    ‣ Disrupting People's Lives at 95% Disk Full
    ‣ Thresholds -> Change Detection
    ‣ State Change Thresholds
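
One way to read "State Change Thresholds" is: only notify when a check has disagreed with the current state several times in a row, rather than on every threshold crossing. The sketch below assumes that reading; the function name and the default of three consecutive checks are hypothetical.

```python
from typing import Iterable, List, Tuple


def state_changes(checks: Iterable[bool], required: int = 3) -> List[Tuple[int, str]]:
    """Return (check index, new state) each time the state actually flips.

    A flip only happens after `required` consecutive checks disagree
    with the current state, which damps flapping alerts.
    """
    state = "OK"
    streak = 0
    events = []
    for i, ok in enumerate(checks):
        observed = "OK" if ok else "NOT OK"
        if observed == state:
            streak = 0  # agreement resets the disagreement streak
            continue
        streak += 1
        if streak >= required:
            state = observed
            streak = 0
            events.append((i, state))
    return events


if __name__ == "__main__":
    checks = [True, False, True, False, False, False, False, True, True, True]
    print(state_changes(checks))
```

The single failed check at index 1 never pages anyone; only the sustained failure and the sustained recovery produce events.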


  14. Automation
    ‣ Can a Machine read my data?
    ‣ Autoscale
    ‣ Trend detection
    ‣ Service Level Roll Ups in Alerting
    ‣ Reporting


  15. Capacity Planning
    ‣ Predicting System Stress Levels
    ‣ Make Intelligent Projections
    ‣ Test those predictions
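
As a rough illustration of "make intelligent projections", the sketch below fits a straight line to daily usage samples and estimates how many days remain before a limit is reached. A linear trend is an assumption chosen for brevity; real workloads may need seasonal or non-linear models. All names here are hypothetical.

```python
from typing import List, Optional, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until(limit: float, days: List[float], used: List[float]) -> Optional[float]:
    """Days after the last sample until the fitted line reaches `limit`."""
    slope, intercept = fit_line(days, used)
    if slope <= 0:
        return None  # usage is flat or shrinking: no projected exhaustion
    return (limit - intercept) / slope - days[-1]


if __name__ == "__main__":
    days = [0, 1, 2, 3, 4]
    disk_used_pct = [50, 52, 54, 56, 58]
    print(days_until(90, days, disk_used_pct))
```

Testing the prediction then amounts to comparing the fitted value for a later day against what was actually measured.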


  16. Exploration


  17. Attractive Features


  18. Open and Extensible
    ‣ Integrations with other projects
    ‣ Open API
    ‣ Good Documentation
    ‣ Modular / Plugin Structure
    ‣ Community


  19. Reliability vs. Performance


  20. Retention
    ‣ How easy is it to age data off?
    ‣ What regulations or laws apply to the data?
    ‣ Expectations from:
    ‣ Customers
    ‣ Employees
    ‣ Managers
    ‣ Peers
    ‣ Legal
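
"How easy is it to age data off?" is easiest to answer when data is partitioned by date. The sketch below assumes daily index names ending in `YYYY.MM.DD`, the pattern visible in the es-search.pl output later in this deck, and simply lists which names fall outside a retention window. The function name and the 30-day window are hypothetical; actually deleting anything is left out on purpose.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def expired_indexes(names: Iterable[str], keep_days: int, today: date) -> List[str]:
    """Return the dated index names older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        suffix = name.rsplit("-", 1)[-1]
        try:
            day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            continue  # not a dated index: leave it alone
        if day < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    names = ["ams4-access-2015.06.03", "ams4-access-2015.05.01", "kibana"]
    print(expired_indexes(names, keep_days=30, today=date(2015, 6, 3)))
```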


  21. Privacy and Security
    ‣ What do you keep on your users in your ops data?
    ‣ Who might come calling for it?
    ‣ How comfortable are you handing it over to Trump?
    ‣ Anyone hear about MongoDB?
    ‣ Can the store provide security?


  22. Places People Stick DevOps Datas


  23. MySQL
    ‣ Large Community
    ‣ Forks and Oracle
    ‣ Performance First
    ‣ SQL Interface
    ‣ Limited Data Types
    ‣ Web > BI
    ‣ Suitable for Administrative Data


  24. PostgreSQL
    ‣ Large Community
    ‣ Reliability First
    ‣ Open and Extensible
    ‣ PGXN
    ‣ CitusData
    ‣ GreenPlum
    ‣ EnterpriseDB
    ‣ Native Support for IP Addresses
    ‣ Extensible Data Types
    ‣ Suitable for Administrative Data


  25. Graphite
    ‣ Large Community
    ‣ Interchangeable Components, à la Microservices
    ‣ Simple API
    ‣ Rampant Open Source Adoption
    ‣ Scalable
    ‣ Compatibility
    ‣ Grafana, Statsd, Riemann, Bosun, Cabot, Seyren
    ‣ etc., etc.,
    ‣ Suitable for Time Series Data
    ‣ Smallest Resolution: seconds


  26. Metrics: Wildcards
    security.logging.indexer.*.total


  27. Combining Metrics
    sumSeries(security.logging.indexer.*.total)


  28. Comparing Metrics
    alias(sumSeries(security.logging.indexer.*.total),"Today")
    alias(
      timeShift(
        sumSeries(security.logging.indexer.*.total),
        "7d"),
      "Last Week")
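
The targets on these Graphite slides can be sent to Graphite's render endpoint as repeated `target` parameters. The sketch below only builds the URL; the host name is a placeholder, and the `from` and `format` values are assumptions picked for the example.

```python
from typing import List
from urllib.parse import urlencode

# Hypothetical Graphite host; replace with a real one.
GRAPHITE = "https://graphite.example.com"


def render_url(targets: List[str], since: str = "-1d", fmt: str = "json") -> str:
    """Build a /render URL with one `target` parameter per series expression."""
    params = [("target", t) for t in targets]
    params += [("from", since), ("format", fmt)]
    return f"{GRAPHITE}/render?{urlencode(params)}"


if __name__ == "__main__":
    base = "sumSeries(security.logging.indexer.*.total)"
    targets = [
        f'alias({base},"Today")',
        f'alias(timeShift({base},"7d"),"Last Week")',
    ]
    print(render_url(targets))
```

Keeping the wildcard expression in one variable makes it harder for "Today" and "Last Week" to drift apart when the metric path changes.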


  29. Advanced Tricks
    alias(alpha(color(areaBetween(
      holtWintersConfidenceBands(
        maxSeries(general.es.*.jvm.mem.heap_used_bytes)
      )
    ),"gray"),0.1),"Holt-Winters Confidence Bands")
    color(alias(
      maxSeries(general.es.*.jvm.mem.heap_used_bytes),
      "Max Heap Size"),"red")


  30. OpenTSDB
    ‣ Metrics 2.0
    ‣ Hadoop / HBase backed
    ‣ SQL-like Language
    ‣ Zero Data Loss
    ‣ Compatibility
    ‣ Carbon, Grafana, Statsd, Riemann, Bosun
    ‣ Suitable for Time Series Data
    ‣ Smallest Resolution: milliseconds


  31. InfluxDB
    ‣ Metrics 2.0
    ‣ SQL-like Language
    ‣ Zero Data Loss
    ‣ Compatibility
    ‣ Carbon, Grafana, Statsd, Riemann, Bosun
    ‣ Suitable for Time Series Data
    ‣ Smallest Resolution: nanoseconds


  32. ElasticSearch
    ‣ Well Documented API
    ‣ Many Open Source Integrations
    ‣ Lucene backed text search
    ‣ Scalable
    ‣ "Jepsen ElasticSearch" re:CAP


  33. Web Attacks Scanners


  34. Slow Pages


  35. Search Stuff!
    App::ElasticSearch::Utilities
    https://github.com/reyjrar/es-utils
    $ es-search.pl --base access dst:www.booking.com and crit:404 \
        --show src_ip,src_ip_country,file --size 2
    = Querying Indexes: lhr4-access-2015.06.03,ams4-access-2015.06.03
    @timestamp src_ip src_ip_country file
    2015-06-03T11:39:27+0200 217.36.201.217 GB /B1D671CF-E532-4481-99AA-19F420D90332/netdefender/hui/ndhui.css
    2015-06-03T11:39:26+0200 92.56.217.84 ES /hotel/es/null.es.html
    # Search Parameters:
    # {"query_string":{"query":"dst:www.booking.com AND crit:404"}}
    # Displaying 3 of (CENSORED) in 0 seconds.
    # Indexes (2 of 4) searched: ams4-access-2015.06.03,lhr4-access-2015.06.03


  36. Aggregate Stuff
    https://github.com/reyjrar/es-utils
    $ es-search.pl --base access --days 1 dst:www.booking.com \
    --top src_ip --size 3
    = Querying Indexes: ams4-access-2015.06.03,lhr4-access-2015.06.03
    count src_ip
    (CENSORED) 66.249.92.71
    (CENSORED) 66.249.92.59
    (CENSORED) 66.249.92.65
    # Search Parameters:
    # {"query_string":{"query":"dst:www.booking.com"}}
    # Displaying 3 of (CENSORED) in 5 seconds.
    # Indexes (2 of 2) searched: ams4-access-2015.06.03,lhr4-access-2015.06.03
    #
    # Totals across batch
    #
    count src_ip
    (CENSORED) 66.249.92.71
    (CENSORED) 66.249.92.59
    (CENSORED) 66.249.92.65


  37. Find Pages Viewed by Most Countries
    https://github.com/reyjrar/es-utils
    $ es-search.pl --base access --days 1 dst:www.booking.com \
    --top file --by cardinality:src_ip_country --size 3
    = Querying Indexes: lhr4-access-2015.06.03,ams4-access-2015.06.03
    cardinality:src_ip_country count file
    239 (CENSORED) /
    236 (CENSORED) /rt_data/city_bookings
    234 (CENSORED) /wishlist/get
    # Search Parameters:
    # {"query_string":{"query":"dst:www.booking.com"}}
    # Displaying 3 of (CENSORED) in 21 seconds.
    # Indexes (2 of 2) searched: ams4-access-2015.06.03,lhr4-access-2015.06.03
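
For readers who want to see what a "top N by distinct count" request can look like in Elasticsearch's query DSL, here is a sketch that builds such a body: a terms aggregation ordered by a cardinality sub-aggregation. It is an illustration of the idea, not necessarily the request es-search.pl sends; the function and aggregation names are hypothetical.

```python
import json
from typing import Any, Dict


def top_by_cardinality(query: str, field: str, distinct_field: str, size: int = 3) -> Dict[str, Any]:
    """Build a search body: top `field` values ordered by distinct `distinct_field`."""
    return {
        "size": 0,  # only aggregations are wanted, not individual hits
        "query": {"query_string": {"query": query}},
        "aggs": {
            "top": {
                "terms": {
                    "field": field,
                    "size": size,
                    # order buckets by the sub-aggregation defined below
                    "order": {"distinct": "desc"},
                },
                "aggs": {
                    "distinct": {"cardinality": {"field": distinct_field}},
                },
            }
        },
    }


if __name__ == "__main__":
    body = top_by_cardinality("dst:www.booking.com", "file", "src_ip_country")
    print(json.dumps(body, indent=2))
```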


  38. Pipeline Queries
    https://github.com/reyjrar/es-utils
    $ es-search.pl --base access --days 1 dst:www.booking.com \
    --top src_ip --by sum:attack_score --size 3 \
    --data-file top_attackers.dat
    $ es-search.pl --base access --days 1 dst:www.booking.com \
    src_ip:top_attackers.dat[-1] --size 3 \
    --show attack_score,src_ip,crit,dst,method,resource \
    --sort attack_score:desc


  39. Pipeline Queries
    https://github.com/reyjrar/es-utils
    = Querying Indexes: lhr4-access-2015.06.03,ams4-access-2015.06.03
    @timestamp attack_score src_ip crit dst method resource
    2015-06-03T04:20:59+0200 340 107.150.42.90 404 www.booking.com GET /plus/search.php?keyword=as&typeArr[111%3D@`%5C'`)+/*!50000And*/+(/*!50000SeLECT*/+1+/*!50000frOM*/+(/*!50000SeLECT*/+/*!50000Count(*)*/,concat(floor(rand(0)*2),(substring((/*!50000SeLECT*/+CONCAT(0x40,userid,0x7c,substring(pwd,4,16))+from+`%23@__admin`+limit+0,1),1,62)))a+/*!50000fRom*/+information_schema.tables+/*!50000gROUP*/+by+a)b)%23@`%5C'`+]=a
    2015-06-03T00:50:43+0200 340 107.150.42.90 404 www.booking.com GET /plus/search.php?keyword=as&typeArr[111%3D@`%5C'`)+/*!50000And*/+(/*!50000SeLECT*/+1+/*!50000frOM*/+(/*!50000SeLECT*/+/*!50000Count(*)*/,concat(floor(rand(0)*2),(substring((/*!50000SeLECT*/+CONCAT(0x40,userid,0x7c,substring(pwd,4,16))+from+`%23@__admin`+limit+0,1),1,62)))a+/*!50000fRom*/+information_schema.tables+/*!50000gROUP*/+by+a)b)%23@`%5C'`+]=a
    2015-06-03T05:18:19+0200 340 107.150.42.90 404 www.booking.com GET /plus/search.php?keyword=as&typeArr[111%3D@`%5C'`)+/*!50000And*/+(/*!50000SeLECT*/+1+/*!50000frOM*/+(/*!50000SeLECT*/+/*!50000Count(*)*/,concat(floor(rand(0)*2),(substring((/*!50000SeLECT*/+CONCAT(0x40,userid,0x7c,substring(pwd,4,16))+from+`%23@__admin`+limit+0,1),1,62)))a+/*!50000fRom*/+information_schema.tables+/*!50000gROUP*/+by+a)b)%23@`%5C'`+]=a
    # Search Parameters:
    # {"terms":{"src_ip":["107.150.42.90","37.59.7.157","74.84.138.120"]}}
    # {"query_string":{"query":"dst:www.booking.com"}}
    # Displaying 3 of 793 in 0 seconds.
    # Indexes (2 of 2) searched: ams4-access-2015.06.03,lhr4-access-2015.06.03
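
The search parameters above show the shape of the second stage: the original query string plus a `terms` clause built from the first stage's results. The sketch below assembles a comparable body with a `bool` query. Combining the clauses this way, and the function and parameter names, are assumptions for illustration; older Elasticsearch versions express the filter differently, and es-search.pl may do so too.

```python
from typing import Any, Dict, List


def second_stage_query(
    query: str,
    field: str,
    values: List[str],
    sort_field: str,
    size: int = 3,
) -> Dict[str, Any]:
    """Restrict `query` to documents whose `field` is one of `values`."""
    return {
        "size": size,
        "sort": [{sort_field: {"order": "desc"}}],
        "query": {
            "bool": {
                "must": [{"query_string": {"query": query}}],
                # values collected from the first-stage aggregation
                "filter": [{"terms": {field: values}}],
            }
        },
    }


if __name__ == "__main__":
    top_attackers = ["107.150.42.90", "37.59.7.157", "74.84.138.120"]
    print(second_stage_query("dst:www.booking.com", "src_ip", top_attackers, "attack_score"))
```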


  40. Recap


  41. Data Types
    Store          Meta-Data  State  Time Series  Events
    Graphite       No         No     Yes          Kinda
    InfluxDB       Kinda      Yes    Yes          Kinda
    OpenTSDB       Kinda      Yes    Yes          Kinda
    MySQL          Yes        Yes    No           Kinda
    PostgreSQL     Yes        Yes    No           Kinda
    ElasticSearch  No         Kinda  Kinda        Yes


  42. Features of Your Data
    Store          Interval           Cardinality              Data Type    Aging
    Graphite       Fixed, Regular     Low                      Numeric      Roll up
    InfluxDB       Fixed (best), Any  High                     Any          Configurable
    OpenTSDB       Any                High                     Numeric      n/a
    MySQL          Any                Keys: Low, Values: High  Structured*  None
    PostgreSQL     Any                Keys: Low, Values: High  Structured*  None
    ElasticSearch  Any                Keys: Low, Values: High  Any          None
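
"Roll up" aging means older points are kept at a coarser resolution instead of being dropped outright. The sketch below averages fixed-size groups of points, with `None` standing in for missing samples. Averaging is only one possible aggregation and the function name is hypothetical; this is not Graphite's storage code.

```python
from typing import List, Optional


def roll_up(points: List[Optional[float]], factor: int) -> List[Optional[float]]:
    """Average each run of `factor` points into one coarser point."""
    rolled: List[Optional[float]] = []
    for start in range(0, len(points), factor):
        known = [p for p in points[start:start + factor] if p is not None]
        # A group with no known samples stays missing after the roll-up.
        rolled.append(sum(known) / len(known) if known else None)
    return rolled


if __name__ == "__main__":
    ten_second_points = [1.0, 2.0, 3.0, None, None, None, 4.0, None]
    print(roll_up(ten_second_points, 3))  # three 10s points per 30s point
```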


  43. Features of the Store
    Store          Security  Scalability  Performance  Reliability
    Graphite       Low       High         High*        Medium
    InfluxDB       Low       High         Medium       High
    OpenTSDB       Low       High         Low          High
    MySQL          Medium    Medium       Medium*      High*
    PostgreSQL     High      Medium       Medium*      High
    ElasticSearch  Low       High         High         Low


  44. Thank you!
    https://twitter.com/reyjrar
    https://github.com/reyjrar
    https://speakerdeck.com/reyjrar
    https://www.craigslist.org/about/craigslist_is_hiring
