
Event Logging at Belly

Using Fluentd to move logs into HDFS, S3, and RabbitMQ

Kevin Reedy

October 23, 2013

Transcript

  1. The Nation’s Largest Loyalty Network

    “From Sea To Shining Sea” - literally. Tap into Belly’s nationwide network of millions of active Members and thousands of Merchant Partners. Interact with more customers better.

    • Tens of Millions of Visits
    • Millions of Members
    • Thousands of Locations
  2. Event Logging at Belly @kevinreedy: What is Belly?

    • A digital customer loyalty platform
    • One card or app for the entire network
    • Over 2,000,000 members
    • Over 8,000 merchants
    • Over 15,000,000 visits
    • Unique rewards
  3. Who is this Kevin guy?

    • Grew up “in Chicago”
    • Networks and Telecom degree from DePaul University
    • Infrastructure Architect on the very small systems team at Belly
    • At Belly for a year
    • Previously at Tap.Me, OpenDNS, Riverbed
  4. Event Logging - Inputs

    • User Events
    • Client Events
    • API Events
    • Business Events
  5. API Events

    • User Creation
    • Checkins
    • Reward Redemption
    • Email
      • Send
      • Delivered
      • Opened
      • Clicked
      • Spam Reported
  6. Fluentd

    • Open-source tool to collect events and logs
    • Code and plugins written in Ruby
    • Everything is JSON
    • 150+ plugins, including Hadoop, S3, MongoDB, Elasticsearch
    • Provides disk and RAM buffering
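
To make "everything is JSON" concrete, here is a minimal Python sketch of the shape of a Fluentd event: a tag, a timestamp, and a JSON record. The tag `api.checkin` and the field names are hypothetical and do not come from the talk.

```python
import json
import time


def make_event(tag, record, timestamp=None):
    """Return a Fluentd-style event as a (tag, time, record) triple."""
    if not isinstance(record, dict):
        raise TypeError("record must be a JSON object (dict)")
    if timestamp is None:
        timestamp = int(time.time())
    return (tag, timestamp, record)


def to_json_line(event):
    """Serialize an event as a JSON array: [tag, time, record]."""
    tag, timestamp, record = event
    return json.dumps([tag, timestamp, record], sort_keys=True)


# Hypothetical check-in event; the tag and fields are illustrative only.
event = make_event("api.checkin", {"user_id": 42, "merchant_id": 7},
                   timestamp=1382486400)
line = to_json_line(event)
```

The tag is what routing decisions are made on, so a dotted naming scheme such as `source.event_type` tends to make later per-tag outputs easier to configure.
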
  7. Bee

    • Data Validation and Sanitization Service
    • Written in Ruby
    • Rules are defined in YAML
    • Events from untrusted sources must first pass through Bee
    • This includes mobile clients
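
The talk does not show Bee's rule format, so the following is only a sketch of the general idea of rule-driven validation and sanitization. It is written in Python rather than Ruby, and the rule layout, event type, and field names are all hypothetical. The rules appear as the dict a YAML loader might produce.

```python
# Hypothetical rules, shown as the dict a YAML loader might produce.
RULES = {
    "checkin": {
        "required": {"user_id": int, "merchant_id": int},
        "optional": {"note": str},
    },
}


class ValidationError(ValueError):
    """Raised when an untrusted event does not satisfy its rule."""


def validate(event_type, payload, rules=RULES):
    """Return a sanitized copy of payload, or raise ValidationError."""
    if event_type not in rules:
        raise ValidationError(f"unknown event type: {event_type}")
    rule = rules[event_type]
    clean = {}
    for field, expected in rule["required"].items():
        if field not in payload:
            raise ValidationError(f"missing field: {field}")
        if not isinstance(payload[field], expected):
            raise ValidationError(f"wrong type for field: {field}")
        clean[field] = payload[field]
    for field, expected in rule.get("optional", {}).items():
        if field in payload:
            value = payload[field]
            if not isinstance(value, expected):
                raise ValidationError(f"wrong type for field: {field}")
            # Sanitize strings by trimming surrounding whitespace.
            clean[field] = value.strip() if isinstance(value, str) else value
    # Fields not named in the rule are dropped rather than passed through.
    return clean
```

Building the output from the rule, instead of deleting bad fields from the input, means anything the rule does not mention never reaches the logging pipeline.
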
  8. API Layer

    • API Servers and Bee Servers run Fluentd locally
    • Fast response time
    • Buffering locally if there are issues
    • Fluentd also has a UDP input if you’d rather have best-effort delivery
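
As a rough illustration of the best-effort option, the sketch below sends one JSON record to a local listener as a single UDP datagram. It assumes a UDP input configured to parse JSON; the port 5160 and the record fields are assumptions, not values from the talk.

```python
import json
import socket


def send_event_udp(record, host="127.0.0.1", port=5160):
    """Fire-and-forget: send one JSON record as a single UDP datagram.

    No acknowledgement is awaited, so the event is lost if nothing is
    listening. The default port is an assumed value.
    """
    payload = json.dumps(record).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return len(payload)
```

The caller never blocks on the logging pipeline, which is the appeal for request-path code. The cost is that dropped datagrams go unnoticed.
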
  9. API ➡ Aggregation Layer

    • Each API Fluentd instance forwards to two sets of servers:
      • Fluentd aggregators in Chicago
      • Fluentd aggregators in EC2 us-east-1
    • Currently each aggregator set is 2 hosts
    • API Fluentd instances treat each aggregation set as Master/Master, but this is configurable
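
The fan-out described above can be sketched in Python as follows. This models the behavior only and is not Fluentd's implementation: every event is copied to each aggregator set, and within a set both hosts take traffic in rotation, with the other host tried if one is unreachable. The `transport` callable and any host names passed in are hypothetical.

```python
import itertools


class AggregatorSet:
    """A group of equivalent aggregators; every host takes traffic."""

    def __init__(self, name, hosts):
        self.name = name
        self.hosts = list(hosts)
        self._cycle = itertools.cycle(range(len(self.hosts)))

    def send(self, event, transport):
        """Try each host once, starting from the next one in rotation."""
        start = next(self._cycle)
        for offset in range(len(self.hosts)):
            host = self.hosts[(start + offset) % len(self.hosts)]
            try:
                transport(host, event)
                return host
            except ConnectionError:
                continue
        raise ConnectionError(f"no aggregator reachable in set {self.name}")


def forward(event, sets, transport):
    """Copy the event to every aggregator set; return the host used per set."""
    return {s.name: s.send(event, transport) for s in sets}
```

An active/standby arrangement would differ only in `send`: always start from the first host and move to the second on failure.
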
  10. Aggregation Layer

    • Limits the number of connections to Output Layer
    • Provides disk buffering for S3 and HDFS Outputs
    • Provides minimal buffering for RabbitMQ Outputs
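
The buffering role of this layer can be sketched as a chunked queue that retains data while the destination is unavailable. The Python version below keeps chunks in memory for brevity, whereas the slide describes buffering on disk. The class name, the `write_chunk` callable, and the chunk size are assumptions.

```python
class BufferedOutput:
    """Collect events into chunks and keep them until a write succeeds."""

    def __init__(self, write_chunk, chunk_limit=3):
        self.write_chunk = write_chunk  # callable taking a list of events
        self.chunk_limit = chunk_limit
        self.current = []  # chunk still accepting events
        self.queue = []    # full chunks waiting to be written

    def emit(self, event):
        """Add an event; enqueue and try to flush when the chunk is full."""
        self.current.append(event)
        if len(self.current) >= self.chunk_limit:
            self.queue.append(self.current)
            self.current = []
            self.flush()

    def flush(self):
        """Write queued chunks in order; stop at the first failure."""
        while self.queue:
            try:
                self.write_chunk(self.queue[0])
            except OSError:
                return False  # chunk stays queued for a later retry
            self.queue.pop(0)
        return True
```

Larger chunks suit batch destinations such as S3 and HDFS, while a very small chunk limit approximates the minimal buffering mentioned for RabbitMQ.
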
  11. Aggregation ➡ Output Layer

    • Forest Output Plugin
    • Output Plugins for Fluentd
      • S3
        • https://github.com/fluent/fluent-plugin-s3
      • HDFS
        • https://github.com/fluent/fluent-plugin-webhdfs
        • HttpFs instead of WebHDFS
      • RabbitMQ
        • https://github.com/restorando/fluent-plugin-amqp
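
The Forest plugin creates a separate output per tag from a single template. The Python sketch below models that idea only: it fills a `${tag}` placeholder in a template the first time a tag is seen and reuses the result afterwards. The template keys and paths in the usage are hypothetical, not Belly's configuration.

```python
def expand_template(template, tag):
    """Fill ${tag} in every string value of a per-tag output template."""
    return {
        key: value.replace("${tag}", tag) if isinstance(value, str) else value
        for key, value in template.items()
    }


class ForestRouter:
    """Create one output configuration per tag, on first sight of the tag."""

    def __init__(self, template):
        self.template = template
        self.outputs = {}  # tag -> expanded configuration

    def config_for(self, tag):
        if tag not in self.outputs:
            self.outputs[tag] = expand_template(self.template, tag)
        return self.outputs[tag]


# Hypothetical template: each tag gets its own S3 prefix.
router = ForestRouter({"type": "s3", "path": "events/${tag}/"})
checkin_config = router.config_for("api.checkin")
```

With this approach a new event type needs no configuration change, because its output is derived from the tag when the first event arrives.
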
  12. Applications

    • Ad-hoc queries in Hive
    • Hiverunner
    • User Facts Service
    • Honey Reports
    • Checkins Map
  13. Future Plans

    • Evaluate using RabbitMQ as the Aggregation Layer
    • Application Logs as an Input
    • Splunk or Elasticsearch / Kibana as Outputs