Event Logging at Belly

Using Fluentd to move logs into HDFS, S3, and RabbitMQ

Kevin Reedy

October 23, 2013

Transcript

  1. Kevin Reedy
    @kevinreedy
    [email protected]
    http://tech.bellycard.com
    Event Logging at Belly
    Using Fluentd to move logs into HDFS, S3, and RabbitMQ

  2. Event Logging at Belly @kevinreedy
    What is Belly?

  3. Event Logging at Belly @kevinreedy

  4. The Nation’s Largest Loyalty Network
    “From Sea To Shining Sea” - literally.
    Tap into Belly’s nationwide network of millions of active Members and thousands of Merchant Partners.
    Interact with more customers better.
    Tens of Millions of Visits
    Millions of Members
    Thousands of Locations

  5. Event Logging at Belly @kevinreedy
    What is Belly?
    •A digital customer loyalty platform
    •One card or app for the entire network
    •Over 2,000,000 members
    •Over 8,000 merchants
    •Over 15,000,000 visits
    •Unique rewards

  6. Event Logging at Belly @kevinreedy
    Who is this Kevin guy?
    • Grew up “in Chicago”
    • Networks and Telecom degree from DePaul University
    • Infrastructure Architect on the very small systems team at Belly
    • At Belly for a year
    • Previously at Tap.Me, OpenDNS, Riverbed

  7. Event Logging at Belly @kevinreedy
    Event Logging
    At Belly

  8. Event Logging at Belly @kevinreedy
    Event Logging - Inputs
    •User Events
    •Client Events
    •API Events
    •Business Events

  9. Event Logging at Belly @kevinreedy
    User Events
    •Mobile Applications
    •In-Store Tablet Application
    •Homepage

  10. Event Logging at Belly @kevinreedy
    Client Events
    •In-store tablet
    •Heartbeats
    •Error Logs

  11. Event Logging at Belly @kevinreedy
    API Events
    • User Creation
    • Checkins
    • Reward Redemption
    • Email
      • Send
      • Delivered
      • Opened
      • Clicked
      • Spam Reported

  12. Event Logging at Belly @kevinreedy
    Business Events
    •Salesforce Callbacks
    •Meetings
    •Installs
    •Closes

  13. Event Logging at Belly @kevinreedy
    Event Logging - Outputs
    •Hadoop HDFS
    •S3
    •RabbitMQ

  14. Event Logging at Belly @kevinreedy
    Fluentd
    •Open-source tool to collect events and logs
    •Code and Plugins written in Ruby
    •Everything is JSON
    •150+ plugins, including Hadoop, S3, MongoDB, Elasticsearch
    •Provides Disk and RAM Buffering
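
To make "Everything is JSON" concrete: a Fluentd event is conventionally described as a tag, a timestamp, and a JSON-like record. The following is a minimal sketch of that shape in Python. The tag `belly.checkin` and the record fields are hypothetical examples, not names taken from the talk, and the sketch only builds and serializes an event; it does not talk to a Fluentd process.

```python
import json
import time


def make_event(tag, record, timestamp=None):
    """Return a Fluentd-style event as a [tag, time, record] list."""
    if timestamp is None:
        timestamp = int(time.time())
    return [tag, timestamp, record]


def encode_event(event):
    """Serialize the event to a compact JSON string with sorted keys."""
    return json.dumps(event, separators=(",", ":"), sort_keys=True)


# Hypothetical tag and fields, chosen only for illustration.
event = make_event(
    "belly.checkin",
    {"user_id": 42, "merchant_id": 7},
    timestamp=1382486400,
)
print(encode_event(event))
```

Because the record is plain JSON, any output plugin can consume it without knowing which application produced it.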

  15. Event Logging at Belly @kevinreedy
    Bee
    •Data Validation and Sanitization Service
    •Written in Ruby
    •Rules are defined in YAML
    •Events from untrusted sources must first pass through Bee
    •This includes mobile clients
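
The talk does not show Bee's rule format, so the following is only a sketch of what rule-driven validation and sanitization might look like. It is written in Python rather than Bee's Ruby, and the `RULES` layout, the `validate` function, and the field names are all hypothetical. The rules are shown as the dictionary a YAML loader would produce, to keep the example dependency-free.

```python
# Hypothetical rules, shown as the dict a YAML loader would produce.
RULES = {
    "checkin": {
        "required": {"user_id": int, "merchant_id": int},
        "optional": {"note": str},
    },
}


def validate(event_type, payload, rules=RULES):
    """Return a sanitized copy of payload, or raise ValueError."""
    rule = rules.get(event_type)
    if rule is None:
        raise ValueError(f"unknown event type: {event_type}")
    clean = {}
    for field, kind in rule["required"].items():
        if field not in payload:
            raise ValueError(f"missing field: {field}")
        if not isinstance(payload[field], kind):
            raise ValueError(f"bad type for field: {field}")
        clean[field] = payload[field]
    for field, kind in rule.get("optional", {}).items():
        if isinstance(payload.get(field), kind):
            value = payload[field]
            # Sanitize strings: trim whitespace and cap the length.
            clean[field] = value.strip()[:200] if kind is str else value
    # Fields not named in the rules are dropped rather than forwarded.
    return clean
```

Dropping unknown fields, instead of passing them through, is one way to keep data from untrusted clients such as mobile apps from leaking into downstream stores.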

  16. Event Logging at Belly @kevinreedy
    Bee

  17. Event Logging at Belly @kevinreedy
    Putting it all together

  18. Event Logging at Belly @kevinreedy
    API Layer
    •API Servers and Bee Servers run Fluentd locally
    •Fast response time
    •Buffering locally if there are issues
    •Fluentd also has a UDP input if you’d rather have best-effort delivery
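
The reason a local agent gives both a fast response and protection against downstream problems can be sketched as a small queue: the application hands off the event immediately, and delivery is retried later if the next hop is unavailable. This is an illustration of the idea only, not Fluentd's actual buffer implementation; the `BufferedForwarder` class and its `send` callable are hypothetical.

```python
from collections import deque


class BufferedForwarder:
    """Queue events locally and flush them when the sender succeeds."""

    def __init__(self, send, limit=1000):
        self._send = send  # callable that raises OSError on failure
        self._queue = deque(maxlen=limit)  # oldest events drop when full

    def emit(self, event):
        """Accept an event immediately, then try to flush."""
        self._queue.append(event)
        self.flush()

    def flush(self):
        """Send queued events in order; stop at the first failure."""
        while self._queue:
            try:
                self._send(self._queue[0])
            except OSError:
                return False
            self._queue.popleft()
        return True

    def pending(self):
        """Number of events still waiting to be delivered."""
        return len(self._queue)
```

An event is removed from the queue only after `send` returns, so a failed attempt leaves it in place for the next flush.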

  19. Event Logging at Belly @kevinreedy
    API ➡ Aggregation Layer
    • Each API Fluentd instance forwards to two sets of servers:
      • Fluentd aggregators in Chicago
      • Fluentd aggregators in EC2 us-east-1
    • Currently each aggregator set is 2 hosts
    • API Fluentd instances treat each aggregation set as Master/Master, but this is configurable
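
The routing described above can be sketched as: copy every event to both aggregator sets, and within each set pick one of the two hosts. The sketch assumes that "Master/Master" means both hosts in a set are active and share the load, and it uses plain round-robin for the choice; both are assumptions, and the host names are placeholders.

```python
import itertools


class AggregatorSet:
    """A group of equivalent hosts; each event goes to one of them."""

    def __init__(self, name, hosts):
        self.name = name
        self._cycle = itertools.cycle(hosts)

    def pick(self):
        """Return the next host in round-robin order."""
        return next(self._cycle)


def route(event, sets):
    """Copy the event to every set, choosing one host per set."""
    return [(s.name, s.pick(), event) for s in sets]


# Placeholder host names; the real ones are not given in the talk.
sets = [
    AggregatorSet("chicago", ["agg1.chi.example", "agg2.chi.example"]),
    AggregatorSet("us-east-1", ["agg1.ec2.example", "agg2.ec2.example"]),
]
```

Calling `route(event, sets)` returns one delivery per set, so each data center receives a full copy of the event stream.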

  20. Event Logging at Belly @kevinreedy
    Aggregation Layer
    •Limits the number of connections to Output Layer
    •Provides disk buffering for S3 and HDFS Outputs
    •Provides minimal buffering for RabbitMQ Outputs
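
Disk buffering suits batch-oriented stores such as S3 and HDFS, where many events are written as one object or file. A rough sketch of that pattern follows; the `DiskChunkBuffer` class, the `upload` callable, and the count-based chunk limit are all hypothetical simplifications, and Fluentd's own buffering is more involved than this.

```python
import json
import os


class DiskChunkBuffer:
    """Append events to a file; hand the whole chunk to `upload` when full."""

    def __init__(self, directory, upload, chunk_size=3):
        self._path = os.path.join(directory, "buffer.chunk")
        self._upload = upload  # callable that receives the chunk as text
        self._chunk_size = chunk_size
        self._count = 0

    def emit(self, record):
        """Persist one record as a JSON line, flushing when the chunk is full."""
        with open(self._path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        self._count += 1
        if self._count >= self._chunk_size:
            self.flush()

    def flush(self):
        """Upload the current chunk, then delete it from disk."""
        if self._count == 0:
            return
        with open(self._path, encoding="utf-8") as fh:
            self._upload(fh.read())
        os.remove(self._path)
        self._count = 0
```

A message queue output would use a chunk size of one, or skip the file entirely, since each event is published on its own.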

  21. Event Logging at Belly @kevinreedy
    Aggregation ➡ Output Layer
    • Forest Output Plugin
    • Output Plugins for Fluentd
      • S3
        • https://github.com/fluent/fluent-plugin-s3
      • HDFS
        • https://github.com/fluent/fluent-plugin-webhdfs
        • HttpFS instead of WebHDFS
      • RabbitMQ
        • https://github.com/restorando/fluent-plugin-amqp
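
The idea behind a forest-style output, one output instance per tag built from a shared template, can be sketched as follows. This is not the plugin's code: the `ForestRouter` and `ListOutput` classes are hypothetical, and the `${tag}` placeholder and config keys are illustrative assumptions rather than a documented configuration.

```python
class ListOutput(list):
    """Stand-in output that just remembers its config and records."""

    def __init__(self, config):
        super().__init__()
        self.config = config


class ForestRouter:
    """Create one output per tag on first use, from a shared template."""

    def __init__(self, template, factory):
        self._template = template
        self._factory = factory  # builds an output from a config dict
        self._outputs = {}

    def _expand(self, tag):
        # Substitute the ${tag} placeholder in every string value.
        return {
            key: value.replace("${tag}", tag) if isinstance(value, str) else value
            for key, value in self._template.items()
        }

    def emit(self, tag, record):
        """Route the record to the output for its tag, creating it if needed."""
        if tag not in self._outputs:
            self._outputs[tag] = self._factory(self._expand(tag))
        self._outputs[tag].append(record)
        return self._outputs[tag]
```

With a template such as `{"path": "events/${tag}/"}`, each new event type gets its own destination path without a separate configuration entry.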

  22. Event Logging at Belly @kevinreedy
    Applications
    •Ad-hoc queries in Hive
    •Hiverunner
    •User Facts Service
    •Honey Reports
    •Checkins Map

  23. Event Logging at Belly @kevinreedy
    Honey Reports

  24. Event Logging at Belly @kevinreedy
    Honey Reports

  25. Event Logging at Belly @kevinreedy
    Checkins Map

  26. Event Logging at Belly @kevinreedy
    Future Plans
    •Evaluate using RabbitMQ as the Aggregation Layer
    •Application Logs as an Input
    •Splunk or Elasticsearch / Kibana as Outputs

  27. Event Logging at Belly @kevinreedy
    Questions?
    …if there is time

  28. Event Logging at Belly @kevinreedy
    Keep in touch
    •twitter: @kevinreedy
    •email: [email protected]
    •internets: http://tech.bellycard.com
