Billing the Cloud

This talk describes how Exoscale approaches usage metering and billing with Apache Kafka.

Pierre-Yves Ritschard

December 15, 2016

Transcript

  1. Quantities

    10 megabytes have been sent from 159.100.251.251 over the last minute
  2. Resources

    Account geneva-jug started instance foo with profile large today at 12:00
    Account geneva-jug stopped instance foo today at 12:15
  3. A bit closer to reality

    {:type     :usage
     :entity   :vm
     :action   :create
     :time     #inst "2016-12-12T15:48:32.000-00:00"
     :template "ubuntu-16.04"
     :source   :cloudstack
     :account  "geneva-jug"
     :uuid     "7a070a3d-66ff-4658-ab08-fe3cecd7c70f"
     :version  1
     :offering "medium"}
  4. A bit closer to reality

    message IPMeasure {
      /* Versioning */
      required uint32 header = 1;
      required uint32 saddr = 2;
      required uint64 bytes = 3;
      /* Validity */
      required uint64 start = 4;
      required uint64 end = 5;
    }
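The message above can be mirrored in Python to see how it relates to the "10 megabytes from 159.100.251.251 over the last minute" slide. This is only a sketch: the field name `nbytes`, the `describe` helper, the reading of `saddr` as a packed IPv4 address, and the use of epoch seconds for `start` and `end` are assumptions, not something the slides state.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IPMeasure:
    header: int  # version marker
    saddr: int   # assumed: IPv4 source address as an unsigned 32-bit integer
    nbytes: int  # the `bytes` field of the protobuf message
    start: int   # validity start (assumed: epoch seconds)
    end: int     # validity end (assumed: epoch seconds)

    def describe(self) -> str:
        addr = ipaddress.IPv4Address(self.saddr)
        return f"{self.nbytes} bytes sent from {addr} over {self.end - self.start}s"


measure = IPMeasure(
    header=1,
    saddr=int(ipaddress.IPv4Address("159.100.251.251")),
    nbytes=10_000_000,
    start=1481557680,
    end=1481557740,
)
print(measure.describe())
```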
  5. Solving for all events

    resources = {}
    metering = []

    def usage_metering():
        for event in fetch_all_events():
            uuid = event.uuid()
            time = event.time()
            if event.action() == 'start':
                resources[uuid] = time
            else:
                timespan = duration(resources[uuid], time)
                usage = Usage(uuid, timespan)
                metering.append(usage)
        return metering
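The slide leaves `fetch_all_events`, `duration` and `Usage` undefined. A self-contained variant might look like the sketch below, where the `Event` and `Usage` dataclasses and the sample events are hypothetical stand-ins; it also ignores a stop event that has no matching start, which the slide version does not handle.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    uuid: str
    action: str  # "start" or "stop"
    time: datetime


@dataclass(frozen=True)
class Usage:
    uuid: str
    timespan: timedelta


def usage_metering(events):
    """Pair each stop event with the preceding start of the same resource."""
    started = {}  # uuid -> start time of resources currently running
    metering = []
    for event in sorted(events, key=lambda e: e.time):
        if event.action == "start":
            started[event.uuid] = event.time
        elif event.uuid in started:
            # pop() so that a duplicate stop cannot bill the same span twice
            begin = started.pop(event.uuid)
            metering.append(Usage(event.uuid, event.time - begin))
    return metering


events = [
    Event("foo", "start", datetime(2016, 12, 12, 12, 0)),
    Event("foo", "stop", datetime(2016, 12, 12, 12, 15)),
]
usage = usage_metering(events)
print(usage[0].timespan)  # 0:15:00
```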
  6. Practical matters

    This is a never-ending process
    Minute precision billing
    Only apply once an hour
    Avoid over billing at all cost
    Avoid under billing (we need to eat!)
  7. Drawbacks

    High pressure on SQL server
    Hard to avoid overlapping jobs
    Overlaps result in longer metering intervals
  8. You are in a room full of overlapping cron jobs.

    You can hear the screams of a dying MySQL server.
    An Oracle vendor is here.
    To the West, a door is marked "Map/Reduce".
    To the East, a door is marked "Streaming".
  9. Each event processed as it comes in

    Very low latency
    A never-ending reduce
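One way to picture a "never-ending reduce" is a fold whose state survives between events and which emits usage as soon as a stop arrives. The sketch below is an assumption about the shape of such a loop, not the code behind the talk; `step`, `stream_metering` and the tuple layout of events are hypothetical.

```python
from datetime import datetime, timedelta


def step(state, event):
    """Fold one event into the state; return (new_state, usage or None)."""
    uuid, action, time = event
    if action == "start":
        return {**state, uuid: time}, None
    if uuid in state:
        rest = {k: v for k, v in state.items() if k != uuid}
        return rest, (uuid, time - state[uuid])
    return state, None  # stop without a known start: nothing to bill


def stream_metering(events):
    """Consume events one at a time; `events` may be an unbounded iterator."""
    state = {}
    for event in events:
        state, usage = step(state, event)
        if usage is not None:
            yield usage


incoming = iter([
    ("foo", "start", datetime(2016, 12, 12, 12, 0)),
    ("bar", "start", datetime(2016, 12, 12, 12, 5)),
    ("foo", "stop", datetime(2016, 12, 12, 12, 15)),
])
emitted = list(stream_metering(incoming))
```

Here `emitted` holds a single 15-minute usage record for `foo`, while `bar` stays in the state until its stop event shows up.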
  10. Conceptually harder

    Where do we store intermediate results?
    How does data flow between computation steps?
  11. Operational simplicity

    Experience matters
    Spark and Storm are intimidating
    HBase & Hive discarded
  12. Integration

    HDFS would require simple integration
    Spark usually goes hand in hand with Cassandra
    Storm tends to prefer Kafka
  13. Publish & Subscribe

    Messages are produced to topics
    Topics have a predefined number of partitions
    Messages have a key which determines their partition
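The key-to-partition mapping can be illustrated with a few lines of Python. This is a sketch only: `partition_for` and `NUM_PARTITIONS` are hypothetical names, and `zlib.crc32` is a stand-in hash (Kafka's default Java partitioner uses murmur2), so the partition numbers will not match a real cluster.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical; fixed when the topic is created


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition deterministically."""
    # crc32 is a stand-in hash for illustration, not Kafka's own partitioner
    return zlib.crc32(key.encode("utf-8")) % num_partitions


first = partition_for("geneva-jug")
second = partition_for("geneva-jug")
```

Because the mapping depends only on the key, every event keyed by the same account lands on the same partition and is therefore seen in order by a single consumer.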
  14. Consumers get assigned a set of partitions

    Consumers store their last consumed offset
    Brokers own partitions, handle replication
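Storing the last consumed offset is what lets a consumer pick up where it left off. The toy model below treats one partition as a Python list; the `Consumer` class and its `poll` method are hypothetical and are not the Kafka client API.

```python
class Consumer:
    """Toy consumer for a single partition modelled as a list."""

    def __init__(self, committed: int = 0):
        self.offset = committed  # next offset to read

    def poll(self, log, max_records: int = 2):
        batch = log[self.offset:self.offset + max_records]
        self.offset += len(batch)
        return batch


log = ["e0", "e1", "e2", "e3", "e4"]

first = Consumer()
seen = first.poll(log)
committed = first.offset  # would be persisted somewhere durable

# A replacement consumer resumes from the committed offset, not from zero.
restarted = Consumer(committed)
resumed = restarted.poll(log)
```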
  15. Process crashes

    Triggers a rebalance
    Loss of in-memory cache
    No initial state!
  16. Reconciliation

    Snapshot of full inventory
    Converges stored resource state if necessary
    Handles failed deliveries as well
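Reconciliation against an inventory snapshot can be sketched as a set difference. The function below is an assumption about how such a step could work, with a hypothetical `reconcile` name and resources reduced to bare identifiers.

```python
def reconcile(stored, snapshot):
    """Return corrective events that converge `stored` towards `snapshot`.

    stored:   set of resource ids the metering state believes are running
    snapshot: set of resource ids actually present in the full inventory
    """
    corrections = []
    for uuid in snapshot - stored:
        corrections.append((uuid, "start"))  # the start event was never delivered
    for uuid in stored - snapshot:
        corrections.append((uuid, "stop"))   # the stop event was never delivered
    return sorted(corrections)


corrections = reconcile(stored={"a", "b"}, snapshot={"b", "c"})
```

With these inputs the result is `[("a", "stop"), ("c", "start")]`, and a state that already matches the snapshot produces no corrections.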
  17. Avoiding double billing

    Reconciler acts as logical clock
    When supplying usage, attach a unique transaction ID
    Reject multiple transaction attempts on a single ID
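Rejecting repeated attempts on one transaction ID makes usage submission idempotent. A minimal in-memory sketch follows; the `Ledger` class is hypothetical, and the ID format (resource name plus a reconciler tick) is an assumption rather than the scheme used in the talk.

```python
class Ledger:
    """Accepts each transaction ID at most once."""

    def __init__(self):
        self.seen = set()
        self.totals = {}  # account -> billed minutes

    def apply(self, txid: str, account: str, minutes: int) -> bool:
        if txid in self.seen:
            return False  # replayed delivery: reject, totals unchanged
        self.seen.add(txid)
        self.totals[account] = self.totals.get(account, 0) + minutes
        return True


ledger = Ledger()
accepted = ledger.apply("foo:42", "geneva-jug", 15)
replayed = ledger.apply("foo:42", "geneva-jug", 15)
```

A real deployment would need the seen-ID check and the balance update to happen atomically in durable storage; the set here only shows the logic.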
  18. Looking back

    Things stay simple (roughly 600 LoC)
    Room to grow
    Stable and resilient
    DNS, Logs, Metrics, Event Sourcing
  19. What about batch

    Streaming doesn't work for everything
    Sometimes throughput matters more than latency
    Building models in batch, applying with stream processing