Billing the Cloud
Pierre-Yves Ritschard
May 12, 2017
Updated billing the cloud slides for We are Developers 2017 in Vienna
Transcript
@pyr Billing the cloud Real world stream processing
@pyr Three-line bio • CTO & co-founder at Exoscale • Open Source Developer • Monitoring & Distributed Systems Enthusiast
@pyr • Billing resources • Scaling methodologies • Our approach
@pyr
provider "exoscale" {
  api_key    = "${var.exoscale_api_key}"
  secret_key = "${var.exoscale_secret_key}"
}

resource "exoscale_instance" "web" {
  template  = "ubuntu 17.04"
  disk_size = "50g"
  profile   = "medium"
  ssh_key   = "production"
}
@pyr Infrastructure isn’t free! (sorry)
@pyr Business Model • Provide cloud infrastructure • (???) • Profit!
@pyr 10,000-mile-high view
Quantities
Quantities • 10 megabytes have been sent from 159.100.251.251 over the last minute
Resources
Resources • Account WAD started instance foo with profile large today at 12:00 • Account WAD stopped instance foo today at 12:15
A bit closer to reality
{:type     :usage
 :entity   :vm
 :action   :create
 :time     #inst "2016-12-12T15:48:32.000-00:00"
 :template "ubuntu-16.04"
 :source   :cloudstack
 :account  "geneva-jug"
 :uuid     "7a070a3d-66ff-4658-ab08-fe3cecd7c70f"
 :version  1
 :offering "medium"}
A bit closer to reality
message IPMeasure {
  /* Versioning */
  required uint32 header = 1;
  required uint32 saddr  = 2;
  required uint64 bytes  = 3;
  /* Validity */
  required uint64 start  = 4;
  required uint64 end    = 5;
}
@pyr Theory
@pyr Quantities are simple
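Quantities are simple because metering them amounts to a sum. A minimal Python sketch, assuming measurements arrive as (source address, byte count) pairs; `Measure` and `total_bytes` are hypothetical names for illustration, not taken from the talk:

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Measure(NamedTuple):
    """Hypothetical measurement: bytes seen from one source address."""
    saddr: str
    nbytes: int


def total_bytes(measures: Iterable[Measure]) -> dict[str, int]:
    """Sum the byte counts per source address."""
    totals: dict[str, int] = defaultdict(int)
    for measure in measures:
        totals[measure.saddr] += measure.nbytes
    return dict(totals)
```

Because addition is commutative and associative, the measurements can arrive in any order and be summed in any grouping.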
@pyr Resources are harder
@pyr This is per account
@pyr Solving for all events
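def usage_metering():
    """Pair each start event with its matching stop and emit one Usage per pair."""
    # fetch_all_events, duration and Usage are assumed to be defined elsewhere.
    resources = {}  # uuid -> start time of resources still running
    metering = []   # completed usage records
    for event in fetch_all_events():
        uuid = event.uuid()
        time = event.time()
        if event.action() == 'start':
            resources[uuid] = time
        elif uuid in resources:
            # Close the interval; a stop with no known start is skipped.
            timespan = duration(resources.pop(uuid), time)
            metering.append(Usage(uuid, timespan))
    return metering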
@pyr In Practice
@pyr • This is a never-ending process • Minute-precision billing • Applied every hour
@pyr • Avoid overbilling at all costs • Avoid underbilling (we need to eat!)
@pyr • Keep a small operational footprint
@pyr A naive approach
30 * * * * usage-metering >/dev/null 2>&1
@pyr Advantages
@pyr • Low operational overhead • Simple functional boundaries • Easy to test
@pyr Drawbacks
@pyr • High pressure on SQL server • Hard to avoid overlapping jobs • Overlaps result in longer metering intervals
You are in a room full of overlapping cron jobs.
You can hear the screams of a dying MySQL server. An Oracle vendor is here. To the West, a door is marked “Map/Reduce”. To the East, a door is marked “Stream Processing”.
> Talk to Oracle
You’ve been eaten by a grue.
> Go West
@pyr • Conceptually simple • Spreads easily • Data locality aware processing
@pyr • ETL • High latency • High operational overhead
> Go East
@pyr • Continuous computation on an unbounded stream • Each record processed as it arrives • Very low latency
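Applied to resource metering, "each record processed as it arrives" could look like the following Python sketch. It assumes events carry a uuid, an action and a time in minutes; `Event`, `Usage` and `meter` are illustrative names, and the dictionary stands in for whatever store holds intermediate results:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """Hypothetical resource event; field names are illustrative."""
    uuid: str
    action: str  # "start" or "stop"
    time: int    # minutes since an arbitrary epoch


@dataclass(frozen=True)
class Usage:
    uuid: str
    minutes: int


def meter(events: Iterable[Event]) -> Iterator[Usage]:
    """Emit a Usage record as soon as a stop event closes a running resource."""
    running: dict[str, int] = {}  # intermediate state: uuid -> start time
    for event in events:
        if event.action == "start":
            running[event.uuid] = event.time
        elif event.action == "stop":
            started = running.pop(event.uuid, None)
            if started is not None:  # ignore a stop with no known start
                yield Usage(event.uuid, event.time - started)
```

Since `meter` is a generator, it never needs the full list of events: it can be fed from an unbounded iterator and yields usage as soon as each interval closes.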
@pyr • Conceptually harder • Where do we store intermediate results? • How does data flow between computation steps?
@pyr Deciding factors
@pyr Our shopping list • Operational simplicity • Integration through our whole stack • Room to grow
@pyr Operational simplicity • Experience matters • Spark and Storm are intimidating • HBase & Hive discarded
@pyr Integration • HDFS & Kafka require simple integration • Spark goes hand in hand with Cassandra
@pyr Room to grow • A ton of logs • A ton of metrics
@pyr Small confession • Previously knew Kafka
@pyr • Publish & Subscribe • Processing • Store
@pyr Publish & Subscribe • Records are produced on topics • Topics have a predefined number of partitions • Records have a key which determines their partition
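The key-to-partition mapping can be sketched as a hash of the key modulo the partition count. This is a minimal illustration, not Kafka's actual partitioner: `partition_for` is a hypothetical helper, and CRC32 is used only as a readily available stand-in for the hash a real client would apply.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is a stand-in hash for this sketch; real clients use their own.
    return zlib.crc32(key) % num_partitions
```

The useful property is determinism: every record sharing a key (an account name, say) lands on the same partition, so a single consumer sees all events for that key in order.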
@pyr • Consumers get assigned a set of partitions • Consumers store their last consumed offset • Brokers own partitions, handle replication
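Why the stored offset matters can be shown with an in-memory stand-in for a single partition. `Partition` and `Consumer` below are hypothetical classes for illustration only and do not mirror any Kafka client API:

```python
class Partition:
    """Hypothetical in-memory partition: an append-only list of records."""

    def __init__(self) -> None:
        self.records: list[str] = []

    def append(self, record: str) -> int:
        """Append a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1


class Consumer:
    """Reads a partition sequentially, starting from a committed offset."""

    def __init__(self, partition: Partition, committed: int = 0) -> None:
        self.partition = partition
        self.offset = committed  # next offset to read

    def poll(self) -> list[str]:
        """Return all records not yet consumed and advance the offset."""
        batch = self.partition.records[self.offset:]
        self.offset += len(batch)
        return batch
```

A consumer restarted with the offset its predecessor reached resumes exactly where the previous one stopped, without re-reading earlier records.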
@pyr • Stable consumer topology • Memory disaggregation • Can rely on in-memory storage • Age expiry and log compaction
@pyr Billing at Exoscale
@pyr Problem solved?
@pyr • Process crashes • Undelivered message? • Avoiding overbilling
@pyr Reconciliation • Snapshot of full inventory • Converges stored resource state if necessary • Handles failed deliveries as well
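One way to picture reconciliation is as a set difference between the state the metering pipeline believes in and an authoritative inventory snapshot. In this sketch, `reconcile` is a hypothetical helper and resources are reduced to their uuids:

```python
def reconcile(stored: set[str], snapshot: set[str]) -> list[tuple[str, str]]:
    """Return corrective (uuid, action) events that converge `stored` onto `snapshot`.

    `stored` holds the uuids the metering state believes are running;
    `snapshot` holds the uuids actually present in the inventory.
    """
    corrections: list[tuple[str, str]] = []
    for uuid in sorted(snapshot - stored):
        corrections.append((uuid, "start"))  # a start event was never delivered
    for uuid in sorted(stored - snapshot):
        corrections.append((uuid, "stop"))   # a stop event was never delivered
    return corrections
```

When the two views already agree, the function returns an empty list, so running it repeatedly is harmless.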
@pyr Avoiding overbilling • Reconciler acts as logical clock • When supplying usage, attach a unique transaction ID • Reject multiple transaction attempts on a single ID
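Rejecting repeated attempts on a single transaction ID is an idempotency check. A minimal sketch, assuming an in-memory ledger; `Ledger` and `record` are hypothetical names, and a real system would persist the set of seen IDs:

```python
class Ledger:
    """Hypothetical usage ledger that accepts each transaction ID once."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.entries: list[tuple[str, int]] = []  # (uuid, minutes)

    def record(self, txid: str, uuid: str, minutes: int) -> bool:
        """Store the usage and return True, or return False if txid was already used."""
        if txid in self._seen:
            return False  # a retry of the same transaction must not bill twice
        self._seen.add(txid)
        self.entries.append((uuid, minutes))
        return True
```

With this check in place, a producer that crashes and resubmits the same usage cannot cause double billing.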
@pyr Parting words
@pyr Looking back • Things stay simple (roughly 600 LoC) • Room to grow • Stable and resilient • DNS, Logs, Metrics, Event Sourcing
@pyr What about batch? • Streaming doesn’t work for everything • Sometimes throughput matters more than latency • Building models in batch, applying with stream processing
@pyr Thanks! Questions?