Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Billing the Cloud
Search
Pierre-Yves Ritschard
December 15, 2016
Technology
7
2.2k
Billing the Cloud
This talk describes how Exoscale approaches usage metering and billing with Apache Kafka
Pierre-Yves Ritschard
December 15, 2016
Tweet
Share
More Decks by Pierre-Yves Ritschard
See All by Pierre-Yves Ritschard
Meetup Camptocamp: Exoscale SKS
pyr
0
430
The (long) road to Kubernetes
pyr
0
310
From vertical to horizontal: The challenges of scalability in the cloud
pyr
0
64
Change Management at Scale
pyr
0
110
5 years of Clojure
pyr
2
1k
Taming Jenkins
pyr
0
48
Init: then and now
pyr
1
190
Billing the Cloud
pyr
0
300
From Vertical to Horizontal
pyr
2
140
Other Decks in Technology
See All in Technology
Understanding_Thread_Tuning_for_Inference_Servers_of_Deep_Models.pdf
lycorptech_jp
PRO
0
140
2025-06-26_Lightning_Talk_for_Lightning_Talks
_hashimo2
2
100
PHP開発者のためのSOLID原則再入門 #phpcon / PHP Conference Japan 2025
shogogg
4
860
データプラットフォーム技術におけるメダリオンアーキテクチャという考え方/DataPlatformWithMedallionArchitecture
smdmts
5
640
AIエージェント最前線! Amazon Bedrock、Amazon Q、そしてMCPを使いこなそう
minorun365
PRO
15
5.3k
ドメイン特化なCLIPモデルとデータセットの紹介
tattaka
0
150
生成AI時代 文字コードを学ぶ意義を見出せるか?
hrsued
1
590
Prox Industries株式会社 会社紹介資料
proxindustries
0
320
Navigation3でViewModelにデータを渡す方法
mikanichinose
0
220
Delegating the chores of authenticating users to Keycloak
ahus1
0
130
AWS アーキテクチャ作図入門/aws-architecture-diagram-101
ma2shita
30
11k
SalesforceArchitectGroupOsaka#20_CNX'25_Report
atomica7sei
0
190
Featured
See All Featured
Embracing the Ebb and Flow
colly
86
4.7k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
3.9k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
138
34k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Agile that works and the tools we love
rasmusluckow
329
21k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
48
2.8k
Measuring & Analyzing Core Web Vitals
bluesmoon
7
490
How to Think Like a Performance Engineer
csswizardry
24
1.7k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
130
19k
Scaling GitHub
holman
459
140k
Transcript
1 Billing the cloud Real world stream processing
2 . 1 @pyr Co-Founder, CTO at Exoscale Open source
developer
3 . 1 Tonight Problem domain Scaling methodologies Our approach
None
4 . 1
5 . 1
6 . 1 7 . 1 Infrastructure isn't free!
8 . 1 Business Model Provide cloud infrastructure ??? Pro
t!
None
9 . 1
10 . 1 11 . 1 10000 mile high view
None
12 . 1 Quantities Resources
13 . 1 14 . 1 Quantities 10 megabytes have
been sent from 159.100.251.251 over the last minute
15 . 1 Resources Account geneva-jug started instance foo with
pro le large today at 12:00 Account geneva-jug stopped instance foo today at 12:15
16 . 1 A bit closer to reality {:type :usage
:entity :vm :action :create :time #inst "2016-12-12T15:48:32.000-00:00" :template "ubuntu-16.04" :source :cloudstack :account "geneva-jug" :uuid "7a070a3d-66ff-4658-ab08-fe3cecd7c70f" :version 1 :offering "medium"}
17 . 1 A bit closer to reality message IPMeasure
{ /* Versioning */ required uint32 header = 1; required uint32 saddr = 2; required uint64 bytes = 3; /* Validity */ required uint64 start = 4; required uint64 end = 5; }
18 . 1 Theory
19 . 1 Quantities are simple
None
20 . 1 21 . 1 Resources are harder
None
22 . 1 23 . 1 This is per-account
None
24 . 1 25 . 1 Solving for all events
resources = {} metering = [] def usage_metering(): for event in fetch_all_events(): uuid = event.uuid() time = event.time() if event.action() == 'start': resources[uuid] = time else: timespan = duration(resources[uuid], time) usage = Usage(uuid, timespan) metering.append(usage) return metering
26 . 1 Practical matters This is a never-ending process
Minute precision billing Only apply once an hour Avoid over billing at all cost Avoid under billing (we need to eat!)
27 . 1 Practical matters Keep a small operational footprint
28 . 1 A naive approach
32 * * * * usage-metering >/dev/null 2>&1
29 . 1
30 . 1
31 . 1 32 . 1 Advantages
Low operational overhead Simple functional boundaries Easy to test
33 . 1 34 . 1 Drawbacks High pressure on
SQL server Hard to avoid overlapping jobs Overlaps result in longer metering intervals
You are in a room full of overlapping cron jobs.
You can hear the screams of a dying MySQL server. An Oracle vendor is here. To the West, a door is marked "Map/Reduce" To the East, a door is marked "Streaming"
35 . 1 36 . 1 > Talk to Oracle
You have been eaten by a grue.
37 . 1 38 . 1 > Go West
None
39 . 1 Conceptually simple Spreads easily Data-locality aware processing
40 . 1 ETL High latency High operational overhead
41 . 1
42 . 1 43 . 1 > Go East
None
44 . 1 Continuous computation on an unbounded stream
45 . 1 Each event processed as it comes in
Very low latency A never ending reduce
46 . 1 (reductions + [1 2 3 4]) ;;
=> (1 3 6 10)
47 . 1 Conceptually harder Where do we store intermediate
results? How does data ow between computation steps?
48 . 1
49 . 1 50 . 1 Deciding factors
51 . 1 Our shopping list
Operational simplicity Integration through our whole stack Going beyond billing
Room to grow
52 . 1 53 . 1 Operational simplicity Experience matters
Spark and Storm are intimidating Hbase & Hive discarded
54 . 1 Integration HDFS would require simple integration Spark
usually goes hand in hand with Cassandra Storm tends to prefer Kafka
55 . 1 Room to grow A ton of logs
A ton of metrics
56 . 1 Thursday confessions Previously knew Kafka
None
57 . 1
58 . 1 Publish & Subscribe Processing Store
59 . 1 60 . 1 Publish & Subscribe Messages
are produced to topics Topics have a prede ned number of partitions Messages have a key which determines its partition
Consumers get assigned a set of partitions Consumers store their
last consumed offset Brokers own partitions, handle replication
61 . 1
62 . 1 Stable consumer topology Memory desaggregation Can rely
on in-memory storage
63 . 1 64 . 1 Stream expiry
None
65 . 1
66 . 1
67 . 1
68 . 1 69 . 1 Problem solved?
Process crashes Undelivered message? Avoiding double billing
70 . 1 71 . 1 Process crashes Triggers a
rebalance Loss of in-memory cache No initial state!
72 . 1 Reconciliation Snapshot of full inventory Converges stored
resource state if necessary Handles failed deliveries as well
73 . 1 Avoiding double billing Reconciler acts as logical
clock When supplying usage, attach a unique transaction ID Reject multiple transaction attempts on a single ID
74 . 1 Looking back Things stay simple (roughly 600
LoC) Room to grow Stable and resilient DNS, Logs, Metrics, Event Sourcing
75 . 1 What about batch Streaming doesn't work for
everything Sometimes throughput matters more than latency Building models in batch, applying with stream processing
76 . 1 Questions? Thanks!