Billing the Cloud
Pierre-Yves Ritschard
December 15, 2016
This talk describes how Exoscale approaches usage metering and billing with Apache Kafka
Transcript
Billing the cloud
  Real world stream processing
@pyr
  Co-Founder, CTO at Exoscale
  Open source developer
Tonight
  Problem domain
  Scaling methodologies
  Our approach
Infrastructure isn't free!
Business Model
  Provide cloud infrastructure
  ???
  Profit!
10000 mile high view
Quantities
Resources
Quantities
  10 megabytes have been sent from 159.100.251.251 over the last minute
Resources
  Account geneva-jug started instance foo with profile large today at 12:00
  Account geneva-jug stopped instance foo today at 12:15
A bit closer to reality
  {:type     :usage
   :entity   :vm
   :action   :create
   :time     #inst "2016-12-12T15:48:32.000-00:00"
   :template "ubuntu-16.04"
   :source   :cloudstack
   :account  "geneva-jug"
   :uuid     "7a070a3d-66ff-4658-ab08-fe3cecd7c70f"
   :version  1
   :offering "medium"}
A bit closer to reality
  message IPMeasure {
    /* Versioning */
    required uint32 header = 1;
    required uint32 saddr  = 2;
    required uint64 bytes  = 3;
    /* Validity */
    required uint64 start  = 4;
    required uint64 end    = 5;
  }
Theory
Quantities are simple
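Why quantities are simple can be shown with a short sketch: each measure already carries an amount, so metering reduces to a sum per account. The `IPMeasure` tuple below mirrors the fields of the message on the slide; the `owner_of` lookup from source address to account and the assumption that `start` and `end` are epoch seconds are illustrative, not taken from the talk.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class IPMeasure(NamedTuple):
    """Python mirror of the IPMeasure message; field names follow the slide."""
    header: int
    saddr: int   # source address as an unsigned 32-bit integer
    bytes: int   # amount transferred during the validity window
    start: int   # validity window start (unit assumed: epoch seconds)
    end: int     # validity window end


def total_bytes(measures: Iterable[IPMeasure],
                owner_of: Dict[int, str]) -> Dict[str, int]:
    """Sum transferred bytes per account.

    owner_of maps a source address to an account name (hypothetical lookup).
    """
    totals = defaultdict(int)
    for m in measures:
        totals[owner_of[m.saddr]] += m.bytes
    return dict(totals)


# Two one-minute measures for the address used on the slide.
addr = int(ipaddress.IPv4Address("159.100.251.251"))
measures = [
    IPMeasure(1, addr, 10_000_000, 0, 60),
    IPMeasure(1, addr, 5_000_000, 60, 120),
]
print(total_bytes(measures, {addr: "geneva-jug"}))
```

No state has to be kept between measures, which is what separates quantities from resources.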
Resources are harder
This is per-account
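Solving for all events
  from collections import namedtuple

  Usage = namedtuple('Usage', ['uuid', 'timespan'])

  def usage_metering(events):
      """Pair each start event with its stop and emit one Usage per pair.

      events: an iterable in time order, e.g. the result of fetch_all_events().
      """
      resources = {}  # uuid -> start time of resources currently running
      metering = []
      for event in events:
          uuid = event.uuid()
          time = event.time()
          if event.action() == 'start':
              resources[uuid] = time
          elif uuid in resources:
              # pop() so a repeated stop event is not metered twice
              timespan = time - resources.pop(uuid)
              metering.append(Usage(uuid, timespan))
      return metering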
Practical matters
  This is a never-ending process
  Minute precision billing
  Only apply once an hour
  Avoid over billing at all cost
  Avoid under billing (we need to eat!)
Practical matters
  Keep a small operational footprint
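The constraints above (minute precision, applied once an hour, never over billing) can be sketched as a function that clips a resource's lifetime to a one-hour window and counts whole minutes. The function name, the fixed one-hour window, and the choice to round down are assumptions made for illustration; the talk does not say how Exoscale rounds.

```python
from datetime import datetime, timedelta
from typing import Optional


def billable_minutes(started: datetime, stopped: Optional[datetime],
                     window_start: datetime) -> int:
    """Whole minutes a resource ran inside the one-hour window at window_start.

    stopped is None while the resource is still running.
    """
    window_end = window_start + timedelta(hours=1)
    begin = max(started, window_start)
    end = window_end if stopped is None else min(stopped, window_end)
    if end <= begin:
        return 0
    # Round down: a partial minute is never charged (assumed policy, chosen
    # here to err on the side of not over billing).
    return int((end - begin).total_seconds() // 60)


noon = datetime(2016, 12, 12, 12, 0)
# The instance from the earlier slide: started at 12:00, stopped at 12:15.
print(billable_minutes(noon, noon + timedelta(minutes=15), noon))  # 15
```

A resource that is still running is simply charged up to the end of the window, and the next hourly run picks up from there.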
A naive approach
  32 * * * * usage-metering >/dev/null 2>&1
Advantages
  Low operational overhead
  Simple functional boundaries
  Easy to test
Drawbacks
  High pressure on SQL server
  Hard to avoid overlapping jobs
  Overlaps result in longer metering intervals
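One common way to keep cron runs from overlapping is an exclusive, non-blocking file lock taken at startup. This is not from the talk, just a minimal sketch of the usual mitigation on a Unix system; the lock path and the `run_once` helper are hypothetical.

```python
import fcntl
import os
import tempfile

# Hypothetical lock location; any path writable by the job would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "usage-metering.lock")


def try_lock(path: str = LOCK_PATH):
    """Return an open, locked file object, or None if another run holds the lock."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def run_once(job) -> bool:
    """Run job() unless a previous run is still active. Returns True if it ran."""
    handle = try_lock()
    if handle is None:
        return False
    try:
        job()
        return True
    finally:
        handle.close()  # closing the file releases the lock
```

The lock only prevents two jobs from running side by side. A skipped run still means the next one covers a longer metering interval, which is the last drawback listed above.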
You are in a room full of overlapping cron jobs.
You can hear the screams of a dying MySQL server.
An Oracle vendor is here.
To the West, a door is marked "Map/Reduce"
To the East, a door is marked "Streaming"
> Talk to Oracle
You have been eaten by a grue.
> Go West
Conceptually simple
Spreads easily
Data-locality aware processing
ETL
High latency
High operational overhead
> Go East
Continuous computation on an unbounded stream
Each event processed as it comes in
Very low latency
A never ending reduce
  (reductions + [1 2 3 4])
  ;; => (1 3 6 10)
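For readers more at home in Python, `itertools.accumulate` behaves like Clojure's `reductions`: it yields every intermediate result of the reduce instead of only the last one, and it is lazy, so it also works on a stream that never ends. The `running_totals` name is just for illustration.

```python
from itertools import accumulate, count, islice
from operator import add


def running_totals(stream):
    """Lazily yield the running total after each element of the stream."""
    return accumulate(stream, add)


# Same input as the slide.
print(list(running_totals([1, 2, 3, 4])))  # [1, 3, 6, 10]

# count(1) never ends; only the results that are asked for get computed.
first_four = list(islice(running_totals(count(1)), 4))
```

Each yielded value is the state of the computation after one more event, which is the "never ending reduce" in miniature.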
Conceptually harder
Where do we store intermediate results?
How does data flow between computation steps?
Deciding factors
Our shopping list
  Operational simplicity
  Integration through our whole stack
  Going beyond billing
  Room to grow
Operational simplicity
  Experience matters
  Spark and Storm are intimidating
  HBase & Hive discarded
Integration
  HDFS would require simple integration
  Spark usually goes hand in hand with Cassandra
  Storm tends to prefer Kafka
Room to grow
  A ton of logs
  A ton of metrics
Thursday confessions
  Previously knew Kafka
Publish & Subscribe
Processing
Store
Publish & Subscribe
  Messages are produced to topics
  Topics have a predefined number of partitions
  Messages have a key which determines their partition
  Consumers get assigned a set of partitions
  Consumers store their last consumed offset
  Brokers own partitions, handle replication
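The key-to-partition rule above can be illustrated with a few lines of Python. This is a simplified model, not Kafka's code: Kafka's default Java partitioner hashes the key bytes with murmur2, whereas CRC32 is used here only because it ships with the standard library, and the round-robin `assign` function is a stand-in for a real partition assignor.

```python
import zlib
from typing import Dict, List


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition (CRC32 stands in for Kafka's own hash)."""
    return zlib.crc32(key) % num_partitions


def assign(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over consumers round-robin (illustrative assignor)."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


# Every event keyed by the same account hashes to the same partition ...
p1 = partition_for(b"geneva-jug", 8)
p2 = partition_for(b"geneva-jug", 8)
# ... and each partition belongs to exactly one consumer of the group.
print(assign(4, ["consumer-a", "consumer-b"]))
```

Because the number of partitions is fixed up front, a given key keeps landing on the same partition, and therefore on the same consumer as long as the assignment is stable.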
Stable consumer topology
Memory disaggregation
Can rely on in-memory storage
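With a stable topology, every event for a given resource reaches the same consumer, so that consumer can hold the resource's state in a plain in-memory map. The class below is a minimal sketch of that idea; `ResourceMeter` and its method names are hypothetical and the event is reduced to three fields.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple


class ResourceMeter:
    """Keeps start times in memory for the resources this consumer sees."""

    def __init__(self) -> None:
        self.running: Dict[str, datetime] = {}  # uuid -> start time

    def handle(self, uuid: str, action: str,
               time: datetime) -> Optional[Tuple[str, timedelta]]:
        """Process one event; return (uuid, timespan) when a resource stops."""
        if action == "start":
            self.running.setdefault(uuid, time)  # keep the first start seen
            return None
        started = self.running.pop(uuid, None)
        if started is None:
            return None  # stop without a known start: nothing to meter here
        return uuid, time - started


meter = ResourceMeter()
noon = datetime(2016, 12, 12, 12, 0)
meter.handle("foo", "start", noon)
usage = meter.handle("foo", "stop", noon + timedelta(minutes=15))
```

The map lives only in the process, so it is fast and needs no external store, but it is also gone if the process dies.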
Stream expiry
Problem solved?
  Process crashes
  Undelivered message?
  Avoiding double billing
Process crashes
  Triggers a rebalance
  Loss of in-memory cache
  No initial state!
Reconciliation
  Snapshot of full inventory
  Converges stored resource state if necessary
  Handles failed deliveries as well
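Reconciliation can be pictured as a diff between the inventory snapshot and whatever state the consumer currently holds. The sketch below emits synthetic events for each difference; the `reconcile` signature and the choice to timestamp corrections with the snapshot time are assumptions, not the talk's implementation.

```python
from datetime import datetime
from typing import Dict, List, Set, Tuple

Event = Tuple[str, str, datetime]  # (uuid, action, time)


def reconcile(inventory: Set[str], running: Dict[str, datetime],
              now: datetime) -> List[Event]:
    """Return synthetic events that make `running` match the inventory snapshot."""
    known = set(running)
    corrections: List[Event] = []
    # In the inventory but unknown to us: a start event was lost, or the
    # in-memory state was wiped by a crash.
    for uuid in sorted(inventory - known):
        corrections.append((uuid, "start", now))
    # Known to us but gone from the inventory: a stop event was lost.
    for uuid in sorted(known - inventory):
        corrections.append((uuid, "stop", now))
    return corrections


now = datetime(2016, 12, 12, 13, 0)
state = {"b": now, "c": now}
print(reconcile({"a", "b"}, state, now))
```

Starting a recovered resource at the snapshot time can under-count the period before the snapshot, but it never bills time that was not observed. When the state already matches the inventory the function returns nothing, so running it repeatedly is harmless.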
Avoiding double billing
  Reconciler acts as logical clock
  When supplying usage, attach a unique transaction ID
  Reject multiple transaction attempts on a single ID
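Rejecting repeated transaction IDs makes the billing write idempotent: a retry after a crash or a redelivered message cannot charge twice. A minimal in-memory sketch follows; the `Ledger` class and the `uuid:tick` ID format are hypothetical, and a real system would enforce uniqueness in durable storage.

```python
from typing import Dict, Tuple


class Ledger:
    """Records usage at most once per transaction ID."""

    def __init__(self) -> None:
        self.applied: Dict[str, Tuple[str, int]] = {}  # txid -> (account, minutes)

    def apply(self, txid: str, account: str, minutes: int) -> bool:
        """Apply usage; return False if this transaction ID was already seen."""
        if txid in self.applied:
            return False
        self.applied[txid] = (account, minutes)
        return True

    def total(self, account: str) -> int:
        """Total minutes billed to an account."""
        return sum(m for a, m in self.applied.values() if a == account)


ledger = Ledger()
# Hypothetical ID scheme: resource uuid plus the reconciler tick it belongs to.
first = ledger.apply("foo:42", "geneva-jug", 15)
retry = ledger.apply("foo:42", "geneva-jug", 15)  # same ID: rejected
```

Deriving the ID from the resource and a logical clock tick means a retried computation reproduces the same ID and is rejected.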
Looking back
  Things stay simple (roughly 600 LoC)
  Room to grow
  Stable and resilient
  DNS, Logs, Metrics, Event Sourcing
What about batch
  Streaming doesn't work for everything
  Sometimes throughput matters more than latency
  Building models in batch, applying with stream processing
Questions?
Thanks!