Cassandra for Pythonistas
Sébastien Béal
September 14, 2013
Talk given at PyCon APAC 2013 on Cassandra drivers for Python with a focus on cassandra-driver.
Transcript
Cassandra for Pythonistas Sébastien Béal PyCon APAC 2013 09/14/2013
Cassandra for Pythonistas Humans ...or not
whoami: sebastibe @gmail.com, CEO and Co-Founder @ locarise in Tokyo
2009-2012
Why Cassandra? Connect all the things!
Distributed column-based key-value store (schema optional)
Released 2.0 on September 3rd
BigTable Dynamo 2009
Architecture (diagram labels): Cluster, Node, Seed, Ring, Gossip, Snitch
Other Features
• Partitioner
• Data replication:
  ‣ Simple Strategy (1 datacenter)
  ‣ Network Topology Strategy
• Compaction
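To make the partitioner and replication bullets more concrete, here is a toy sketch of a token ring with Simple Strategy style replica placement. Everything in it is an assumption made for illustration: the `Ring` class, the node names, the MD5-based `token` function, and the one-token-per-node layout are not taken from the talk and are not Cassandra's actual implementation.

```python
import bisect
import hashlib


def token(key):
    """Map a key to an integer token (toy partitioner based on MD5)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring(object):
    """Toy token ring: each node owns one token, sorted around the ring."""

    def __init__(self, nodes):
        # Hypothetical placement: a node's token is the hash of its name.
        self.tokens = sorted((token(name), name) for name in nodes)

    def replicas(self, row_key, replication_factor=1):
        """Return the nodes holding row_key, walking the ring clockwise."""
        positions = [t for t, _ in self.tokens]
        # First node whose token is >= the key's token, wrapping around.
        start = bisect.bisect_left(positions, token(row_key)) % len(self.tokens)
        count = min(replication_factor, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1]
                for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("seb", replication_factor=3)
print(owners)
```

The first node in `owners` plays the role of the primary replica; the following ones are simply the next nodes on the ring, which is roughly what a single-datacenter strategy does. A topology-aware strategy would additionally consult rack and datacenter information.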
Data Model (diagram): a keyspace contains column families; a column family contains rows; a row contains columns, or super columns that in turn contain columns.
Data Model
    column family = {row key: {column name: value} }
    column family = {row key: {super column name: {column name: value} } }
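The nested-mapping notation above translates almost literally into Python dictionaries. The following is a minimal sketch with made-up sample data; the names `users`, `user_events`, and `get_column` are hypothetical and exist only for this illustration.

```python
# A column family modelled as a dict of rows, each row a dict of columns.
users = {
    "seb": {"gender": "M", "birth_year": 1984},
    "alice": {"gender": "F"},  # rows need not share the same columns
}

# A super column family adds one more level of nesting.
user_events = {
    "seb": {
        "login": {"count": 3, "last_ip": "10.0.0.1"},
    },
}


def get_column(column_family, row_key, column_name, default=None):
    """Look up one column value by row key and column name."""
    return column_family.get(row_key, {}).get(column_name, default)


print(get_column(users, "seb", "birth_year"))
print(get_column(users, "alice", "birth_year"))
print(user_events["seb"]["login"]["count"])
```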
Composite
    column family = {(key1, key2): {(name1, name2): value} }
(key1, key2): composite key; (name1, name2): composite column name
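In Python terms, composite keys and composite column names behave like tuples used as dictionary keys. A small sketch under that analogy, with hypothetical sample data and a hypothetical `get_value` helper:

```python
# Rows keyed by a composite key, columns named by a composite name.
readings = {
    ("sensor-1", "2013-09-14"): {
        ("temperature", "celsius"): 24.5,
        ("humidity", "percent"): 61.0,
    },
}


def get_value(column_family, key1, key2, name1, name2):
    """Fetch a value addressed by composite key and composite column name."""
    return column_family[(key1, key2)][(name1, name2)]


print(get_value(readings, "sensor-1", "2013-09-14", "temperature", "celsius"))
```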
Communication
• Thrift
• Cassandra Query Language (CQL)
• CQL 2
• CQL 3 (Cassandra 1.2.x)
• CQL 3.1 (Cassandra 2.0+)
Cassandra & Python
Python Packages
• Pycassa (Thrift)
• Telephus (Thrift, twisted)
• Silverberg (CQL, twisted)
• cassandra-dbapi2 (CQL, PEP249)
• cassandra-driver (CQL3, libev)
Python 3 http://python3wos.appspot.com/
cassandra driver
• Released in August 2013
• Designed for CQL
• Replacement for Pycassa
Still in Beta!
CQL
• “Denormalized SQL”
  ‣ No joins
  ‣ No sub-queries
  ‣ No aggregation
  ‣ Limited ORDER BY
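One practical consequence of these limits is that grouping and arbitrary sorting tend to move into application code. The sketch below shows the idea on stand-in rows; the `Row` namedtuple, the sample data, and the helpers `count_by` and `oldest_first` are assumptions for illustration and are not part of any driver.

```python
from collections import Counter, namedtuple

# Stand-in for the namedtuple-like rows a query returns (sample data).
Row = namedtuple("Row", ["username", "gender", "birth_year"])
rows = [Row("seb", "M", 1984), Row("alice", "F", 1990), Row("bob", "M", 1979)]


def count_by(rows, column):
    """Client-side GROUP BY ... COUNT over an iterable of rows."""
    return Counter(getattr(row, column) for row in rows)


def oldest_first(rows):
    """Client-side ORDER BY on a column that is not a clustering key."""
    return sorted(rows, key=lambda row: row.birth_year)


print(count_by(rows, "gender"))
print([row.username for row in oldest_first(rows)])
```

This only works comfortably for result sets that fit in memory; for larger data the usual approach is to denormalize and maintain a table shaped for the query.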
Keyspace
    from cassandra.cluster import Cluster

    cluster = Cluster()
    session = cluster.connect()
    session.execute("CREATE KEYSPACE Keyspace WITH REPLICATION = "
                    "{'class' : 'SimpleStrategy', 'replication_factor': 1};")
    session.set_keyspace("Keyspace")
Column Family
    session.execute("CREATE TABLE users ("
                    "username varchar,"
                    "gender varchar,"
                    "session_token varchar,"
                    "birth_year bigint,"
                    "PRIMARY KEY (username));")
Prepared Statement
    query = "INSERT INTO users (username, gender, birth_year) VALUES (?, ?, ?)"
    prepared = session.prepare(query)
    session.execute(prepared.bind(('seb', 'M', 1984)))
Prepared Statement
    from cassandra.query import ValueSequence

    users = ('alice', 'bob', 'seb')
    # Non-prepared statements use %s placeholders; ? is for prepared statements.
    query = "SELECT * FROM users WHERE username IN %s"
    session.execute(query, parameters=[ValueSequence(users)])
Decoder
    session.execute("SELECT * FROM users")
    # [Row(username=u'seb', birth_year=1984, gender=u'M', session_token=None)]

    from cassandra.decoder import ordered_dict_factory
    session.row_factory = ordered_dict_factory
    session.execute("SELECT * FROM users")
    # [OrderedDict([(u'username', u'seb'), (u'birth_year', 1984), (u'gender', u'M'), (u'session_token', None)])]
Async Calls
    future = session.execute_async("SELECT * FROM users")

    def print_results(results):
        for row in results:
            print "Results: %s" % row

    def print_error(exc):
        print "Operation failed: %s" % exc

    future.add_callbacks(print_results, print_error)
    # Results: Row(username=u'seb', birth_year=1984, gender=u'M', session_token=None)
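Callbacks are one way to consume a future; another common pattern is to start several queries and then block on each result. The helper below is a hedged sketch of that pattern: `fetch_all` is a hypothetical name, and it assumes only that `session.execute_async()` returns an object with a blocking `result()` method, as the driver's response future does.

```python
def fetch_all(session, queries):
    """Run several queries concurrently and return their results in order."""
    # Start every query first so they are all in flight at once...
    futures = [session.execute_async(query) for query in queries]
    # ...then block on each future; result() raises if the query failed.
    return [future.result() for future in futures]
```

With a connected session this could be called as `fetch_all(session, ["SELECT * FROM users"])`; the total wait is then bounded by the slowest query rather than the sum of all of them.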
Pluggable Async
    from cassandra.io.libevreactor import LibevConnection

    cluster.connection_class = LibevConnection
    session = cluster.connect()
Lessons Learned
• CQL vs Thrift / C* vocabulary
• Row size limit: row sharding
• OpsCenter for supervising
Time Series Data
    CREATE TABLE temperature (
        sensor_id varchar,
        ts timestamp,
        temperature float,
        PRIMARY KEY (sensor_id, ts));
PRIMARY KEY (sensor_id, ts): compound primary key (partition key, clustering key)
Time Series Data
    CREATE TABLE temperature_by_day (
        sensor_id varchar,
        date text,
        ts timestamp,
        temperature float,
        PRIMARY KEY ((sensor_id, date), ts)
    ) WITH CLUSTERING ORDER BY (ts DESC);
(sensor_id, date): composite partition key; CLUSTERING ORDER BY (ts DESC): reverse order
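With a per-day partition like this, the application has to compute the `date` bucket when writing and enumerate the buckets to read when querying a time range. A minimal sketch of both steps follows; the helper names and the `YYYY-MM-DD` text format for `date` are assumptions, since the slide only declares the column as `text`.

```python
from datetime import datetime, timedelta


def partition_key(sensor_id, ts):
    """Return the (sensor_id, date) partition key for a reading at ts."""
    return (sensor_id, ts.strftime("%Y-%m-%d"))


def partitions_between(sensor_id, start, end):
    """List every daily partition touched by the range [start, end]."""
    day = start.date()
    keys = []
    while day <= end.date():
        keys.append((sensor_id, day.strftime("%Y-%m-%d")))
        day += timedelta(days=1)
    return keys


keys = partitions_between("sensor-1",
                          datetime(2013, 9, 13, 22, 0),
                          datetime(2013, 9, 15, 1, 30))
print(keys)
```

Each returned key corresponds to one partition, so a range query becomes one `SELECT` per day, with the `ts` bounds applied inside each partition.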
Questions?