Cassandra for Pythonistas
Sébastien Béal
September 14, 2013
Talk given at PyCon APAC 2013 on Cassandra drivers for Python with a focus on cassandra-driver.
Transcript
Cassandra for Pythonistas Sébastien Béal PyCon APAC 2013 09/14/2013
Cassandra for Pythonistas Humans ...or not
whoami: sebastibe@gmail.com, CEO and Co-Founder @ locarise in Tokyo
2009-2012
Why Cassandra? Connect all the things!
Distributed column-based key-value store (schema optional)
Released 2.0 on September 3rd
BigTable, Dynamo, 2009
Architecture: Cluster, Node, Seed, Ring, Gossip, Snitch
Other Features
• Partitioner
• Data replication:
  ‣ Simple Strategy (1 datacenter)
  ‣ Network Topology Strategy
• Compaction
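As a rough illustration of how a partitioner and Simple Strategy replication fit together, here is a toy sketch. It assumes one token per node and an MD5-based token function; `ToyRing`, `token`, and the node names are hypothetical and do not reflect how a real cluster is configured.

```python
import bisect
import hashlib


def token(key):
    """Hash a key to an integer token (toy stand-in for a partitioner)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ToyRing(object):
    """Toy ring where each node owns exactly one token (an assumption)."""

    def __init__(self, node_names):
        # Derive each node's token from its name and order the ring by token.
        self.ring = sorted((token(name), name) for name in node_names)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key, replication_factor=1):
        """Walk clockwise from the key's token and take the next N nodes."""
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.ring)
        count = min(replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


ring = ToyRing(["node1", "node2", "node3"])
owners = ring.replicas("seb", replication_factor=2)  # two distinct nodes
```

Network Topology Strategy additionally takes datacenters and racks into account, which this sketch ignores.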
Data Model (diagram): a keyspace contains column families; a column family contains rows; a row contains columns, or, in a super column family, super columns that each contain columns.
Data Model
column family = {row key: {column name: value}}
column family = {row key: {super column name: {column name: value}}}
Composite
column family = {(key1, key2): {(name1, name2): value}}
(key1, key2) is the composite key; (name1, name2) is the composite column name
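The three shapes above map directly onto nested Python dictionaries. The sketch below is only an analogy: the sample keys and values (`user_addresses`, `readings`, the sensor names) are made up for illustration.

```python
# Standard column family: row key -> {column name: value}
users = {
    "seb": {"gender": "M", "birth_year": 1984},
}

# Super column family: row key -> {super column name: {column name: value}}
# (hypothetical sample data)
user_addresses = {
    "seb": {
        "home": {"city": "Tokyo"},
        "work": {"city": "Tokyo"},
    },
}

# Composite: tuples serve as composite row keys and composite column names
# (hypothetical sample data)
readings = {
    ("sensor-1", "2013-09-14"): {
        ("09:00", "celsius"): 24.5,
    },
}


def get_column(column_family, row_key, column_name):
    """Look up one value by row key and column name."""
    return column_family[row_key][column_name]


birth_year = get_column(users, "seb", "birth_year")
reading = get_column(readings, ("sensor-1", "2013-09-14"), ("09:00", "celsius"))
```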
Communication
• Thrift
• Cassandra Query Language (CQL)
• CQL 2
• CQL 3 (Cassandra 1.2.x)
• CQL 3.1 (Cassandra 2.0+)
Cassandra & Python
Python Packages
• Pycassa (Thrift)
• Telephus (Thrift, twisted)
• Silverberg (CQL, twisted)
• cassandra-dbapi2 (CQL, PEP249)
• cassandra-driver (CQL3, libev)
Python 3 http://python3wos.appspot.com/
cassandra driver
• Released in August 2013
• Designed for CQL
• Replacement for Pycassa
Still in Beta!
CQL
• “Denormalized SQL”
  ‣ No joins
  ‣ No sub-queries
  ‣ No aggregation
  ‣ Limited ORDER BY
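Since the slide lists aggregation as unavailable in CQL, grouping has to happen on the client. A minimal sketch, assuming rows arrive as objects with attribute access (as the driver's default rows do); `count_by` and the sample rows are hypothetical.

```python
from collections import Counter, namedtuple


def count_by(rows, field):
    """Count rows per distinct value of `field`, entirely on the client."""
    return Counter(getattr(row, field) for row in rows)


# Stand-in for rows returned by a query (sample data for illustration only).
Row = namedtuple("Row", ["username", "gender", "birth_year"])
rows = [
    Row("alice", "F", 1990),
    Row("bob", "M", 1988),
    Row("seb", "M", 1984),
]

by_gender = count_by(rows, "gender")
```

This reads every row over the network, so it only suits small result sets; for larger ones the usual approach is to maintain the aggregate at write time.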
Keyspace
from cassandra.cluster import Cluster

cluster = Cluster()
session = cluster.connect()
session.execute("CREATE KEYSPACE Keyspace WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 1};")
session.set_keyspace("Keyspace")
Column Family
session.execute("CREATE TABLE users ("
                "username varchar,"
                "gender varchar,"
                "session_token varchar,"
                "birth_year bigint,"
                "PRIMARY KEY (username));")
Prepared Statement
query = "INSERT INTO users (username, gender, birth_year) VALUES (?, ?, ?)"
prepared = session.prepare(query)
session.execute(prepared.bind(('seb', 'M', 1984)))
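The point of preparing a statement is to parse it once and bind it many times. A minimal sketch of that pattern, assuming `session` behaves like a cassandra-driver Session (`prepare()` and `execute()`); `insert_users` is a hypothetical helper, not part of the driver.

```python
def insert_users(session, users):
    """Prepare the INSERT once, then bind and execute it for each user.

    `users` is an iterable of (username, gender, birth_year) tuples.
    Returns the number of rows written.
    """
    prepared = session.prepare(
        "INSERT INTO users (username, gender, birth_year) VALUES (?, ?, ?)"
    )
    count = 0
    for user in users:
        # bind() attaches one tuple of values to the prepared statement.
        session.execute(prepared.bind(user))
        count += 1
    return count
```

Usage would look like `insert_users(session, [('alice', 'F', 1990), ('seb', 'M', 1984)])`; the values for alice are invented for the example.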
Prepared Statement
from cassandra.query import ValueSequence

users = ('alice', 'bob', 'seb')
# This query is not prepared, so the placeholder is %s rather than ?
query = "SELECT * FROM users WHERE username IN %s"
session.execute(query, parameters=[ValueSequence(users)])
Decoder
session.execute("SELECT * FROM users")
# [Row(username=u'seb', birth_year=1984, gender=u'M', session_token=None)]

# Later driver releases expose the row factories from cassandra.query.
from cassandra.decoder import ordered_dict_factory
session.row_factory = ordered_dict_factory
session.execute("SELECT * FROM users")
# [OrderedDict([(u'username', u'seb'), (u'birth_year', 1984), (u'gender', u'M'), (u'session_token', None)])]
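A row factory is just a callable that receives the column names and the raw rows and returns whatever shape the application wants. The sketch below assumes that `(colnames, rows)` calling convention, which is how the driver's built-in factories are defined; `plain_dict_factory` is a hypothetical name, and the driver already ships an equivalent `dict_factory`.

```python
def plain_dict_factory(colnames, rows):
    """Turn each raw row (a sequence of values) into a regular dict."""
    return [dict(zip(colnames, row)) for row in rows]


# What the factory would produce for one decoded row.
decoded = plain_dict_factory(
    ["username", "birth_year", "gender", "session_token"],
    [("seb", 1984, "M", None)],
)
```

It would be installed the same way as the built-in factories, by assigning it to `session.row_factory`.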
Async Calls
future = session.execute_async("SELECT * FROM users")

def print_results(results):
    for row in results:
        print("Results: %s" % row)

def print_error(exc):
    print("Operation failed: %s" % exc)

future.add_callbacks(print_results, print_error)
# Results: Row(username=u'seb', birth_year=1984, gender=u'M', session_token=None)
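Besides callbacks, the future returned by `execute_async()` can be waited on with `result()`, which makes it easy to start several queries before blocking on any of them. A minimal sketch, assuming `session` behaves like a cassandra-driver Session; `fetch_all` is a hypothetical helper.

```python
def fetch_all(session, queries):
    """Start every query first, then wait for the results in order.

    All requests are in flight before the first result() call blocks,
    so the total wait is roughly that of the slowest query.
    """
    futures = [session.execute_async(query) for query in queries]
    # result() blocks until the response arrives and re-raises any error.
    return [future.result() for future in futures]
```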
Pluggable Async
from cassandra.io.libevreactor import LibevConnection

cluster.connection_class = LibevConnection
session = cluster.connect()
Lessons Learned
• CQL vs Thrift / C* vocabulary
• Row size limit: row sharding
• Opscenter for supervising
Time Series Data
CREATE TABLE temperature (
    sensor_id varchar,
    ts timestamp,
    temperature float,
    PRIMARY KEY (sensor_id, ts)
);
compound primary key: (partition key, clustering key)
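One way to picture a compound primary key: the partition key picks a bucket, and the clustering key orders the rows inside it. The toy model below is an in-memory analogy only; `ToyTable` is hypothetical and the integer timestamps are sample data.

```python
from collections import defaultdict


class ToyTable(object):
    """In-memory analogy for PRIMARY KEY (sensor_id, ts)."""

    def __init__(self):
        # partition key (sensor_id) -> {clustering key (ts): temperature}
        self.partitions = defaultdict(dict)

    def insert(self, sensor_id, ts, temperature):
        # Writing the same (sensor_id, ts) again overwrites the old value.
        self.partitions[sensor_id][ts] = temperature

    def slice(self, sensor_id, start, end):
        """Rows of one partition with start <= ts < end, ordered by ts."""
        rows = self.partitions.get(sensor_id, {})
        return [(ts, rows[ts]) for ts in sorted(rows) if start <= ts < end]


table = ToyTable()
table.insert("sensor-1", 3, 20.0)
table.insert("sensor-1", 1, 19.0)
table.insert("sensor-1", 2, 19.5)
table.insert("sensor-2", 1, 5.0)
window = table.slice("sensor-1", 1, 3)
```

A range query such as `WHERE sensor_id = ... AND ts >= ... AND ts < ...` corresponds to `slice()` here: it touches a single partition and reads rows in clustering order.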
Time Series Data
CREATE TABLE temperature_by_day (
    sensor_id varchar,
    date text,
    ts timestamp,
    temperature float,
    PRIMARY KEY ((sensor_id, date), ts)
) WITH CLUSTERING ORDER BY (ts DESC);
(sensor_id, date) is a composite partition key; CLUSTERING ORDER BY (ts DESC) stores each partition in reverse time order
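With the per-day table, the client has to derive the `date` bucket from each reading's timestamp before writing. A minimal sketch, assuming the `date` text column holds `YYYY-MM-DD` strings (the slide does not specify the format); `partition_key` and `insert_params` are hypothetical helpers.

```python
from datetime import datetime


def partition_key(sensor_id, ts):
    """Return the (sensor_id, date) composite partition key for a reading."""
    # Assumed bucket format: one partition per sensor per calendar day.
    return (sensor_id, ts.strftime("%Y-%m-%d"))


def insert_params(sensor_id, ts, temperature):
    """Values in column order: sensor_id, date, ts, temperature."""
    sensor, day = partition_key(sensor_id, ts)
    return (sensor, day, ts, temperature)


params = insert_params("sensor-1", datetime(2013, 9, 14, 10, 30), 24.5)
```

Because every day starts a new partition, no single row grows without bound, which is the row sharding mentioned under Lessons Learned.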
Questions?