Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Cassandra for Pythonistas
Search
Sébastien Béal
September 14, 2013
Programming
1
82
Cassandra for Pythonistas
Talk given at PyCon APAC 2013 on Cassandra drivers for Python with a focus on cassandra-driver.
Sébastien Béal
September 14, 2013
Tweet
Share
Other Decks in Programming
See All in Programming
humanlayerのブログから学ぶ、良いCLAUDE.mdの書き方
tsukamoto1783
0
190
CSC307 Lecture 06
javiergs
PRO
0
680
Patterns of Patterns
denyspoltorak
0
1.4k
AIエージェントのキホンから学ぶ「エージェンティックコーディング」実践入門
masahiro_nishimi
5
420
HTTPプロトコル正しく理解していますか? 〜かわいい猫と共に学ぼう。ฅ^•ω•^ฅ ニャ〜
hekuchan
2
680
15年続くIoTサービスのSREエンジニアが挑む分散トレーシング導入
melonps
2
190
例外処理とどう使い分ける?Result型を使ったエラー設計 #burikaigi
kajitack
16
6k
高速開発のためのコード整理術
sutetotanuki
1
390
AI Agent の開発と運用を支える Durable Execution #AgentsInProd
izumin5210
7
2.3k
0→1 フロントエンド開発 Tips🚀 #レバテックMeetup
bengo4com
0
560
20260127_試行錯誤の結晶を1冊に。著者が解説 先輩データサイエンティストからの指南書 / author's_commentary_ds_instructions_guide
nash_efp
1
940
インターン生でもAuth0で認証基盤刷新が出来るのか
taku271
0
190
Featured
See All Featured
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
231
22k
Faster Mobile Websites
deanohume
310
31k
HU Berlin: Industrial-Strength Natural Language Processing with spaCy and Prodigy
inesmontani
PRO
0
210
First, design no harm
axbom
PRO
2
1.1k
Sam Torres - BigQuery for SEOs
techseoconnect
PRO
0
180
Mind Mapping
helmedeiros
PRO
0
80
We Are The Robots
honzajavorek
0
160
Leveraging Curiosity to Care for An Aging Population
cassininazir
1
160
Efficient Content Optimization with Google Search Console & Apps Script
katarinadahlin
PRO
0
320
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
160
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.7k
Digital Projects Gone Horribly Wrong (And the UX Pros Who Still Save the Day) - Dean Schuster
uxyall
0
330
Transcript
Cassandra for Pythonistas Sébastien Béal PyCon APAC 2013 09/14/2013
Cassandra for Pythonistas Humans ...or not
whoami locarise sebastibe @gmail.com CEO and Co-Founder @ in Tokyo
2009-2012
Why Cassandra? Connect all the things!
Distributed column-based key- value store (schema optional) Released 2.0 on
September 3rd BigTable Dynamo 2009
Architecture Cluster Node Seed Seed Ring Gossip Snitch
Other Features • Partitioner • Data replication: ‣ Simple Strategy
(1 datacenter) ‣ Network Topology Strategy • Compaction
Data Model keyspace column family column family row row row
column column column row row row row super column super column super column super column column column column column
Data Model column family = {row key: {column name: value}
} column family = {row key: {super column name: {column name: value} } }
Composite column family = {(key1, key2): {(name1, name2): value} }
composite key composite column name
Communication • Thrift • Cassandra Query Language (CQL) • CQL
2 • CQL 3 (Cassandra 1.2.x) • CQL 3.1 (Cassandra 2.0+)
None
Cassandra & Python
Python Packages • Pycassa (Thrift) • Telephus (Thrift, twisted) •
Silverberg (CQL, twisted) • cassandra-dbapi2 (CQL, PEP249) • cassandra-driver (CQL3, libev)
Python 3 http://python3wos.appspot.com/
cassandra driver • Released in August 2013 • Designed for
CQL • Replacement for Pycassa Still in Beta!
CQL • “Denormalized SQL” ‣ No joins ‣ No sub-queries
‣ No aggregation ‣ Limited ORDER BY
Keyspace from cassandra.cluster import Cluster cluster = Cluster() session =
cluster.connect() session.execute("CREATE KEYSPACE Keyspace WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 1};") session.set_keyspace("Keyspace")
Column Family session.execute("CREATE TABLE users (" "username varchar," "gender varchar,"
"session_token varchar," "birth_year bigint," "PRIMARY KEY (user_name));")
Prepared Statement query = "INSERT INTO users (username, gender, birth_year)
VALUES (?, ?, ?)" prepared = session.prepare(query) session.execute(prepared.bind(('seb', 'M', 1984)))
Prepared Statement from cassandra.query import ValueSequence users = ('alice', 'bob',
'seb') query = "SELECT * FROM users WHERE user_id IN ?" session.execute(query, parameters=[ValueSequence(users)])
Decoder session.execute("SELECT * FROM users") # [Row(username=u'seb', birth_year=1984, gender=u'M', session_token=None)]
from cassandra.decoder import ordered_dict_factory session.row_factory = ordered_dict_factory session.execute("SELECT * FROM users") # [OrderedDict([(u'user_name', u'seb'), ( u'birth_year', 1984), (u'gender', u'M'), (u'session_token', None)])]
Async Calls future = session.execute_async("SELECT * FROM users") def print_results(results):
for row in results: print "Results: %s" % row def print_error(exc): print "Operation failed: %s" % exc future.add_callbacks(print_results, print_error) # Results: Row(user_name=u'seb', birth_year=1984, gender=u'M', session_token=None)
Pluggable Async from cassandra.io.libevreactor import LibevConnection cluster.connection_class = LibevConnection session
= cluster.connect()
Lessons Learned • CQL vs Thrift / C* vocabulary •
Row size limit: row sharding • Opscenter for supervising
Time Series Data CREATE TABLE temperature ( sensor_id varchar, ts
timestamp, temperature float, PRIMARY KEY (sensor_id, ts)); compound primary key (partition key, clustering key)
Time Series Data CREATE TABLE temperature_by_day ( sensor_id varchar, date
text, ts timestamp, temperature float, PRIMARY KEY ((sensor_id, date), ts) ) WITH CLUSTERING ORDER BY (ts DESC); reverse order composite partition key
Questions?