Slide 1
Slide 1 text
Python and Relational / Non-relational Databases
Andrew Godwin
Slide 2
Slide 2 text
Introduction: Python for 5 years; Django core developer; data modelling / visualisation
Slide 3
Slide 3 text
"Andrew speaks English like a machine gun speaks bullets." - Reinout van Rees
Slide 4
Slide 4 text
If I speak too fast - tell me!
Slide 5
Slide 5 text
What is a relational database?
Slide 6
Slide 6 text
A relational database is a “collection of relations”
Slide 7
Slide 7 text
It's what a lot of people are used to.
Slide 8
Slide 8 text
Relational Databases: PostgreSQL, MySQL, SQLite
Slide 9
Slide 9 text
Let's pick PostgreSQL (it's a good choice)
Slide 10
Slide 10 text
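Usage

    import psycopg2

    conn = psycopg2.connect(host="localhost", user="postgres")
    cursor = conn.cursor()
    # Pass values as parameters; in SQL, double quotes denote identifiers, not strings.
    cursor.execute("SELECT * FROM users WHERE username = %s;", ("andrew",))
    for row in cursor.fetchall():
        print(row)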
Slide 11
Slide 11 text
You've probably seen all that before.
Slide 12
Slide 12 text
Now, to introduce some non-relational databases
Slide 13
Slide 13 text
Document Databases: MongoDB, CouchDB
Slide 14
Slide 14 text
Key-Value Stores: Redis, Cassandra
Slide 15
Slide 15 text
Message Queues: AMQP, Celery
Slide 16
Slide 16 text
Various Others: graph databases, filesystems, VCSs
Slide 17
Slide 17 text
Redis and MongoDB are two good examples here
Slide 18
Slide 18 text
Redis: Key-value store with strings, lists, sets, channels and atomic operations.
Slide 19
Slide 19 text
Redis Example

    import redis

    conn = redis.Redis(host="localhost")
    print(conn.get("top_value"))
    conn.set("last_user", "andrew")
    conn.incr("num_runs")  # atomic increment
    conn.sadd("users", "andrew")
    conn.sadd("users", "martin")
    for item in conn.smembers("users"):
        print(item)
Slide 20
Slide 20 text
MongoDB: Document store with indexing and a wide range of query filters.
Slide 21
Slide 21 text
MongoDB Example

    import pymongo

    # MongoClient replaces the older pymongo.Connection class.
    conn = pymongo.MongoClient("localhost")
    db = conn["mongo_example"]
    coll = db["users"]
    coll.insert_one({"username": "andrew", "uid": 1000})
    for entry in coll.find({"username": "andrew"}):
        print(entry)
Slide 22
Slide 22 text
These all solve different problems - you can't easily replace one with the other.
Slide 23
Slide 23 text
"When all you have is a hammer, everything looks like a nail" - Abraham Maslow (paraphrased)
Slide 24
Slide 24 text
JOIN - your best friend, and your worst enemy.
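
To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module in place of PostgreSQL. The users and posts tables and their columns are hypothetical, invented for this illustration.

```python
import sqlite3

# In-memory SQLite stands in for a real server; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        title TEXT
    );
    INSERT INTO users VALUES (1, 'andrew'), (2, 'martin');
    INSERT INTO posts VALUES
        (1, 1, 'Hello'), (2, 1, 'Redis notes'), (3, 2, 'Mongo notes');
""")

# One JOIN answers "who wrote which post" while each username is stored once.
joined = conn.execute(
    "SELECT users.username, posts.title "
    "FROM posts JOIN users ON users.id = posts.user_id "
    "ORDER BY posts.id"
).fetchall()
print(joined)
```

The "best friend" half is that each username lives in exactly one row. The "worst enemy" half is that every such read pays for the lookup, which gets expensive as tables and join chains grow.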
Slide 25
Slide 25 text
Denormalising your data speeds up reads, and slows down writes.
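
A small sketch of that trade-off, again with sqlite3 standing in for a real server. The schema is hypothetical: each post carries its own copy of the author's username.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    -- Denormalised: every post stores a copy of the username.
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        username TEXT,
        title TEXT
    );
    INSERT INTO users VALUES (1, 'andrew');
    INSERT INTO posts VALUES
        (1, 1, 'andrew', 'Hello'), (2, 1, 'andrew', 'Redis notes');
""")

# Read: a single-table query, no JOIN needed.
titles = conn.execute("SELECT username, title FROM posts ORDER BY id").fetchall()

# Write: renaming one user touches the users row plus every copy in posts.
with conn:
    conn.execute("UPDATE users SET username = 'andrewg' WHERE id = 1")
    cur = conn.execute("UPDATE posts SET username = 'andrewg' WHERE user_id = 1")
    rows_rewritten = cur.rowcount
print(titles, rows_rewritten)
```

Here one logical change rewrites two extra rows; with thousands of posts per user, the write cost grows accordingly.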
Slide 26
Slide 26 text
Schemaless != Denormalised
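
One way to read this, sketched with plain Python dicts standing in for documents (the field names are made up for illustration): documents can differ in shape and still refer to each other by ID rather than copying data.

```python
# Schemaless: the two user documents do not share the same set of fields.
users = {
    1000: {"username": "andrew"},
    1001: {"username": "martin", "twitter": "@martin"},
}

# Still normalised: posts reference the author by uid instead of embedding
# a copy of the username, so each username is stored exactly once.
posts = [
    {"title": "Hello", "author_uid": 1000},
    {"title": "Mongo notes", "author_uid": 1001, "tags": ["db"]},
]

# Resolving the reference is the application-side equivalent of a JOIN.
authors = [users[post["author_uid"]]["username"] for post in posts]
print(authors)
```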
Slide 27
Slide 27 text
Atomic operations are nice.

    conn.incrby("num_users", 2)
Slide 28
Slide 28 text
But SQL can do some of them.

    UPDATE foo SET bar = bar + 1 WHERE baz;
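
A runnable sketch of the same idea from Python, using the built-in sqlite3 module. The counters table and the increment helper are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO counters VALUES ('num_users', 0)")
conn.commit()

def increment(name, amount=1):
    # The read-modify-write happens inside one UPDATE statement, so there is
    # no gap between a separate SELECT and UPDATE for another writer to hit.
    with conn:
        conn.execute(
            "UPDATE counters SET value = value + ? WHERE name = ?",
            (amount, name),
        )

increment("num_users", 2)
increment("num_users")
value = conn.execute(
    "SELECT value FROM counters WHERE name = 'num_users'"
).fetchone()[0]
print(value)
```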
Slide 29
Slide 29 text
Redis, the data structures server. SETNX, GETSET, EXPIRE and friends
Slide 30
Slide 30 text
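Locks / Semaphores

    import time

    # Lock: SETNX only sets the key if it does not exist yet.
    conn.setnx("lock:foo", time.time() + 3600)

    # Semaphore: take a slot, and give it back if none were left.
    val = conn.decr("sem:foo")
    if val >= 0:
        ...
    else:
        conn.incr("sem:foo")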
Slide 31
Slide 31 text
Queues

    conn.lpush("myqueue", "workitem")
    todo = conn.rpop("myqueue")  # pop from the opposite end for FIFO order

(or publish/subscribe)
Slide 32
Slide 32 text
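Priority Queues

    # redis-py 3.0+ takes a {member: score} mapping; lower scores sort first.
    conn.zadd("myqueue", {"handle-meltdown": 1})
    conn.zadd("myqueue", {"feed-cats": 5})
    todo = conn.zrange("myqueue", 0, 0)  # the single lowest-scored item
    if todo:
        conn.zrem("myqueue", todo[0])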
Slide 33
Slide 33 text
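Lock-free linked lists!

    # Assumes conn was created with decode_responses=True, so old_end is a str.
    new_id = "bgrdsd"
    old_end = conn.getset(":end", new_id)  # atomically swap in the new tail
    conn.set("%s:next" % old_end, new_id)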
Slide 34
Slide 34 text
Performance-wise, the fewer checks and integrity guarantees, the faster it goes.
Slide 35
Slide 35 text
Maturity can sometimes be an issue, but new features can appear rapidly.
Slide 36
Slide 36 text
You can also use databases for the wrong thing - it often only matters "at scale"
Slide 37
Slide 37 text
But how does this all relate to Python?
Slide 38
Slide 38 text
Most databases - even new ones - have good Python bindings
Slide 39
Slide 39 text
Postgres: psycopg2
Redis: redis-py
MongoDB: pymongo
(and more - neo4j, VCSen, relational, etc.)
Slide 40
Slide 40 text
Some databases have Python available inside (Postgres has it as an option)
Slide 41
Slide 41 text
Document databases map really well to Python dicts
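
A minimal sketch of that mapping using only the standard library. JSON stands in here for the BSON that MongoDB uses, and the user document is a hypothetical example.

```python
import json

# A hypothetical user document, written as an ordinary nested dict.
user = {
    "username": "andrew",
    "uid": 1000,
    "languages": ["python", "sql"],
    "profile": {"site": "http://aeracode.org"},
}

# Serialise and load it back: the nested structure survives the round trip,
# with no table layout or object-relational mapping in between.
stored = json.dumps(user)
loaded = json.loads(stored)
print(loaded == user)
print(loaded["profile"]["site"])
```

Nested lists and dicts go in and come out unchanged, which is why working with a document store from Python tends to feel like working with native data.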
Slide 42
Slide 42 text
You may find non-relational databases a nicer way to store state - for any app
Slide 43
Slide 43 text
Remember, you might still need transactions/reliability. (Business logic is probably better off on mature systems for now)
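
As a reminder of what a transaction buys you, here is a small sketch with sqlite3. The accounts table and the transfer helper are hypothetical, invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(src, dst, amount):
    # "with conn" commits on success and rolls back if an exception is raised,
    # so the credit and the debit are applied together or not at all.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )

transfer("alice", "bob", 30)
try:
    transfer("alice", "bob", 500)  # would overdraw alice, so the CHECK fails
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

The failed transfer leaves no trace: the credit to bob that ran before the error is rolled back along with everything else in that transaction.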
Slide 44
Slide 44 text
Overall? Just keep all the options in mind. Don't get caught by trends, and don't abuse your relational store.
Slide 45
Slide 45 text
Thanks. Andrew Godwin @andrewgodwin http://aeracode.org