A look into Realm's Core DB engine

In this presentation, we explore why Realm opted to build its own custom database engine from scratch rather than building on an established product, the opportunities that choice has opened, and the compromises it entails.


JP Simard

July 13, 2015


  1. A Look Inside Realm's Core DB Engine

    MobileOptimized, July 2015
    JP Simard, @simjp, realm.io
  2. Realm

  3. What is Realm?

    - Fast, zero-copy, embedded database
    - Powers top 10 apps, Fortune 500, 150M+ devices
    - Object & model-oriented
    - Full ACID transactions
    - Cross-platform file format & C++ core
    - Language bindings (Objective-C, Swift & Android)
    - Launched summer 2014
  4. Swift

        let company = Company() // Standalone Realm Object
        company.name = "Realm" // etc...

        let realm = Realm() // Default Realm
        realm.write { // Transactions
            realm.add(company) // Persisted Realm Object
        }

        // Queries
        let companies = realm.objects(Company) // Typesafe
        companies[0].name // => Realm (generics)

        // "Jack"s who work full time (lazily loaded & chainable)
        let ftJacks = realm.objects(Employee).filter("name = 'Jack'")
                           .filter("fullTime = true")
  5. Objective-C

        // Standalone Realm Object
        Company *company = [[Company alloc] init];
        company.name = @"Realm"; // etc...

        // Transactions
        RLMRealm *realm = [RLMRealm defaultRealm];
        [realm transactionWithBlock:^{
            [realm addObject:company];
        }];

        RLMResults *companies = [Company allObjects];

        // "Jack"s who work full time (lazily loaded & chainable)
        RLMResults *ftJacks = [[Employee objectsWhere:@"name = 'Jack'"]
                                         objectsWhere:@"fullTime == YES"];
  6. Java

        Realm realm = Realm.getInstance(this.getContext()); // Default Realm

        realm.beginTransaction(); // Transactions
        Company company = realm.createObject(Company.class); // Persisted
        company.setName("Realm"); // etc...
        realm.commitTransaction();

        // Queries
        Company firstCompany = realm.where(Company.class).findFirst();
        firstCompany.getName(); // => Realm

        // "Jack"s who work full time (lazily loaded & chainable)
        RealmResults<Employee> ftJacks = realm.where(Employee.class)
                .equalTo("name", "Jack")
                .equalTo("fullTime", true)
                .findAll();
  7. Why did Realm build its own db engine from scratch?

  8. NoORM

    - ORM stands for "Object Relational Mapper"
    - Realm aims to have the features & flexibility of ORMs, without the performance overhead or the leaky abstractions
  10. MVCC Algorithm

  12. MVCC Algorithm

    - Multiversion concurrency control
    - Each transaction has a snapshot of the database
    - Writes are performed as append-only
    - It means you can have an immutable view of data
    - Reads don't block reads or writes
    - Writes block writes, but not reads
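The snapshot rules on this slide can be illustrated with a toy in-memory store. This is only a sketch of the general MVCC idea, not Realm's implementation: the class `MvccStore` and its methods are hypothetical names, and a copied `HashMap` stands in for the append-only file.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Toy MVCC store (hypothetical): each commit publishes a new immutable version. */
class MvccStore {
    // The latest committed version; older versions stay valid for whoever holds them.
    private volatile Map<String, String> latest = Collections.emptyMap();
    // Writers are serialized, so writes block writes but never block reads.
    private final ReentrantLock writeLock = new ReentrantLock();

    /** A read transaction simply pins the latest committed version; no lock is taken. */
    Map<String, String> snapshot() {
        return latest;
    }

    /** A write transaction builds a new version next to the old one, then publishes it. */
    void write(String key, String value) {
        writeLock.lock();
        try {
            Map<String, String> next = new HashMap<>(latest); // old version is never mutated
            next.put(key, value);
            latest = Collections.unmodifiableMap(next);       // commit: switch to the new version
        } finally {
            writeLock.unlock();
        }
    }

    public static void main(String[] args) {
        MvccStore store = new MvccStore();
        store.write("name", "Realm");
        Map<String, String> reader = store.snapshot();    // reader pins this version
        store.write("name", "Realm Core");                // a later commit
        System.out.println(reader.get("name"));           // the reader's view is unchanged
        System.out.println(store.snapshot().get("name")); // a new snapshot sees the commit
    }
}
```

A reader that took its snapshot before the second write keeps seeing the old value, which is the "immutable view of data" the slide refers to.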
  13. Native Links

  15. Native Links

    - Realm is a giant B+ tree
    - To-one & to-many links are first-class citizens
    - No expensive join operations for relationships
    - Just following pointers
    - Great for object graph traversals
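As a rough illustration of "just following pointers", the sketch below models to-one and to-many links as plain object references and walks the graph without any key lookup. The classes `CompanyNode`, `EmployeeNode`, and `LinkTraversal` are hypothetical and say nothing about how Realm lays out links inside its B+ tree.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: a company with a to-many link to its employees. */
class CompanyNode {
    final String name;
    final List<EmployeeNode> employees = new ArrayList<>(); // to-many link
    CompanyNode(String name) { this.name = name; }
}

/** Hypothetical model: an employee with a to-one link to its employer. */
class EmployeeNode {
    final String name;
    final CompanyNode employer; // to-one link: a direct reference, not a foreign key
    EmployeeNode(String name, CompanyNode employer) {
        this.name = name;
        this.employer = employer;
        employer.employees.add(this);
    }
}

class LinkTraversal {
    /** Follows the to-one link, then the to-many link; no join or key search is needed. */
    static List<String> colleaguesOf(EmployeeNode employee) {
        List<String> names = new ArrayList<>();
        for (EmployeeNode other : employee.employer.employees) {
            if (other != employee) {
                names.add(other.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        CompanyNode realm = new CompanyNode("Realm");
        EmployeeNode jack = new EmployeeNode("Jack", realm);
        new EmployeeNode("Jill", realm);
        System.out.println(colleaguesOf(jack));
    }
}
```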
  16. String & Int Optimizations

  17. String Optimization

    - Convert common strings to enums
    - Expensive-ish operation, but smaller file size & faster reads

    Integer Packing

    - Ints take as little space on disk as possible
    - Little to no performance overhead
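Both optimizations on this slide can be sketched in a few lines. The code below is an assumption-laden illustration, not Realm's on-disk format: `StringEnumColumn` stores each distinct string once and keeps small integer ids per row, and `PackedInts` picks the smallest power-of-two bit width that fits every value and packs the values into 64-bit words. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical "string enum" column: repeated strings are stored once. */
class StringEnumColumn {
    private final List<String> dictionary = new ArrayList<>();
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<Integer> rows = new ArrayList<>();

    void add(String value) {
        Integer id = ids.get(value);
        if (id == null) {              // first time this string is seen
            id = dictionary.size();
            dictionary.add(value);
            ids.put(value, id);
        }
        rows.add(id);                  // each row stores only the small id
    }

    String get(int row) { return dictionary.get(rows.get(row)); }
    int distinctCount() { return dictionary.size(); }
}

/** Hypothetical integer packing: use as few bits per value as the data allows. */
class PackedInts {
    /** Smallest width among 1, 2, 4, ..., 64 bits that fits every value. */
    static int bitWidth(long[] values) {
        long max = 0;
        for (long v : values) {
            if (v < 0) return 64;      // negative values keep the full width in this sketch
            max = Math.max(max, v);
        }
        int needed = 64 - Long.numberOfLeadingZeros(max);
        int width = 1;
        while (width < needed) width *= 2;
        return width;
    }

    static long[] pack(long[] values, int width) {
        int perWord = 64 / width;
        long mask = width == 64 ? -1L : (1L << width) - 1;
        long[] words = new long[(values.length + perWord - 1) / perWord];
        for (int i = 0; i < values.length; i++) {
            words[i / perWord] |= (values[i] & mask) << ((i % perWord) * width);
        }
        return words;
    }

    static long get(long[] words, int width, int index) {
        int perWord = 64 / width;
        long mask = width == 64 ? -1L : (1L << width) - 1;
        return (words[index / perWord] >>> ((index % perWord) * width)) & mask;
    }
}
```

With values no larger than 3, for example, `bitWidth` returns 2, so 32 values fit in a single 64-bit word instead of 32 separate words.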
  18. Crash Safety

  19. Crash Safety

    - Kernel panics & sudden power loss should never corrupt the database
    - File format is append-only
    - Top root node is switched only upon full transactional durability
    - Runs F_FULLFSYNC on every logical file size change
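The ordering described here (append, make durable, then switch the root) can be sketched with a plain file. This is a simplified illustration under several assumptions: the class `AppendOnlyFile` and its single 8-byte root slot are hypothetical, `FileChannel.force(true)` is used as a portable stand-in for `F_FULLFSYNC`, and a real engine would also need the root switch itself to survive a torn write.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical append-only file: an 8-byte header points at the current root record. */
class AppendOnlyFile implements AutoCloseable {
    private static final int HEADER_SIZE = Long.BYTES;
    private final FileChannel channel;

    AppendOnlyFile(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        if (channel.size() < HEADER_SIZE) {
            writeFully(ByteBuffer.allocate(HEADER_SIZE), 0); // root offset 0 means "empty"
            channel.force(true);
        }
    }

    void commit(String payload) throws IOException {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        long offset = channel.size();
        ByteBuffer record = ByteBuffer.allocate(Integer.BYTES + bytes.length);
        record.putInt(bytes.length).put(bytes).flip();
        writeFully(record, offset);  // 1. append the new data; existing data is untouched
        channel.force(true);         // 2. wait until the new data is durable
        ByteBuffer root = ByteBuffer.allocate(HEADER_SIZE);
        root.putLong(offset).flip();
        writeFully(root, 0);         // 3. only now switch the root to the new record
        channel.force(true);
    }

    /** Returns the payload of the current root record, or null if nothing was committed. */
    String readRoot() throws IOException {
        ByteBuffer root = ByteBuffer.allocate(HEADER_SIZE);
        readFully(root, 0);
        long offset = root.getLong(0);
        if (offset == 0) return null;
        ByteBuffer length = ByteBuffer.allocate(Integer.BYTES);
        readFully(length, offset);
        ByteBuffer data = ByteBuffer.allocate(length.getInt(0));
        readFully(data, offset + Integer.BYTES);
        return new String(data.array(), StandardCharsets.UTF_8);
    }

    private void writeFully(ByteBuffer buffer, long position) throws IOException {
        while (buffer.hasRemaining()) position += channel.write(buffer, position);
    }

    private void readFully(ByteBuffer buffer, long position) throws IOException {
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position);
            if (n < 0) throw new IOException("unexpected end of file");
            position += n;
        }
    }

    @Override
    public void close() throws IOException { channel.close(); }
}
```

If the process dies between steps 1 and 3, the header still points at the previous record, so a reader never sees a half-written commit.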
  20. Zero Copy

  21. Traditional ORMs Must Copy

    1. Data on disk
    2. Read from disk
    3. Copy raw data into deserialized intermediate in-memory representation (allocates memory)
    4. Copy intermediate representation into language-level in-memory object (allocates memory)
    5. Return final object from property access
  22. Realm Skips The Copy

    Whole file is memory-mapped & same format on disk as in-memory.

    1. Calculate offset of data to read
    2. Read from mmapped file
    3. Return raw value from property access
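The three steps above map closely onto a memory-mapped read. The sketch below assumes a hypothetical file that holds nothing but fixed-width 32-bit integers; the class `MappedIntColumn` is invented for illustration and is far simpler than Realm's real file format.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical column of 32-bit ints read directly from a memory-mapped file. */
class MappedIntColumn implements AutoCloseable {
    private final FileChannel channel;
    private final MappedByteBuffer map;

    MappedIntColumn(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.READ);
        map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
    }

    int get(int row) {
        int offset = row * Integer.BYTES; // 1. calculate the offset of the data to read
        return map.getInt(offset);        // 2. read from the mapped file, 3. return the raw value
    }

    /** Helper for the demo: writes the values in the same layout that get() expects. */
    static void writeColumn(Path path, int... values) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Integer.BYTES);
        for (int v : values) buffer.putInt(v);
        Files.write(path, buffer.array());
    }

    @Override
    public void close() throws IOException { channel.close(); }
}
```

No intermediate object is built for the row: the accessor computes an offset and reads the value in place.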
  23. True Lazy Loading

  24. True Lazy Loading

    - Impossible to read a single bit from a hard disk
    - Properties are grouped together
    - Avoids reading unused properties
    - Saves disk roundtrips
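One way to picture "properties are grouped together" is a table stored one array per property, with row objects that hold only an index. The sketch below is a hypothetical in-memory model (`ConferenceTable`, `ConferenceRow`, and the `year` property are invented, as is the second row of sample data); the read counters exist only to make the laziness visible.

```java
/** Hypothetical table that groups values by property: one array per property. */
class ConferenceTable {
    final String[] names;
    final int[] years;
    int nameReads = 0; // counters only exist to show which columns were touched
    int yearReads = 0;

    ConferenceTable(String[] names, int[] years) {
        this.names = names;
        this.years = years;
    }

    /** Creating a row accessor reads nothing; it only remembers the row index. */
    ConferenceRow row(int index) {
        return new ConferenceRow(this, index);
    }
}

/** Hypothetical lazy accessor: each getter touches only its own column. */
class ConferenceRow {
    private final ConferenceTable table;
    private final int index;

    ConferenceRow(ConferenceTable table, int index) {
        this.table = table;
        this.index = index;
    }

    String getName() {
        table.nameReads++;
        return table.names[index];
    }

    int getYear() {
        table.yearReads++;
        return table.years[index];
    }
}

class LazyLoadingDemo {
    public static void main(String[] args) {
        ConferenceTable table = new ConferenceTable(
                new String[] {"MobileOptimized", "ExampleConf"}, new int[] {2015, 2014});
        ConferenceRow row = table.row(0);
        System.out.println(row.getName());   // reads the name column only
        System.out.println(table.yearReads); // the year column was never touched
    }
}
```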
  25. Built-In Encryption

  26. Built-In Encryption

    - Encrypted at rest on disk
    - AES-256+SHA2
    - All cryptography done in the virtual memory manager
    - Works by marking mmapped region as read/write-protected, throwing access violations on access
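The memory-protection trick itself cannot be expressed in Java, but the per-page cryptography that such a scheme would perform can be sketched. Everything specific here is an assumption rather than Realm's documented design: the class `PageCrypto`, the 4096-byte page size, CBC mode, HMAC-SHA256 as the "SHA2" part, and the `IV || ciphertext || tag` layout are all chosen for illustration only.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical encrypt-then-MAC scheme for one fixed-size page. */
class PageCrypto {
    static final int PAGE_SIZE = 4096; // assumed page size, a multiple of the AES block size
    static final int IV_SIZE = 16;
    static final int MAC_SIZE = 32;

    private final SecretKeySpec aesKey;
    private final SecretKeySpec macKey;
    private final SecureRandom random = new SecureRandom();

    PageCrypto(byte[] aesKeyBytes, byte[] macKeyBytes) {
        aesKey = new SecretKeySpec(aesKeyBytes, "AES");        // 32 bytes selects AES-256
        macKey = new SecretKeySpec(macKeyBytes, "HmacSHA256");
    }

    /** Returns IV || ciphertext || HMAC(IV || ciphertext). */
    byte[] encryptPage(byte[] page) throws GeneralSecurityException {
        byte[] iv = new byte[IV_SIZE];
        random.nextBytes(iv);                                  // fresh IV for every write
        Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, aesKey, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(page);
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(macKey);
        mac.update(iv);
        byte[] tag = mac.doFinal(ciphertext);
        byte[] stored = new byte[IV_SIZE + ciphertext.length + MAC_SIZE];
        System.arraycopy(iv, 0, stored, 0, IV_SIZE);
        System.arraycopy(ciphertext, 0, stored, IV_SIZE, ciphertext.length);
        System.arraycopy(tag, 0, stored, IV_SIZE + ciphertext.length, MAC_SIZE);
        return stored;
    }

    /** Verifies the tag first, then decrypts; a tampered page is rejected. */
    byte[] decryptPage(byte[] stored) throws GeneralSecurityException {
        int ciphertextLength = stored.length - IV_SIZE - MAC_SIZE;
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(macKey);
        mac.update(stored, 0, IV_SIZE + ciphertextLength);
        byte[] expected = mac.doFinal();
        byte[] actual = Arrays.copyOfRange(stored, IV_SIZE + ciphertextLength, stored.length);
        if (!MessageDigest.isEqual(expected, actual)) {
            throw new GeneralSecurityException("page failed authentication");
        }
        Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, aesKey, new IvParameterSpec(stored, 0, IV_SIZE));
        return cipher.doFinal(stored, IV_SIZE, ciphertextLength);
    }
}
```

In the design the slide describes, work of this kind would be triggered by the access violation on a protected page rather than by an explicit call from application code.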
  27. Multiprocess Support

  28. Multiprocess Support

    - MVCC algorithm makes this work mostly out of the box
    - Plus a named pipe for change notifications
  29. Null Values

  30. Null Values

        class Conference: Object {
            dynamic var name: String? = nil
        }
  31. Compromises

  32. Long Development Time

  34. Pre-1.0

    - APIs are in flux
    - File format could change

  35. Fewer Features At First

  36. Moving Forward

  37. Moving Forward

    - KVO & fine-grained notifications: PR #2050
    - Nullable values: PR #1798
    - Query handover between threads: seg-handover
  38. Links

    - MVCC on Wikipedia
    - B+ Trees on Wikipedia
    - About fsync()
    - More about fsync()
    - Realm on GitHub
    - Realm.io
  39. MobileOptimized().questions.ask()!

    JP Simard, @simjp, realm.io