A look into Realm's Core DB engine

In this presentation, we explore why Realm opted to build its own database engine from scratch rather than building on an established product, the opportunities that choice has opened, and the compromises it entails.

JP Simard

July 13, 2015

Transcript

  1. What is Realm?
     - Fast, zero-copy, embedded database
     - Powers top 10 apps, Fortune 500, 150M+ devices
     - Object & model-oriented
     - Full ACID transactions
     - Cross-platform file format & C++ core
     - Language bindings (Objective-C, Swift & Android)
     - Launched summer 2014
  2. Swift

         let company = Company() // Standalone Realm Object
         company.name = "Realm"  // etc...

         let realm = Realm()     // Default Realm
         realm.write {           // Transactions
             realm.add(company)  // Persisted Realm Object
         }

         // Queries
         let companies = realm.objects(Company) // Typesafe
         companies[0].name // => Realm (generics)

         // "Jack"s who work full time (lazily loaded & chainable)
         let ftJacks = realm.objects(Employee).filter("name = 'Jack'")
                                              .filter("fullTime = true")
  3. Objective-C

         // Standalone Realm Object
         Company *company = [[Company alloc] init];
         company.name = @"Realm"; // etc...

         // Transactions
         RLMRealm *realm = [RLMRealm defaultRealm];
         [realm transactionWithBlock:^{
             [realm addObject:company];
         }];

         RLMResults *companies = [Company allObjects];

         // "Jack"s who work full time (lazily loaded & chainable)
         RLMResults *ftJacks = [[Employee objectsWhere:@"name = 'Jack'"]
                                          objectsWhere:@"fullTime == YES"];
  4. Java

         Realm realm = Realm.getInstance(this.getContext()); // Default Realm

         realm.beginTransaction(); // Transactions
         Company company = realm.createObject(Company.class); // Persisted
         company.setName("Realm"); // etc...
         realm.commitTransaction();

         // Queries
         Company first = realm.where(Company.class).findFirst();
         first.getName(); // => Realm

         // "Jack"s who work full time (lazily loaded & chainable)
         RealmResults<Employee> ftJacks = realm.where(Employee.class)
                                               .equalTo("name", "Jack")
                                               .equalTo("fullTime", true)
                                               .findAll();
  5. NoORM
     - ORM stands for "Object Relational Mapper"
     - Realm aims to have the features & flexibility of ORMs, without the performance overhead or the leaky abstractions
  6. 9

  7. 11

  8. MVCC Algorithm
     - Multiversion concurrency control
     - Each transaction has a snapshot of the database
     - Writes are performed as append-only
     - It means you can have an immutable view of data
     - Reads don't block reads or writes
     - Writes block writes, but not reads
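The snapshot rules on this slide can be sketched in a few lines of Java. This is a minimal illustration, not Realm's actual implementation: the class `MvccStore`, its `Snapshot` type, and the method names are all hypothetical, and the sketch copies the whole list on each write where a real engine would presumably share unchanged data between versions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; names and structure are assumptions.
final class MvccStore {
    // An immutable snapshot: a version number plus an unmodifiable list of values.
    static final class Snapshot {
        final long version;
        final List<String> values;

        Snapshot(long version, List<String> values) {
            this.version = version;
            this.values = Collections.unmodifiableList(values);
        }
    }

    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot(0, new ArrayList<String>()));
    private final ReentrantLock writeLock = new ReentrantLock();

    // Readers take no lock: they grab the latest published snapshot and keep it.
    Snapshot beginRead() {
        return current.get();
    }

    // Writers serialize on a lock, build a new version, then publish it atomically.
    long append(String value) {
        writeLock.lock();
        try {
            Snapshot old = current.get();
            List<String> next = new ArrayList<>(old.values); // the old snapshot is never modified
            next.add(value);
            Snapshot published = new Snapshot(old.version + 1, next);
            current.set(published);
            return published.version;
        } finally {
            writeLock.unlock();
        }
    }
}
```

A reader that called `beginRead()` before a write keeps seeing its old version, which is why writes block other writes but never block reads.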
  9. 14

  10. Native Links
      - Realm is a giant B+ tree
      - To-one & to-many links are first-class citizens
      - No expensive join operations for relationships
      - Just following pointers
      - Great for object graph traversals
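To make "just following pointers" concrete, here is a small sketch in which a to-one link is stored as the row index of its target, so resolving the relationship is a direct lookup rather than a join over a key column. The `LinkDemo` class and its tables are hypothetical and only illustrate the idea; they say nothing about how Realm lays out links internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a link is stored as the target's row index,
// so following it is an indexed lookup rather than a search.
final class LinkDemo {
    static final class Company {
        final String name;

        Company(String name) {
            this.name = name;
        }
    }

    static final class Employee {
        final String name;
        final int companyRow; // to-one link: index into the companies table

        Employee(String name, int companyRow) {
            this.name = name;
            this.companyRow = companyRow;
        }
    }

    final List<Company> companies = new ArrayList<>();
    final List<Employee> employees = new ArrayList<>();

    // Returns the row index of the new company, usable as a link target.
    int addCompany(String name) {
        companies.add(new Company(name));
        return companies.size() - 1;
    }

    void addEmployee(String name, int companyRow) {
        employees.add(new Employee(name, companyRow));
    }

    // Following the link is a direct index; no scan or key comparison is needed.
    String employerOf(int employeeRow) {
        return companies.get(employees.get(employeeRow).companyRow).name;
    }
}
```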
  11. String Optimization
      - Convert common strings to enums
      - Expensive-ish operation, but smaller file size & faster reads

      Integer Packing
      - Ints take as little space on disk as possible
      - Little to no performance overhead
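Both optimizations on this slide can be sketched briefly. The code below is an illustration under assumptions, not Realm's on-disk format: `PackedInts` picks the smallest bit width that fits the largest value and packs non-negative ints into `long` words, and `StringEnumColumn` stores each distinct string once and keeps only small integer codes per row. Both class names and their layouts are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bit-packing of non-negative ints; width choice and layout are assumptions.
final class PackedInts {
    private final long[] words;
    private final int bits; // bits per value, between 1 and 31
    private final int size;

    PackedInts(int[] values) {
        int max = 0;
        for (int v : values) {
            if (v < 0) throw new IllegalArgumentException("non-negative values only");
            max = Math.max(max, v);
        }
        // Smallest width that can hold the largest value (at least 1 bit).
        this.bits = Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
        this.size = values.length;
        this.words = new long[(int) (((long) size * bits + 63) / 64)];
        for (int i = 0; i < size; i++) {
            set(i, values[i]);
        }
    }

    private void set(int index, int value) {
        long bitPos = (long) index * bits;
        int word = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        words[word] |= ((long) value) << shift;
        int spill = shift + bits - 64; // bits that do not fit in this word
        if (spill > 0) {
            words[word + 1] |= ((long) value) >>> (bits - spill);
        }
    }

    int get(int index) {
        long bitPos = (long) index * bits;
        int word = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long raw = words[word] >>> shift;
        int spill = shift + bits - 64;
        if (spill > 0) {
            raw |= words[word + 1] << (bits - spill);
        }
        return (int) (raw & ((1L << bits) - 1));
    }

    int bitsPerValue() { return bits; }
    int storedBytes() { return words.length * 8; }
}

// Hypothetical string "enum" column: each distinct string is stored once.
final class StringEnumColumn {
    private final List<String> dictionary = new ArrayList<>();
    private final Map<String, Integer> codes = new HashMap<>();
    private final List<Integer> rows = new ArrayList<>();

    void add(String value) {
        Integer code = codes.get(value);
        if (code == null) { // first time this string is seen: assign the next code
            code = dictionary.size();
            dictionary.add(value);
            codes.put(value, code);
        }
        rows.add(code);
    }

    String get(int row) { return dictionary.get(rows.get(row)); }
    int distinctValues() { return dictionary.size(); }
}
```

The two ideas compose: the per-row codes of a string enum column are small non-negative integers, which is exactly the kind of data bit-packing shrinks well.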
  12. Crash Safety
      - Kernel panics & sudden power loss should never corrupt the database
      - File format is append-only
      - Top root node is switched only upon full transactional durability
      - Runs F_FULLFSYNC on every logical file size change
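The "append, make durable, then switch the root" ordering can be sketched with Java's `FileChannel`. Everything about the file layout here is an assumption made for illustration (an 8-byte header holding the root offset, followed by length-prefixed records), `AppendOnlyLog` is a hypothetical class, and `FileChannel.force(true)` merely stands in for the `F_FULLFSYNC` call mentioned on the slide. The sketch also assumes the 8-byte header update is not torn by a crash, which a real engine would have to guarantee by other means.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical layout: an 8-byte header holding the offset of the current root record,
// followed by append-only records of the form [int length][payload bytes].
final class AppendOnlyLog implements AutoCloseable {
    private static final int HEADER_SIZE = 8;
    private final FileChannel channel;

    AppendOnlyLog(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        if (channel.size() == 0) {
            // Offset 0 means "no root yet"; real records always start after the header.
            writeFully(ByteBuffer.allocate(HEADER_SIZE).putLong(0, 0L), 0);
            channel.force(true);
        }
    }

    void commit(String payload) throws IOException {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        long offset = channel.size(); // new data goes after everything already written
        ByteBuffer record = ByteBuffer.allocate(4 + bytes.length);
        record.putInt(bytes.length).put(bytes).flip();
        writeFully(record, offset);
        channel.force(true); // step 1: make the new record durable

        // Step 2: only now point the header at the new record, and make that durable too.
        writeFully(ByteBuffer.allocate(HEADER_SIZE).putLong(0, offset), 0);
        channel.force(true);
    }

    // Returns the payload of the current root record, or null if nothing was committed.
    String readRoot() throws IOException {
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        readFully(header, 0);
        long offset = header.getLong(0);
        if (offset == 0) return null;
        ByteBuffer len = ByteBuffer.allocate(4);
        readFully(len, offset);
        ByteBuffer body = ByteBuffer.allocate(len.getInt(0));
        readFully(body, offset + 4);
        return new String(body.array(), StandardCharsets.UTF_8);
    }

    private void writeFully(ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) pos += channel.write(buf, pos);
    }

    private void readFully(ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) throw new IOException("unexpected end of file");
            pos += n;
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

If the process dies between the two `force` calls, the header still points at the previous root, so a reader sees the last fully committed state and the half-finished record is simply unreachable.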
  13. Traditional ORMs Must Copy
      1. Data on disk
      2. Read from disk
      3. Copy raw data into deserialized intermediate in-memory representation (allocates memory)
      4. Copy intermediate representation into language-level in-memory object (allocates memory)
      5. Return final object from property access
  14. Realm Skips The Copy
      Whole file is memory-mapped & same format on disk as in-memory.
      1. Calculate offset of data to read
      2. Read from mmapped file
      3. Return raw value from property access
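The three steps above map closely onto Java's `MappedByteBuffer`. The sketch below is a minimal illustration with an invented fixed-width layout (each row is an `int` age followed by a `long` salary); `MappedTable` and its constants are hypothetical and are not Realm's file format.

```java
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical fixed-width layout: row i stores [int age][long salary] = 12 bytes.
final class MappedTable implements AutoCloseable {
    static final int ROW_SIZE = 12;
    static final int AGE_OFFSET = 0;
    static final int SALARY_OFFSET = 4;

    private final FileChannel channel;
    private final MappedByteBuffer map;

    MappedTable(Path path, int rows) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        // Mapping in READ_WRITE mode grows the file to the requested size if needed.
        map = channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) rows * ROW_SIZE);
        map.order(ByteOrder.LITTLE_ENDIAN);
    }

    void put(int row, int age, long salary) {
        map.putInt(row * ROW_SIZE + AGE_OFFSET, age);
        map.putLong(row * ROW_SIZE + SALARY_OFFSET, salary);
    }

    // Property access: calculate the offset, then read straight from the mapping.
    // No intermediate row object is deserialized or allocated.
    int age(int row) {
        return map.getInt(row * ROW_SIZE + AGE_OFFSET);
    }

    long salary(int row) {
        return map.getLong(row * ROW_SIZE + SALARY_OFFSET);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Reading `age(2)` touches only the bytes of that one field, so properties that are never accessed are never read or copied.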
  15. True Lazy Loading
      - Impossible to read a single bit from a hard disk
      - Properties are grouped together
      - Avoids reading unused properties
      - Saves disk roundtrips
  16. Built-In Encryption
      - Encrypted at rest on disk
      - AES-256+SHA2
      - All cryptography done in the virtual memory manager
      - Works by marking the mmapped region as read/write-protected, so that accesses raise access violations
  17. Multiprocess Support
      - MVCC algorithm makes this work mostly out-of-the-box
      - Plus a named pipe for change notifications
  18. 33

  19. Moving Forward
      - KVO & fine-grained notifications: PR #2050
      - Nullable values: PR #1798
      - Query handover between threads: seg-handover
  20. Links
      - MVCC on Wikipedia
      - B+ Trees on Wikipedia
      - About fsync()
      - More about fsync()
      - Realm on GitHub
      - Realm.io