Slide 1

Slide 1 text

A Look Inside Realm's Core DB Engine
MobileOptimized, July 2015
JP Simard, @simjp, realm.io

Slide 2

Slide 2 text

Realm

Slide 3

Slide 3 text

What is Realm?
- Fast, zero-copy, embedded database
- Powers top 10 apps, Fortune 500, 150M+ devices
- Object & model-oriented
- Full ACID transactions
- Cross-platform file format & C++ core
- Language bindings (Objective-C, Swift & Android)
- Launched summer 2014

Slide 4

Slide 4 text

Swift

let company = Company() // Standalone Realm Object
company.name = "Realm" // etc...

let realm = Realm() // Default Realm
realm.write { // Transactions
    realm.add(company) // Persisted Realm Object
}

// Queries
let companies = realm.objects(Company) // Typesafe
companies[0].name // => Realm (generics)

// "Jack"s who work full time (lazily loaded & chainable)
let ftJacks = realm.objects(Employee).filter("name = 'Jack'")
                   .filter("fullTime = true")

Slide 5

Slide 5 text

Objective-C

// Standalone Realm Object
Company *company = [[Company alloc] init];
company.name = @"Realm"; // etc...

// Transactions
RLMRealm *realm = [RLMRealm defaultRealm];
[realm transactionWithBlock:^{
    [realm addObject:company];
}];

RLMResults *companies = [Company allObjects];

// "Jack"s who work full time (lazily loaded & chainable)
RLMResults *ftJacks = [[Employee objectsWhere:@"name = 'Jack'"]
                                 objectsWhere:@"fullTime == YES"];

Slide 6

Slide 6 text

Java

Realm realm = Realm.getInstance(this.getContext()); // Default Realm

realm.beginTransaction(); // Transactions
Company company = realm.createObject(Company.class); // Persisted
company.setName("Realm"); // etc...
realm.commitTransaction();

// Queries
Company first = realm.where(Company.class).findFirst();
first.getName(); // => Realm

// "Jack"s who work full time (lazily loaded & chainable)
RealmResults<Employee> ftJacks = realm.where(Employee.class)
                                      .equalTo("name", "Jack")
                                      .equalTo("fullTime", true)
                                      .findAll();

Slide 7

Slide 7 text

Why did Realm build its own db engine from scratch?

Slide 8

Slide 8 text

NoORM
- ORM stands for "Object Relational Mapper"
- Realm aims to have the features & flexibility of ORMs, without the performance overhead or the leaky abstractions

Slide 9

Slide 9 text


Slide 10

Slide 10 text

MVCC Algorithm

Slide 11

Slide 11 text


Slide 12

Slide 12 text

MVCC Algorithm
- Multiversion concurrency control
- Each transaction has a snapshot of the database
- Writes are append-only
- Each snapshot is therefore an immutable view of the data
- Reads don't block reads or writes
- Writes block writes, but not reads
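
The snapshot rules above can be sketched in a few lines of Java. This is a toy in-memory model under stated assumptions, not Realm's engine: `MvccSketch`, `snapshot()` and `append()` are hypothetical names, and copying the whole list on each write stands in for appending new nodes to the file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical toy store for illustration only.
final class MvccSketch {
    // Every committed version is an immutable list; readers hold a reference to one.
    private final AtomicReference<List<String>> current =
            new AtomicReference<>(Collections.<String>emptyList());
    // Writers are serialized: writes block writes.
    private final ReentrantLock writeLock = new ReentrantLock();

    // Readers take a snapshot without locking, so reads never block.
    List<String> snapshot() {
        return current.get();
    }

    // A write builds a new version next to the old one, then publishes it atomically.
    void append(String value) {
        writeLock.lock();
        try {
            List<String> next = new ArrayList<>(current.get());
            next.add(value);
            current.set(Collections.unmodifiableList(next));
        } finally {
            writeLock.unlock();
        }
    }

    public static void main(String[] args) {
        MvccSketch db = new MvccSketch();
        db.append("Realm");
        List<String> before = db.snapshot(); // snapshot taken before the second write
        db.append("MobileOptimized");
        System.out.println(before);          // prints [Realm]
        System.out.println(db.snapshot());   // prints [Realm, MobileOptimized]
    }
}
```

The old snapshot stays valid for as long as a reader holds it, which is why a writer never has to wait for readers in this model.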

Slide 13

Slide 13 text

Native Links

Slide 14

Slide 14 text


Slide 15

Slide 15 text

Native Links
- Realm is a giant B+ tree
- To-one & to-many links are 1st class citizens
- No expensive join operations for relationships
- Just following pointers
- Great for object graph traversals
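
To make the contrast between a join and a link concrete, here is a rough Java sketch. All types and method names are hypothetical and say nothing about Realm's storage layout; the join is deliberately naive (an unindexed scan), whereas a relational engine would normally use an index.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration; not Realm's storage layout.
final class LinkSketch {
    static final class Company {
        final int id;
        final String name;
        Company(int id, String name) { this.id = id; this.name = name; }
    }

    // Relational style: the employee row stores a foreign key.
    static final class EmployeeRow {
        final String name;
        final int companyId;
        EmployeeRow(String name, int companyId) { this.name = name; this.companyId = companyId; }
    }

    // Link style: the employee holds a direct reference to its company.
    static final class EmployeeObject {
        final String name;
        final Company company;
        EmployeeObject(String name, Company company) { this.name = name; this.company = company; }
    }

    // Resolving a foreign key means searching the other table.
    static String companyNameByJoin(EmployeeRow e, List<Company> companies) {
        for (Company c : companies) {
            if (c.id == e.companyId) return c.name;
        }
        return null; // no matching company
    }

    // Resolving a link is a single pointer dereference.
    static String companyNameByLink(EmployeeObject e) {
        return e.company.name;
    }

    public static void main(String[] args) {
        Company realm = new Company(1, "Realm");
        List<Company> companies = Arrays.asList(new Company(0, "Other"), realm);
        System.out.println(companyNameByJoin(new EmployeeRow("Jack", 1), companies)); // prints Realm
        System.out.println(companyNameByLink(new EmployeeObject("Jack", realm)));     // prints Realm
    }
}
```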

Slide 16

Slide 16 text

String & Int Optimizations

Slide 17

Slide 17 text

String Optimization
- Convert common strings to enums
- Expensive-ish operation, but smaller file size & faster reads

Integer Packing
- Ints take as little space on disk as possible
- Little to no performance overhead
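
Both ideas can be illustrated with a small Java sketch. It is an approximation under stated assumptions rather than Realm's code: `EnumColumn` and `packedWidth` are hypothetical names, the choice of power-of-two bit widths is an assumption made for the example, and only non-negative integers are handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helpers for illustration only.
final class PackingSketch {
    // "Strings to enums": each distinct string is stored once, and every row
    // holds a small integer index into that dictionary.
    static final class EnumColumn {
        final List<String> dictionary = new ArrayList<>();
        final List<Integer> indices = new ArrayList<>();
        private final Map<String, Integer> lookup = new HashMap<>();

        void add(String value) {
            Integer index = lookup.get(value);
            if (index == null) {
                index = dictionary.size();
                dictionary.add(value);
                lookup.put(value, index);
            }
            indices.add(index);
        }

        String get(int row) {
            return dictionary.get(indices.get(row));
        }
    }

    // Integer packing: the smallest width (0, 1, 2, 4, ... 64 bits, an assumed
    // scheme) that can hold every value. Assumes all values are non-negative.
    static int packedWidth(long[] values) {
        long max = 0;
        for (long v : values) max = Math.max(max, v);
        int bits = 64 - Long.numberOfLeadingZeros(max); // 0 when max == 0
        if (bits == 0) return 0;
        int width = 1;
        while (width < bits) width *= 2;
        return width;
    }

    public static void main(String[] args) {
        EnumColumn names = new EnumColumn();
        for (String n : new String[] {"Jack", "Jill", "Jack", "Jack"}) names.add(n);
        System.out.println(names.dictionary.size());            // prints 2
        System.out.println(packedWidth(new long[] {0, 1, 1}));  // prints 1
        System.out.println(packedWidth(new long[] {3, 200}));   // prints 8
    }
}
```

Building the dictionary costs a lookup per inserted value, which matches the "expensive-ish" note above, while reads only follow an index.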

Slide 18

Slide 18 text

Crash Safety

Slide 19

Slide 19 text

Crash Safety
- Kernel panics & sudden power loss should never corrupt the database
- File format is append-only
- Top root node is switched only upon full transactional durability
- Runs F_FULLFSYNC on every logical file size change
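
The ordering described above (append, flush, then switch the root) might look roughly like this in Java. The file layout, `CommitSketch`, `commit` and `readCurrent` are all hypothetical. `FileChannel.force(true)` stands in for the slide's F_FULLFSYNC, and the sketch simply assumes the 8-byte root write is atomic, which a real engine would have to guarantee by other means.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical file layout for illustration; not Realm's actual format.
final class CommitSketch {
    // Assumed layout: bytes 0..7 hold the offset of the current root record.
    static final int HEADER_SIZE = 8;

    static long commit(FileChannel file, byte[] payload) throws IOException {
        // 1. Append the new version after everything already in the file.
        long offset = Math.max(file.size(), HEADER_SIZE);
        ByteBuffer record = ByteBuffer.allocate(4 + payload.length);
        record.putInt(payload.length);
        record.put(payload);
        record.flip();
        writeFully(file, record, offset);
        // 2. Flush so the new record is durable before anything points at it.
        file.force(true);
        // 3. Only now switch the root pointer, then flush that as well.
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        header.putLong(offset);
        header.flip();
        writeFully(file, header, 0);
        file.force(true);
        return offset;
    }

    // Follows the root pointer and returns the payload of the current version.
    static byte[] readCurrent(FileChannel file) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        readFully(file, header, 0);
        long offset = header.getLong(0);
        ByteBuffer length = ByteBuffer.allocate(4);
        readFully(file, length, offset);
        ByteBuffer payload = ByteBuffer.allocate(length.getInt(0));
        readFully(file, payload, offset + 4);
        return payload.array();
    }

    private static void writeFully(FileChannel file, ByteBuffer buf, long position) throws IOException {
        while (buf.hasRemaining()) {
            position += file.write(buf, position);
        }
    }

    private static void readFully(FileChannel file, ByteBuffer buf, long position) throws IOException {
        while (buf.hasRemaining()) {
            int n = file.read(buf, position);
            if (n < 0) throw new EOFException("unexpected end of file");
            position += n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("commit-sketch", ".db");
        try (FileChannel file = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            commit(file, "version 1".getBytes(StandardCharsets.UTF_8));
            commit(file, "version 2".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(readCurrent(file), StandardCharsets.UTF_8)); // prints version 2
        }
    }
}
```

If the process dies between steps 1 and 3, the root still points at the previous version, so the partially written record is simply never reached.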

Slide 20

Slide 20 text

Zero Copy

Slide 21

Slide 21 text

Traditional ORMs Must Copy
1. Data on disk
2. Read from disk
3. Copy raw data into deserialized intermediate in-memory representation (allocates memory)
4. Copy intermediate representation into language-level in-memory object (allocates memory)
5. Return final object from property access

Slide 22

Slide 22 text

Realm Skips The Copy
Whole file is memory-mapped & same format on disk as in-memory.
1. Calculate offset of data to read
2. Read from mmapped file
3. Return raw value from property access
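
The three steps above can be approximated in Java with a memory-mapped buffer. The fixed-width row layout, `ZeroCopySketch`, `PersonAccessor` and the field offsets are assumptions invented for this sketch; Realm's real file format is different.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical row layout for illustration; not Realm's file format.
final class ZeroCopySketch {
    static final int ROW_SIZE = 16; // assumed: two 8-byte fields per row (id, age)
    static final int AGE_FIELD = 8; // assumed byte offset of "age" within a row

    // The accessor holds no copied data, only the mapping and a row index.
    static final class PersonAccessor {
        private final ByteBuffer file;
        private final int row;

        PersonAccessor(ByteBuffer file, int row) {
            this.file = file;
            this.row = row;
        }

        long age() {
            int offset = row * ROW_SIZE + AGE_FIELD; // 1. calculate the offset
            return file.getLong(offset);             // 2. read from the mapping, 3. return the raw value
        }
    }

    // Maps the whole file read-only; the mapping outlives the channel.
    static MappedByteBuffer map(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("zero-copy-sketch", ".bin");
        ByteBuffer rows = ByteBuffer.allocate(2 * ROW_SIZE);
        rows.putLong(1L).putLong(30L); // row 0: id 1, age 30
        rows.putLong(2L).putLong(41L); // row 1: id 2, age 41
        Files.write(path, rows.array());

        ByteBuffer mapped = map(path);
        System.out.println(new PersonAccessor(mapped, 1).age()); // prints 41
    }
}
```

No intermediate object is built per row here: a property read is an offset calculation plus one load from the mapped region.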

Slide 23

Slide 23 text

True Lazy Loading

Slide 24

Slide 24 text

True Lazy Loading
- Impossible to read a single bit from a hard disk
- Properties are grouped together
- Avoids reading unused properties
- Saves disk roundtrips

Slide 25

Slide 25 text

Built-In Encryption

Slide 26

Slide 26 text

Built-In Encryption
- Encrypted at rest on disk
- AES-256+SHA2
- All cryptography done in the virtual memory manager
- Works by marking the mmapped region as read/write-protected, so any access raises an access violation

Slide 27

Slide 27 text

Multiprocess Support

Slide 28

Slide 28 text

Multiprocess Support
- MVCC algorithm makes this work mostly out-of-the-box
- Plus a named pipe for change notifications

Slide 29

Slide 29 text

Null Values

Slide 30

Slide 30 text

Null Values

class Conference: Object {
    dynamic var name: String? = nil
}

Slide 31

Slide 31 text

Compromises

Slide 32

Slide 32 text

Long Development Time

Slide 33

Slide 33 text


Slide 34

Slide 34 text

Pre-1.0
- APIs are in flux
- File format could change

Slide 35

Slide 35 text

Fewer Features At First

Slide 36

Slide 36 text

Moving Forward

Slide 37

Slide 37 text

Moving Forward
- KVO & fine-grained notifications: PR #2050
- Nullable values: PR #1798
- Query handover between threads: seg-handover

Slide 38

Slide 38 text

Links
- MVCC on Wikipedia
- B+ Trees on Wikipedia
- About fsync()
- More about fsync()
- Realm on GitHub
- Realm.io

Slide 39

Slide 39 text

MobileOptimized().questions.ask()!
JP Simard, @simjp, realm.io