Functional Databases

Presentation about the path that led me to "build my own database".

With pieces of SRFI-167 (okvs), SRFI-168 (nstore) and the versioned nstore.

https://hyper.dev

Amirouche

August 10, 2019

Transcript

  1. SRFI-167, SRFI-168 and the Functional Store Amirouche Boubekki https://hyper.dev

  2. Requirements
    ACID
    Poly-structured data
    Relational
    Recursive
    Text
    Time
    Space
    Embedded in Scheme
    Bonus: versioned
    Bonus: scale horizontally
  3. SRFI-167: Ordered Key-Value Store (okvs)
    1. Ordered mapping of bytevectors
      i. Keys are bytevectors
      ii. Keys are ordered
      iii. Values are bytevectors
    2. Data structures:
      i. Red-black tree
      ii. B-tree
      iii. Log-structured merge-tree
    3. Existing libraries:
      i. SRFI-146
      ii. LevelDB, RocksDB, LMDB, Tokyo Cabinet, SQLite3 LSM extension, Oracle BerkeleyDB
      iii. WiredTiger
      iv. TiKV and FoundationDB
    4. Data possibly bigger than memory
    5. Data Validation Facility (wip)
  4. OKVS Programming Interface
    1. Mapping procedures:
      i. (okvs-ref transaction key)
      ii. (okvs-set! transaction key value)
      iii. (okvs-delete! transaction key)
    2. Querying primitives:
      i. (okvs-range txn start-key start-include? end-key end-include?)
      ii. (okvs-prefix-range txn prefix)
    3. Lexicographic packing:
      i. Byte representation of some Scheme objects
      ii. Preserving natural order
      iii. ⇒ Ordered mapping of Scheme objects
      iv. Future SRFI?
    4. Subspace
      i. First abstraction
      ii. Collocation of keys
      iii. Keys sharing a common prefix
    5. Anti-pattern: hiding okvs interface
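The lexicographic packing and common-prefix points above can be illustrated with a minimal R7RS sketch. It is not the packing used by any SRFI-167 implementation: `pack-uint32`, `bytevector<?` and `bytevector-prefix?` are hypothetical helpers, and only non-negative integers below 2^32 are handled. The point is that a fixed-width big-endian encoding makes byte order agree with numeric order, which is what lets a range or prefix scan over bytevector keys behave like a scan over Scheme objects.

```scheme
(import (scheme base))

;; Hypothetical helper: encode a non-negative integer below 2^32 as four
;; big-endian bytes, so that byte order matches numeric order.
(define (pack-uint32 n)
  (bytevector (quotient n 16777216)
              (modulo (quotient n 65536) 256)
              (modulo (quotient n 256) 256)
              (modulo n 256)))

;; Lexicographic "less than" on bytevectors: compare byte by byte; a
;; strict prefix sorts before any of its extensions.
(define (bytevector<? a b)
  (let loop ((i 0))
    (cond
     ((= i (bytevector-length b)) #f)   ; b exhausted: a is not smaller
     ((= i (bytevector-length a)) #t)   ; a is a strict prefix of b
     ((< (bytevector-u8-ref a i) (bytevector-u8-ref b i)) #t)
     ((> (bytevector-u8-ref a i) (bytevector-u8-ref b i)) #f)
     (else (loop (+ i 1))))))

;; #t when BV starts with all the bytes of PREFIX; keys of one subspace
;; share such a prefix and are therefore stored next to each other.
(define (bytevector-prefix? prefix bv)
  (and (<= (bytevector-length prefix) (bytevector-length bv))
       (let loop ((i 0))
         (or (= i (bytevector-length prefix))
             (and (= (bytevector-u8-ref prefix i) (bytevector-u8-ref bv i))
                  (loop (+ i 1)))))))

(bytevector<? (pack-uint32 255) (pack-uint32 256))   ; => #t
(bytevector-prefix? (bytevector 0 0) (pack-uint32 7)) ; => #t
```

A variable-width encoding (for example, the shortest byte string for each integer) would break the first property, which is why packing schemes for ordered stores fix the width or prepend a length or type tag.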
  5. Ordered Mapping of Scheme objects
    Key                         Value
    (news columns)              (title year author)
    (news row 1)                ("Ordered Key-Value Store" 2019 "AB")
    (news row 2)                ("Scheme Workshop 20th anniversary" 2019 "TG")
    (news index year 2019 1)    ()
    (news index year 2019 2)    ()
    (tags columns)              (news-pk tag)
    (tags row 1)                (1 "scheme")
    (tags row 2)                (1 "okvs")
    (tags row 3)                (2 "scheme")
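To show how the index rows with empty values are meant to be used, here is a small in-memory sketch in R7RS Scheme. It keeps the news rows of the table in an association list and scans it linearly, whereas a real okvs would store packed keys in sorted order and answer the same question with a prefix range scan. `prefix-range` is a hypothetical stand-in for `okvs-prefix-range`, and `list-prefix?` and `news-by-year` are names invented for this example.

```scheme
(import (scheme base))

;; The news rows of the table as an association list of (key . value).
(define store
  '(((news columns) . (title year author))
    ((news row 1) . ("Ordered Key-Value Store" 2019 "AB"))
    ((news row 2) . ("Scheme Workshop 20th anniversary" 2019 "TG"))
    ((news index year 2019 1) . ())
    ((news index year 2019 2) . ())))

;; #t when KEY starts with the elements of PREFIX, in order.
(define (list-prefix? prefix key)
  (cond
   ((null? prefix) #t)
   ((null? key) #f)
   ((equal? (car prefix) (car key))
    (list-prefix? (cdr prefix) (cdr key)))
   (else #f)))

;; Hypothetical stand-in for okvs-prefix-range: keep the matching entries.
(define (prefix-range store prefix)
  (let loop ((entries store) (out '()))
    (cond
     ((null? entries) (reverse out))
     ((list-prefix? prefix (car (car entries)))
      (loop (cdr entries) (cons (car entries) out)))
     (else (loop (cdr entries) out)))))

;; News of a given year: scan the index, read the primary key stored as
;; the last element of each index key, then fetch the row itself.
(define (news-by-year store year)
  (map (lambda (entry)
         (let ((pk (list-ref (car entry) 4)))
           (cdr (assoc (list 'news 'row pk) store))))
       (prefix-range store (list 'news 'index 'year year))))

(news-by-year store 2019)
;; => (("Ordered Key-Value Store" 2019 "AB")
;;     ("Scheme Workshop 20th anniversary" 2019 "TG"))
```

The index entries carry all their information in the key, which is why their value is the empty list.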
  6. SRFI-168: Generic Tuple Store (nstore)
    1. Set of fixed-length lists, called tuples or chunks
    2. Generalisation of triple store
    3. Better at representing lists than a property graph (graphdb)
    4. Query language is pattern matching!
    5. Big indexing factor: $\binom{n}{\lfloor n/2 \rfloor}$!
    6. See https://math.stackexchange.com/q/3146568/23663
    7. Schema-on-read or Schema-on-write (pre-write validation)
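Assuming the indexing factor on this slide is the binomial coefficient "n choose ⌊n/2⌋" (the number of key orderings an n-tuple store keeps so that any pattern can be answered with a prefix scan), the growth can be checked with a few lines of R7RS Scheme. `binomial` and `index-count` are names made up for this sketch.

```scheme
(import (scheme base))

;; n choose k, computed incrementally so every intermediate value is an
;; exact integer: after step i the accumulator equals C(n-k+i, i).
(define (binomial n k)
  (let loop ((i 1) (acc 1))
    (if (> i k)
        acc
        (loop (+ i 1) (quotient (* acc (+ (- n k) i)) i)))))

;; Number of orderings for tuples of length n, under the assumption above.
(define (index-count n)
  (binomial n (quotient n 2)))

(map index-count '(3 4 5 6))   ; => (3 6 10 20)
```

Under that assumption every tuple is written once per ordering, so a triple store writes each triple three times and a 6-tuple store twenty times.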
  7. Versioned Generic Tuple Store (vnstore)
    1. Versioned Generic Tuple Store
    2. Same interface as nstore + branch management
    3. Built on top of nstore:
      i. metadata: 4-tuple store
      ii. snapshot: n-tuple store
      iii. change: n+3 tuple store
    4. Requires a lot of disk space: for n=3, around 21 times the size of the data
    5. Applications:
      i. Wikidata = Wikibase + triple store
      ii. Versioning data with code
      iii. Any workflow process that requires peer validation
    6. Change mechanism for any structured data (pull / merge requests)
  8. nstore query (1)
    data                             pattern                                        bindings
    uid     key    value             uid                key    value
    P4X432  group  oasis             (nstore-var 'uid)  group  oasis                ((uid . P4X432))
    P4X432  album  Definitely Maybe                                                 ((uid . O44413))
    O44413  group  oasis
    O44413  album  Morning Glory
    Table 1. Albums by Oasis

    (generator-map->list hashmap->alist
      (nstore-select txn nstore
        (list (nstore-var 'uid) 'group "oasis")))
  9. nstore query (2)
    data                             pattern                                        bindings
    uid     key    value             uid                key    value
    P4X432  group  oasis             (nstore-var 'uid)  album  Definitely Maybe
    P4X432  album  Definitely Maybe  (nstore-var 'uid)  group  (nstore-var 'group)
    O44413  group  oasis
    O44413  album  Morning Glory
    Table 2. Group that authored the Definitely Maybe album

    (generator-map->list (lambda (x) (hashmap-ref x 'group))
      (nstore-query
        (nstore-select txn nstore
          (list (nstore-var 'uid) 'album "Definitely Maybe"))
        (nstore-where txn nstore
          (list (nstore-var 'uid) 'group (nstore-var 'group)))))
  10.
    data                             pattern                                        bindings
    uid     key    value             uid                key    value
    P4X432  group  oasis             (nstore-var 'uid)  album  Definitely Maybe     ((uid . P4X432))
    P4X432  album  Definitely Maybe  (nstore-var 'uid)  group  (nstore-var 'group)
    O44413  group  oasis
    O44413  album  Morning Glory
    Table 3. Group that authored the Definitely Maybe album

    (generator-map->list (lambda (x) (hashmap-ref x 'group))
      (nstore-query
        (nstore-select txn nstore
          (list (nstore-var 'uid) 'album "Definitely Maybe"))
        (nstore-where txn nstore
          (list (nstore-var 'uid) 'group (nstore-var 'group)))))
  11.
    data                             pattern                                        bindings
    uid     key    value             uid                key    value
    P4X432  group  oasis             P4X432             album  Definitely Maybe     ((uid . P4X432)
    P4X432  album  Definitely Maybe  P4X432             group  (nstore-var 'group)   (group . oasis))
    O44413  group  oasis
    O44413  album  Morning Glory
    Table 4. Group that authored the Definitely Maybe album

    (generator-map->list (lambda (x) (hashmap-ref x 'group))
      (nstore-query
        (nstore-select txn nstore
          (list (nstore-var 'uid) 'album "Definitely Maybe"))
        (nstore-where txn nstore
          (list (nstore-var 'uid) 'group (nstore-var 'group)))))
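The step-by-step binding of variables shown in the query slides can be reproduced with a small in-memory matcher in R7RS Scheme. This is only a sketch of the idea, not how SRFI-168 is implemented: it scans a plain list of tuples instead of okvs indices, uses association lists instead of hashmaps and generators, and every name in it (`var`, `match-tuple`, `select-bindings`, `where-bindings`) is invented for the example.

```scheme
(import (scheme base))

;; A pattern variable is modelled as a tagged pair; it stands in for
;; whatever nstore-var returns in the real library.
(define (var name) (cons 'var name))
(define (var? x) (and (pair? x) (eq? (car x) 'var)))
(define (var-name x) (cdr x))

;; Match PATTERN against TUPLE, extending BINDINGS (an alist).
;; Returns the extended alist, or #f when the tuple does not match.
(define (match-tuple pattern tuple bindings)
  (cond
   ((and (null? pattern) (null? tuple)) bindings)
   ((or (null? pattern) (null? tuple)) #f)
   ((var? (car pattern))
    (let ((bound (assq (var-name (car pattern)) bindings)))
      (cond
       ((not bound)                       ; unbound variable: bind it
        (match-tuple (cdr pattern) (cdr tuple)
                     (cons (cons (var-name (car pattern)) (car tuple))
                           bindings)))
       ((equal? (cdr bound) (car tuple))  ; bound variable: must agree
        (match-tuple (cdr pattern) (cdr tuple) bindings))
       (else #f))))
   ((equal? (car pattern) (car tuple))    ; constant: must be equal
    (match-tuple (cdr pattern) (cdr tuple) bindings))
   (else #f)))

;; First step: every way PATTERN matches a tuple, starting from BINDINGS.
(define (select-bindings tuples pattern bindings)
  (let loop ((tuples tuples) (out '()))
    (cond
     ((null? tuples) (reverse out))
     ((match-tuple pattern (car tuples) bindings)
      => (lambda (b) (loop (cdr tuples) (cons b out))))
     (else (loop (cdr tuples) out)))))

;; Following steps: refine each set of bindings with one more pattern.
(define (where-bindings tuples pattern bindings-list)
  (apply append
         (map (lambda (bindings) (select-bindings tuples pattern bindings))
              bindings-list)))

(define albums
  '((P4X432 group "oasis")
    (P4X432 album "Definitely Maybe")
    (O44413 group "oasis")
    (O44413 album "Morning Glory")))

;; Group that authored "Definitely Maybe", as in the tables above.
(define result
  (where-bindings albums (list (var 'uid) 'group (var 'group))
                  (select-bindings albums
                                   (list (var 'uid) 'album "Definitely Maybe")
                                   '())))

(map (lambda (b) (cdr (assq 'group b))) result)   ; => ("oasis")
```

The first pattern binds `uid` to `P4X432`; the second pattern is then matched with that binding already in place, which is what the substituted pattern column of the last table shows.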
  12. Forward
    1. More abstractions on top of okvs:
      i. row store
      ii. EAVT
      iii. document store
      iv. full-text search
      v. ranked set for practical leaderboards
      vi. space-filling curves for practical spatial queries
    2. More tooling on top of nstore and vnstore:
      i. Querying: SPARQL, Datalog, microKanren
      ii. Inference and reasoner engine
      iii. Schema migration
  13. Beyond
    1. How to make it easier to work with bigger-than-memory data? ⇒ ACID transactions...
    2. Experiment with incremental computation e.g. miniAdapton:
      i. Spreadsheet-like user experience
      ii. Connect several abstractions together
      iii. Observer pattern / Ad-hoc indices / Materialized views
      iv. ⇒ Less code, more performance
    3. Domain-Driven Design: high-level representation interpreted then compiled to:
      i. Schema
      ii. Validation
      iii. Migration
      iv. Queries
      v. ???
  14. Thanks!