Search over encrypted records: from academic dreams to production-ready tool

Search over encrypted data is a modern cryptographic engineering problem. We will talk about existing approaches (both well-known and recent) and concentrate on a practical solution, based on the blind index technique, for searching data in databases. What’s inside: cryptographic and functional schemes, implementation details, and practical security evaluation (risk modelling and potential attacks). We will show how theoretical models turn into real, usable, maintainable security tools.

Artem Storozhuk

May 17, 2019
Transcript

  1. Search over encrypted records: from academic dreams to production-ready tool

  2. Artem Storozhuk, Security Software Engineer at Cossack Labs, dev@cossacklabs.com

  3. Database as a Service (DBaaS)

  4. DBaaS security drawbacks: non-sensitive data, sensitive data

  5. DBaaS security drawbacks: non-sensitive data, sensitive data. 1. Untrusted DBA. 2. Hacker with root access. 3. Change of storage provider ownership.

  6. Encryption is a solution 1. Whole database encryption

  7. Encryption is a solution 1. Whole database encryption 2. Searchable encryption

  8. Searchable encryption techniques

  9. Searchable encryption techniques: SWP, Goh, CM-I, CM-II, CGK+-I, CGK+-II, ABO, LSD-I, LSD-II, CK, KO, KPR, KP, GSW-I, GSW-II, BKM, RT, WWP-III, CJJ+, PKL+, ABC+, SSW, LWW+, BTH+, KIK, BC, RVB+, YLW, BCO+, ABC++, BSS-I, CS, Khader, BSS-II, RPS+-I, TC, ZI, RPS+-II, INH+, PKL, PCL, HL, BW, SBC+, BCK, BBO, DRD-I, DRD-II, BDD+, HLm, WWP-I, WWP-II, WWP-IIIm, WWP-IV.

  10. Index-based searchable encryption. I - secure index (a pointer to the encrypted message); T - trapdoor (allows the server to identify the encrypted message without revealing its plaintext).
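
The roles of I and T on the slide above can be illustrated with a minimal sketch. It assumes a keyed HMAC-SHA256 token per keyword; the function names (`make_token`, `build_secure_index`, `server_search`) and the sample records are hypothetical and not taken from the talk. Real schemes differ in how the index is built and protected.

```python
import hashlib
import hmac
import secrets


def make_token(key: bytes, keyword: str) -> bytes:
    """Derive a deterministic keyed token for a keyword (HMAC-SHA256)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def build_secure_index(key: bytes, records: dict) -> dict:
    """Client side: build I, a map from keyword token to record ids."""
    index = {}
    for record_id, keywords in records.items():
        for keyword in keywords:
            index.setdefault(make_token(key, keyword), set()).add(record_id)
    return index


def server_search(index: dict, trapdoor: bytes) -> set:
    """Server side: match the trapdoor T against I without seeing the keyword."""
    return index.get(trapdoor, set())


# Hypothetical data: record id -> keywords (records themselves would be stored encrypted).
key = secrets.token_bytes(32)
records = {1: ["alice", "kyiv"], 2: ["bob", "kyiv"], 3: ["alice", "lviv"]}

index = build_secure_index(key, records)   # I, uploaded to the server
trapdoor = make_token(key, "kyiv")         # T, sent with the query
matches = server_search(index, trapdoor)
```

The server only ever handles tokens, but it still learns which records match and when the same query repeats, which is the kind of leakage discussed on the following slides.
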
  11. Searchable encryption tradeoff: SECURITY (ability to resist cryptanalytic attacks), EFFICIENCY (query latency), QUERY EXPRESSIVENESS (equality, conjunction, comparison, subset, range, wildcard), ARCHITECTURE (outsourcing / sharing).

  12. Searchable encryption security. Information about objects that may be leaked: 1) Order 2) Equalities 3) Predicates 4) Identifiers 5) Structure

  13. Searchable encryption security. Information about objects that may be leaked: 1) Order 2) Equalities 3) Predicates 4) Identifiers 5) Structure. Groups of leakage: 1) Secure index metadata 2) Search pattern 3) Access pattern

  14. Searchable encryption security. Model of untrusted storage provider: 1) Honest-but-curious 2) Malicious. Information about objects that may be leaked: 1) Order 2) Equalities 3) Predicates 4) Identifiers 5) Structure. Groups of leakage: 1) Secure index metadata 2) Search pattern 3) Access pattern

  15. Searchable encryption security. Model of untrusted storage provider: 1) Honest-but-curious 2) Malicious. Information about objects that may be leaked: 1) Order 2) Equalities 3) Predicates 4) Identifiers 5) Structure. Groups of leakage: 1) Secure index metadata 2) Search pattern 3) Access pattern. Strongest security definition (Curtmola et al. 2006) [schemes exist only in theory]: nothing should be leaked. Full security definition (Shen et al. 2009) [schemes have implementations but are inefficient in production]: nothing should be leaked except the access pattern.

  16. Leakage inference attacks

  17. Leakage inference attacks. Count Attack: 40% keyword recovery rate when 80% of the dataset is known to the attacker. Works well if the keyword universe size is at most 5000.

  18. Leakage inference attacks. Count Attack: 40% keyword recovery rate when 80% of the dataset is known to the attacker. Works well if the keyword universe size is at most 5000. Hierarchical-Search Attack: an extension of the Count Attack, 40% keyword recovery rate under the condition that at least 40% of the data leaks. The attacker could inject a set of constructed records.
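
The intuition behind count-based inference can be shown with a simplified sketch. It covers only the first step (matching result-set sizes to keywords whose document count is unique) and assumes the attacker knows the whole dataset; attacks of this family are usually described as also using co-occurrence counts to resolve ambiguous cases, which is omitted here. All names and data below are hypothetical.

```python
from collections import Counter


def keyword_counts(known_docs: dict) -> Counter:
    """Count, per keyword, how many known documents contain it."""
    counts = Counter()
    for keywords in known_docs.values():
        counts.update(set(keywords))
    return counts


def count_attack(known_docs: dict, observed_result_sizes: dict) -> dict:
    """Map opaque query tokens to keywords whose document count is unique."""
    by_count = {}
    for keyword, count in keyword_counts(known_docs).items():
        by_count.setdefault(count, []).append(keyword)

    recovered = {}
    for token, size in observed_result_sizes.items():
        candidates = by_count.get(size, [])
        if len(candidates) == 1:  # only an unambiguous count identifies a keyword
            recovered[token] = candidates[0]
    return recovered


# Hypothetical plaintext knowledge: document id -> keywords.
known_docs = {
    "d1": ["invoice", "urgent", "alice"],
    "d2": ["invoice", "alice"],
    "d3": ["invoice", "bob"],
    "d4": ["report", "bob"],
}
# Hypothetical leakage: number of records returned for each opaque query token.
observed = {"t1": 3, "t2": 2, "t3": 1}

recovered = count_attack(known_docs, observed)
```

Here only `t1` is recovered, because "invoice" is the only keyword that appears in exactly three documents; the tokens with result sizes 2 and 1 stay ambiguous in this simplified version.
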
  19. How did we select an SE scheme? 1. open source 2. strong & proven 3. fast & reliable 4. without security design flaws

  20. Available SE solutions
      CryptDB [2011]:
      - https://css.csail.mit.edu/cryptdb/
      - http://people.csail.mit.edu/nickolai/papers/raluca-cryptdb.pdf
      - https://eprint.iacr.org/2015/979.pdf
      - https://github.com/CryptDB/cryptdb
      Mylar [2013]:
      - https://css.csail.mit.edu/mylar/
      - https://css.csail.mit.edu/mylar/mylar.pdf
      - https://github.com/strikeout/mylar
      CipherSweet [2018]:
      - https://paragonie.com/blog/2019/01/ciphersweet-searchable-encryption-doesn-t-have-be-bitter
      - https://github.com/paragonie/ciphersweet

  21. CryptDB

  22. CryptDB (onion cryptography). Strengths: query expressiveness, efficiency. Weakness: security.

  23. Mylar

  24. CipherSweet

  25. CipherSweet
      1) INSERT: INSERT INTO test_table(IndexFieldA, FieldA, FieldB) VALUES (MAC(dataA), Encrypt(dataA), dataB)
      2) SELECT: rows = SELECT FieldA, FieldB FROM test_table WHERE IndexFieldA = MAC(dataA); Decrypt(rows.FieldA)
      Table layout: IndexFieldA | FieldA | FieldB, with rows of the form MAC | ENCRYPTED | dataB.
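
The INSERT/SELECT flow on the slide above can be tried out with a small sketch using SQLite and HMAC-SHA256 as the MAC. This is not CipherSweet's API, and its real key derivation and index function may differ; the key, the helper names (`blind_index`, `insert_row`, `select_rows`) and the placeholder ciphertext bytes are assumptions for illustration. Encryption itself is out of scope here, so the ciphertext is passed in as opaque bytes.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical index key; a real deployment would load it from key storage.
INDEX_KEY = b"demo-index-key-not-for-production"


def blind_index(value: str) -> bytes:
    """MAC(dataA): keyed, deterministic, and not reversible without the key."""
    return hmac.new(INDEX_KEY, value.encode("utf-8"), hashlib.sha256).digest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (IndexFieldA BLOB, FieldA BLOB, FieldB TEXT)")
conn.execute("CREATE INDEX idx_a ON test_table (IndexFieldA)")


def insert_row(conn, data_a: str, ciphertext_a: bytes, data_b: str) -> None:
    """Store the blind index next to the ciphertext of the same value."""
    conn.execute(
        "INSERT INTO test_table (IndexFieldA, FieldA, FieldB) VALUES (?, ?, ?)",
        (blind_index(data_a), ciphertext_a, data_b),
    )


def select_rows(conn, data_a: str) -> list:
    """Equality search: recompute the MAC and compare it in the database."""
    cur = conn.execute(
        "SELECT FieldA, FieldB FROM test_table WHERE IndexFieldA = ?",
        (blind_index(data_a),),
    )
    return cur.fetchall()


# Placeholder bytes stand in for Encrypt(dataA).
insert_row(conn, "alice@example.com", b"<ciphertext-1>", "row one")
insert_row(conn, "bob@example.com", b"<ciphertext-2>", "row two")

rows = select_rows(conn, "alice@example.com")
```

The application would then decrypt `FieldA` of each returned row; the database only compares MAC values and never sees `dataA`.
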
  26. CipherSweet. Two table layouts (IndexFieldA | FieldA | FieldB): rows of the form MAC [<32] | ENCRYPTED | dataB, and rows of the form MAC [32] | ENCRYPTED | dataB.

  27. CipherSweet. MAC length <==> Probability of index collision <==> Probability of “false positives” in SELECT response
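
The relation on the slide above can be made concrete with a rough estimate. Assuming the MAC output behaves like uniformly random bits and the index is truncated to b bits, each of the other n - 1 distinct stored values collides with a queried value with probability 2^-b, so the expected number of false-positive values per query is about (n - 1) / 2^b. The function name and the sample sizes below are hypothetical.

```python
def expected_false_positives(distinct_values: int, index_bits: int) -> float:
    """Expected number of other distinct values sharing the queried value's index.

    Assumes index values are uniformly distributed over 2**index_bits outputs.
    """
    if distinct_values < 1 or index_bits < 0:
        raise ValueError("distinct_values must be >= 1 and index_bits >= 0")
    return (distinct_values - 1) / 2 ** index_bits


# Compare a few truncation lengths for a table with about one million distinct values.
for bits in (8, 16, 32, 256):
    estimate = expected_false_positives(1_000_001, bits)
    print(f"{bits:>3}-bit index: ~{estimate:.6g} false positives per query")
```

A shorter index returns more false-positive rows, which the application has to filter out after decryption, while a longer index makes every distinct plaintext map to an effectively unique value.
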
  28. CipherSweet. MAC length <==> Probability of index collision <==> Probability of “false positives” in SELECT response. Diagram: Application and Database, showing one table (FieldA | FieldB) with rows ENCRYPTED | ... and one table (FieldA | FieldB) with rows 0x0123456 | ... and 0x0125676 | ...

  29. CipherSweet. Diagram: Application and Database, showing one table (FieldA | FieldB) with rows ENCRYPTED | ... and one table (FieldA | FieldB) with rows 0x0123456 | ... and 0x0125676 | ... Query: select * from test_table where FieldA=0x0123456

  30. github.com/cossacklabs/acra www.cossacklabs.com/acra/

  31. Acra – database encryption proxy. AcraSE
      - Data encryption (separate keys per app, per user)
      - Authentication (transport, access control list for applications compartmentalization)
      - Query policy (a separate SQL firewall module)
      - Intrusion detection (poison records)
      - Key management (key rotation utility)
      - Monitoring and observability (logging, metrics, tracing)

  32. AcraSE cryptographic design

  33. AcraSE cryptographic design
                                          Application   AcraServer   Database
      Able to encrypt Data                +/-           +            -
      Able to decrypt Data                -             +            -
      Able to calculate Secure Index      -             +            -

  34. AcraSE cryptographic design
      INSERT query, transparent mode:
        insert into test_table(A, B) values (<plaintext>, <plaintext>)
      changed to
        insert into test_table(A, B) values (<mac><ciphertext>, <mac><ciphertext>)
      INSERT query, standard mode:
        insert into test_table(A, B) values (<ciphertext>, <ciphertext>)
      changed to
        insert into test_table(A, B) values (<mac><ciphertext>, <mac><ciphertext>)

  35. AcraSE cryptographic design
      SELECT query:
        select * from test_table where A=<plaintext>
      changed to
        select * from test_table where substring("A" from 1 for MAC_BYTE_LEN)=<mac>
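
The query rewriting on the two slides above stores the MAC as a prefix of the ciphertext and searches by that prefix. A minimal sketch of the idea follows; it is not Acra's implementation. It assumes HMAC-SHA256 as the MAC, uses SQLite's `substr` in place of the PostgreSQL-style `substring(... from ... for ...)` shown on the slide, and passes ciphertexts in as placeholder bytes. The key and the `proxy_*` helper names are hypothetical.

```python
import hashlib
import hmac
import sqlite3

MAC_BYTE_LEN = 32  # HMAC-SHA256 output length in bytes
INDEX_KEY = b"demo-index-key-not-for-production"  # hypothetical key


def secure_index(plaintext: bytes) -> bytes:
    """Compute the MAC that is prepended to the ciphertext."""
    return hmac.new(INDEX_KEY, plaintext, hashlib.sha256).digest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (A BLOB, B BLOB)")


def proxy_insert(conn, a_plain: bytes, a_cipher: bytes,
                 b_plain: bytes, b_cipher: bytes) -> None:
    """Rewrite the inserted values to <mac><ciphertext> before they reach the database."""
    conn.execute(
        "INSERT INTO test_table (A, B) VALUES (?, ?)",
        (secure_index(a_plain) + a_cipher, secure_index(b_plain) + b_cipher),
    )


def proxy_select(conn, a_plain: bytes) -> list:
    """Rewrite `A = <plaintext>` into a comparison of the MAC prefix of A."""
    cur = conn.execute(
        "SELECT A, B FROM test_table WHERE substr(A, 1, ?) = ?",
        (MAC_BYTE_LEN, secure_index(a_plain)),
    )
    # Strip the MAC prefix; what remains is the ciphertext to be decrypted.
    return [(a[MAC_BYTE_LEN:], b[MAC_BYTE_LEN:]) for a, b in cur.fetchall()]


# Placeholder bytes stand in for real ciphertexts.
proxy_insert(conn, b"alice", b"<ct-a1>", b"kyiv", b"<ct-b1>")
proxy_insert(conn, b"bob", b"<ct-a2>", b"lviv", b"<ct-b2>")

found = proxy_select(conn, b"alice")
```

Keeping the MAC inside the same column means the table schema does not need a separate index column, at the cost of a substring expression in every search query.
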
  36. AcraSE configuration: main configuration (YAML), encryption configuration

  37. AcraSE proxy design benefit

  38. Future work: 1) Secure Index truncation and false positives filtering. 2) Performance evaluation. 3) Extension of query expressiveness. 4) Data entropy learning. github.com/cossacklabs/acra

  39. Conclusions: 1) Searchable encryption is modern and not completely stable. 2) There is a lack of existing SQL solutions. 3) The secure (blind) indexing approach is one of the reliable techniques for building secure SE schemes.

  40. Reading list
      - http://cs.brown.edu/~seny/
      - https://www.usenix.org/system/files/conference/osdi16/osdi16-papadimitriou.pdf
      - https://subs.emis.de/LNI/Proceedings/Proceedings228/115.pd
      - https://inst.eecs.berkeley.edu/~cs261/fa17/scribe/08_28_encdata.pdf

  41. Thank you! Any questions? Artem Storozhuk dev@cossacklabs.com