
Data is a new security boundary

vixentael
November 15, 2021

We will discuss how companies use cryptography as an ultimate security control for data. When data is properly encrypted, it can’t be suddenly, unnoticeably decrypted.

End-to-end encryption flow for the NoCode platform? Sure. DRM-like protection with application-level encryption using an HPKE-like approach for protecting ML models? Yes. End-to-end encrypted message exchange for a CRDT-based real-time syncing app? Yep.

But cryptography requires a set of supporting security controls: API protection, anti-fraud scoring system, mobile device attestation, root/jailbreak detection, authN-authZ, audit logging, and so on.

Let’s talk about how “strong cryptography” becomes “real-world security boundary around sensitive data” and what it takes in different contexts.





  1. Data is a new security boundary @vixentael

  2. @vixentael Head of customer solutions, Security software engineer at Cossack

    Labs. I’m focused on data security and applied cryptography, building e2ee protocols, and security controls around crypto. Core maintainer of Themis cryptolib. cossacklabs.com
  3. @vixentael Things we won’t talk about Exact ciphers, symmetric vs

    asymmetric encryption. TLS. Typical cryptographic mistakes developers make. Privacy, and evil corporations. Recent data incidents and breaches. FUD. What library / tool to use to encrypt your data the best way.
  4. @vixentael What we will talk about Data security 101: encryption,

    OWASP, regulations. Cases: observations of real apps that combine encryption with supporting security controls. Encryption ways: ALE, FLE, E2EE, ZKA/ZT.
  5. Data security 101

  6. @vixentael Your app’s data is everywhere. (apps, public clouds, databases,

    backups, 3rd-party APIs, analytics, etc). No perimeters, no trusted zones anymore.
  7. @vixentael

  8. @vixentael Your app’s data is everywhere. (apps, public clouds, databases,

    backups, 3rd-party APIs, analytics, etc). No perimeters, no trusted zones anymore. Data security measures become the security boundary for data. It's not about "protect the data where it's stored". It’s “protect the data whenever it exists”.
  9. @vixentael Data security depends on a data flow: gathering,

    secure processing, secure storage and backups, secure disclosure, data removal, data migration. [Diagram: Gathering, Processing, Output; logging, analytics.]
  10. @vixentael Data security depends on a data flow: gathering,

    secure processing, secure storage and backups, secure disclosure, data removal, data migration. [Diagram: Gathering, Processing, Output; logging, analytics. Risks marked on the flow: leakage, loss, disclosure; never removed, disclosure, loss; gathering without consent.]
  11. @vixentael Data security 101 1. Identify sensitive data, understand sensitive

    data lifecycle, classify data. 2. Identify risks to data. 3. Build trust models, understand risk impact. 4. Prioritize threat vectors. 5. Select and implement proper security controls for exploitable high risk vectors (to prevent risks and to identify leaks).
  13. Encryption

  14. @vixentael Encryption is an ultimate data security measure 1. When

    data is properly encrypted, it can’t be suddenly, unnoticeably decrypted. 2. Even if leaked, data is encrypted. Encryption is the best access control. 3. Protection against insiders & outsiders. 4. Properly configured encryption tolerates mistakes in other security controls. 5. Regulations, compliance.
  15. @vixentael Regulations, compliance media.defense.gov/2018/Apr/22/2001906836/-1/-1/0/DEFENSEINNOVATIONBOARD_TEN_COMMANDMENTS_OF_SOFTWARE_2018.04.20.PDF

  16. @vixentael Regulations, compliance gdpr-info.eu/issues/encryption/ GDPR art 32/35: responsibly store and

    process data according to risks. GDPR art 33/34: detect data leakage and alert users & controller.
  17. @vixentael Regulations, compliance cossacklabs.com/blog/what-we-need-to-encrypt-cheatsheet.html also, HIPAA, FISMA, FIPS, FedRAMP, CCPA,

    PCI DSS, FERPA, and many more
  18. @vixentael OWASP Top10 2021 A01:2021-Broken Access Control. A02:2021-Cryptographic Failures. A03:2021-Injection.

    A04:2021-Insecure Design. A05:2021-Security Misconfiguration. A06:2021-Vulnerable and Outdated Components. A07:2021-Identification and Authentication Failures. A08:2021-Software and Data Integrity Failures. A09:2021-Security Logging and Monitoring Failures. A10:2021-Server-Side Request Forgery. owasp.org/Top10/
  19. @vixentael A02:2021-Cryptographic Failures. owasp.org/Top10/A02_2021-Cryptographic_Failures/ Focused mostly on crypto usage and

    implementation. Bad ciphers: old and insecure? Wrong AES modes? AES-CBC instead of AES-GCM? Asymmetric encryption where symmetric should be used? Bad keys: short, low-entropy keys? Math.random? User password used as an encryption key without a proper KDF? Bad KDF choice / params? Unsuitable choice of crypto primitives: MD5 instead of Argon2id, AES-OFB instead of GCM, SHA-256 instead of HMAC-SHA256. Home-brewed crypto.
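
Several of the pitfalls on this slide have direct standard-library counterparts. The following is a minimal Python sketch, not taken from the talk: the function names and the scrypt cost parameters are illustrative assumptions, and scrypt stands in for a memory-hard KDF such as Argon2id, which the Python standard library does not provide.

```python
import hashlib
import hmac
import secrets


def generate_key(length: int = 32) -> bytes:
    """Random key from the OS CSPRNG, not from a general-purpose PRNG."""
    return secrets.token_bytes(length)


def derive_key_from_password(password: str, salt: bytes) -> bytes:
    """Stretch a password into a 32-byte key with scrypt.

    The cost parameters are assumptions for illustration; tune them
    for the hardware the derivation actually runs on.
    """
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )


def authenticate(key: bytes, message: bytes) -> bytes:
    """Keyed integrity tag: HMAC-SHA256 rather than a bare SHA-256 hash."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Compare tags in constant time."""
    return hmac.compare_digest(authenticate(key, message), tag)
```

The salt would normally be random per password (for example `secrets.token_bytes(16)`) and stored next to the derived material.
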
  20. @vixentael A04:2021-Insecure Design. owasp.org/Top10/A04_2021-Insecure_Design/ Focused on design, missing or wrong

    security controls. Bad key management: storing encryption keys in plaintext together with data. Lack of rotation, revocation, expiration? Components are trusted when they shouldn’t be? One encryption key for everything? Lack of PKI? Lack of encryption for sensitive assets. Home-brewed encryption protocols. Encryption is not supported by authN, access control, logging & monitoring.
  21. @vixentael OWASP ASVS github.com/OWASP/ASVS V6: Stored Cryptography, V8: Data Protection,

    V9: Communications
  22. @vixentael OWASP MASVS github.com/OWASP/owasp-masvs/ V3: Cryptography, V2: Data Storage, V5:

    Network Communication
  23. @vixentael *ASVS

  24. Encryption models and ways

  25. @vixentael Encryption Data stored encrypted locally – data-at-rest encryption; also

    FS/OS encryption, database encryption. Transport layer encryption – data-in-transit encryption (TLS, IPSec, SSH). [Diagram labels: host OS / server app.]
  26. @vixentael Application-level encryption (ALE) Encryption process happening within application context,

    triggered by an application. ALE could work together with data-at-rest encryption and data-in-transit encryption. ALE could be client-side, server-side, end-to-end, etc. infoq.com/articles/ale-software-architects/ Encryption is easy, key management is hard.
  27. @vixentael TLS (in transit) vs. application-level encryption

    [Diagram: Alice, Bob, and Carol connected through server 1, server 2, and server 3, shown once for TLS (in transit) and once for application-level encryption, with the encrypted parts marked.] infoq.com/articles/ale-software-architects/
  28. @vixentael Application-level encryption Data encrypted by any app –

    application-level encryption (ALE). ALE happens on a client side – client-side encryption. ALE happens on a server side – server-side encryption. ALE happens on a proxy – proxy-side encryption. infoq.com/articles/ale-software-architects/
  29. @vixentael Field-level encryption Only some data fields are encrypted

    – field-level encryption (FLE).
    {
      "name": base64_str(encrypted_name),
      "phone": base64_str(encrypted_phone),
      "passport": base64_str(encrypted_passport),
      "ID": user_ID,
      "last_activity_date": timestamp,
      ...
    }
  30. @vixentael End-to-end encryption App-side encryption when no keys/secrets/data

    are available to the intermediate infrastructure between Alice and Bob – end-to-end encryption. speakerdeck.com/vixentael/e2ee-equals-security-equals-privacy Encryption should work on all selected platforms. Key management is tricky – backend should work only as a key discovery service without access to private/secret keys. Complicated to design, easy to maintain, hard to debug.
  31. @vixentael TLS and ALE have different threat models, it’s unfair

    to compare them, but we will :)
  32. @vixentael Encryption controls vs. events

    | event \ encryption control | transit (TLS) | disk / FS | TDE / DB encryption | ALE | E2EE |
    |---|---|---|---|---|---|
    | physical access to servers | ⛔ | ✅ | ✅ | ✅ | ✅ |
    | MitM | ✅ | ⛔ | ⛔ | ✅ | ✅ |
    | privileged DB access | ⛔ | ⛔ | Depends | ✅ | ✅ |
    | privileged system access | ⛔ | ⛔ | ⛔ | Depends | ✅ |
    | backups, logs, snapshots | ⛔ | ⛔ | Depends | ✅ | ✅ |

    infoq.com/articles/ale-software-architects/
  33. @vixentael If E2EE is so great, why don’t we use

    it everywhere? [Diagram: TLS; FS/OS encr, TDE; custom data-at-rest encryption; ALE; E2EE. Axes: security; efforts, tradeoffs.] Key storage, key rotation, key revocation, data re-encryption, consistency, backups, tying keys w/ identity, search in encrypted data, logging, monitoring, and all the NIST SP 800-57, 800-53.
  34. @vixentael Zero Trust / Zero Trust Architecture – assumes there

    is no implicit trust granted to assets or user accounts based solely on their physical or network location. No asset is inherently trusted. nist.gov/publications/zero-trust-architecture ZT is more about access control and authN than encryption.
  35. @vixentael Zero Knowledge Architecture (ZKA) – system where no one

    has access to unencrypted data, except the user (node, service, person). Also known as “No Knowledge” Systems. Typically built on E2EE + strong authN + privacy-respectful design. See also: ZKP, SMP, PAKE, OPAQUE; FHE, searchable encryption. cossacklabs.com/solutions/e2ee-zero-trust/
  36. @vixentael Searchable encryption Perform queries on encrypted data without decryption.

    Different schemes are possible: SSE, PEKS, blind index, (F)HE, etc. cossacklabs.com/blog/secure-search-over-encrypted-data-acra-se/ eprint.iacr.org/2019/806.pdf Most realistic: keyword search (blind index).
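
A blind index can be pictured as a keyed hash stored next to the ciphertext, so that exact-match queries compare hashes and never plaintext. The sketch below is a generic HMAC-based illustration under stated assumptions, not the scheme of any particular product: the function names, the `email_index` field, the normalization step, and the 16-byte truncation are all hypothetical.

```python
from __future__ import annotations

import hashlib
import hmac


def blind_index(index_key: bytes, value: str, length: int = 16) -> str:
    """Keyed, deterministic fingerprint of a normalized plaintext value."""
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(index_key, normalized, hashlib.sha256).digest()
    return digest[:length].hex()


def find_records(records: list[dict], index_key: bytes, query: str) -> list[dict]:
    """Exact-match search that compares blind indexes, never plaintext."""
    wanted = blind_index(index_key, query)
    return [r for r in records if hmac.compare_digest(r["email_index"], wanted)]
```

In a real record the index would sit beside the encrypted field itself, and the index key would be separate from the data encryption key. Because the index is deterministic, equal values produce equal indexes, which is the leakage this approach trades for searchability.
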
  37. @vixentael *ASVS ALE, E2EE

  38. @vixentael Other exciting crypto terms Privacy-enhancing cryptography: SMPC, PSI, PIR,

    FHE, PAKE, OPAQUE. ZK: ZKP, zk-SNARKs, zk-STARKs, zk-SNORKs :) nist.gov/blogs/cybersecurity-insights/privacy-enhancing-cryptography-complement-differential-privacy Crypto-reinforced guarantees in data structures: blockchain, Merkle-tree. PQC. hackernoon.com/eli5-zero-knowledge-proof-78a276db9eff cossacklabs.com/blog/crypto-signed-audit-logs.html aumasson.jp/data/talks/quantum-poc-2021.pdf blog.cloudflare.com/opaque-oblivious-passwords/ blog.cloudflare.com/the-tls-post-quantum-experiment/
  39. @vixentael SMPC, PIR, zk-SNARKs, PQC

  40. Crypto is more useful when integrated with traditional security controls.

  41. @vixentael data encryption access control, authN transport encryption access logging

    honeypots, SIEMs
  42. @vixentael

  43. AAA WAF honey pots IDS infra mngmt compartmentalization access logging

    jailbans monitoring data firewall SIEM HIDS DAST SAST HSM PKI TPM honey tokens dependency mngmt UEBA IAM TLS TDE @vixentael API protection obfuscation anti-RE csrc.nist.gov/publications/detail/sp/800-53/rev-5/final RTFM
  44. @vixentael Security controls to support crypto 1. Use encryption to

    protect sensitive data globally during the whole data flow. 2. Whatever the attack vector is, there is a defense layer. 3. For the most popular attack vectors, set up as many independent and overlapping defenses as possible. ✅ ✅ ✅
  45. @vixentael *ASVS ALE, E2EE crypto + security controls

  46. Let’s see some real-world cases

  47. Field-level data encryption for SaaS platforms

  48. @vixentael Who we are and what we want Huge B2B

    SaaS platforms. Protect from insiders, provide transparency, detect malicious users. Encrypt data per customer, using different keys, BYOK. Minimize the lifecycle of plaintext data – encrypt as close as possible to the data generation point.
  49. @vixentael Client-side field-level encryption MongoDB

    [Diagram: the application's MongoDB SDK performs encryption / decryption, using a key vault and KMS; it writes and reads records with encrypted fields over TLS; MongoDB stores records with encrypted fields.] docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide
  50. @vixentael Key hierarchy docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide MongoDB: field, encrypted

    by DEK (millions). Key Vault: DEK, encrypted by CMK (dozens). KMS: CMK (1).
  51. @vixentael Pros & Cons docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide Extremely useful when

    you have MongoDB. Client apps share responsibility for en/decryption and key management due to exposure to key material. Deterministic & non-deterministic encryptions available. Support for different encryption keys (DEK).
  52. @vixentael Proxy-side field-level encryption Acra github.com/cossacklabs/acra

    [Diagram: the application talks over TLS to the Acra proxy, which performs encryption / decryption using a key vault and KMS; the proxy writes and reads records with encrypted fields over TLS; the database stores records with encrypted fields.]
  53. @vixentael Key hierarchy Database: field encrypted by DEK, DEK

    encrypted by KEK (millions). Key Vault: KEK, encrypted by CMK (dozens). KMS: CMK (1). github.com/cossacklabs/acra
  54. @vixentael Pros & Cons Neither the database nor the application knows

    that the data is encrypted. Proxy app is responsible for en/decryption. Easy to scale and build DAO-based architectures. Non-deterministic and searchable encryptions available. Support for different encryption keys, BYOK, key rotation, revocation, etc. Easy to armor a single encryption layer with specific controls: DLP, anomaly detection, firewalling, anonymisation, etc. github.com/cossacklabs/acra
  55. @vixentael ALE for NoCode platform API frontend database DAO encryption

    integration API frontend database DAO encryption integration customer’s pod NoCode platform tech logs, analytics ... ...
  56. @vixentael ALE for NoCode platform MongoDB key vault, KMS DAO

    + ALE encryption module API frontend fields are encrypted + TLS fields are plaintext + TLS
  57. @vixentael Crypto + supporting controls 1. Key management, separate key

    per customer (BYOK). 2. Full compartmentalization: customer’s data is located in different DBs, encrypted by different keys, each app uses its own DAO. 3. Full transparency – the platform doesn’t have access to customer’s data. Supporting controls: 1. Logging, monitoring. 2. Alerting on suspicious activity, firewalling. 3. ASVS: API protection, throttling, anti-fraud, access control …
  58. @vixentael ALE for fintech platform Service1 PostgreSQL encryption /

    decryption layer Service2 BI Analytics ServiceN ... DAO1 DAON load balancing ... MySQL key vault, KMS ... fields are encrypted + TLS fields are plaintext + TLS
  59. @vixentael Crypto + supporting controls 1. Key management, separate keys

    per domain. 2. Decryption & anonymisation of data for BI software. 3. Isolation of sensitive data from non-sensitive. Supporting controls: 1. Logging, monitoring. 2. PCI DSS audit logging + crypto-verifiable logging. 3. Alerting on suspicious activity. 4. AppSec measures for DAO.
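
One way to picture crypto-verifiable logging is an HMAC chain, where each entry's tag covers the previous tag, so editing or dropping an earlier entry breaks verification. This is a generic sketch under assumptions, not the scheme used by any specific tool mentioned in the talk; the entry layout and function names are hypothetical.

```python
from __future__ import annotations

import hashlib
import hmac
import json

GENESIS_TAG = "0" * 64  # assumed starting value for the chain


def _entry_tag(key: bytes, prev_tag: str, event: dict) -> str:
    """HMAC-SHA256 over the previous tag plus the canonical JSON of the event."""
    payload = json.dumps(event, sort_keys=True)
    return hmac.new(key, (prev_tag + payload).encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(log: list[dict], key: bytes, event: dict) -> None:
    """Append an event, chaining its tag to the last entry's tag."""
    prev_tag = log[-1]["tag"] if log else GENESIS_TAG
    log.append({"event": event, "tag": _entry_tag(key, prev_tag, event)})


def verify_log(log: list[dict], key: bytes) -> bool:
    """Recompute the chain; any edited, reordered, or removed middle entry fails."""
    prev_tag = GENESIS_TAG
    for entry in log:
        expected = _entry_tag(key, prev_tag, entry["event"])
        if not hmac.compare_digest(expected, entry["tag"]):
            return False
        prev_tag = entry["tag"]
    return True
```

A plain chain like this does not notice truncation of the newest entries; a real design would also anchor the latest tag somewhere the log writer cannot modify.
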
  60. Crypto-based ML models protection

  61. @vixentael Who we are and what we want AI/ML-driven product

    with unique IP. Paid feature. Server-side generates ML models, mobile-side executes them. ML models are unique per user, per app, per request (IML). Protecting them is crucial. Risks and impact: leakage of IP: loss of IP, competitor advantage, investments into updating ML model. Losing 1 IML is not a problem, losing many IML is. Broken apps, cloned apps, API fraud: abuse of infrastructure, revenue loss, abuse of IP, competitor advantage, reputation risks.
  62. API @vixentael IML data flow user data GCE worker,

    TF native iOS app native Android app GCP, storing IMLs training servers main ML infra generating IMLs
  63. @vixentael Encryption layer API GCE worker, TF native iOS app

    native Android app GCP, storing IMLs encrypts each IML stores encrypted decrypts IML decrypts IML
  64. @vixentael Encryption scheme GCE worker, TF IML encryptedIML generation encryption

    storage transfer + TLS transfer + TLS decryption re-encryption & storage execution encryptedIML encryptedIML IML IML encryptedIML IMLs are encrypted after generation using ALE, unique keys per each encryption. Transmitted using TLS. Then re-encrypted on device. github.com/cossacklabs/themis
  65. @vixentael IML encryption & decryption GCE worker, TF 1. Generate

    keypair. Send app.publicKey to backend. 2. Generate keypair. Use server.privateKey and app.publicKey to derive sharedKey (ECDH). 3. Generate random DEK. 4. Encrypt IML using DEK, AES-256-GCM. 5. Encrypt DEK using sharedKey, AES-256-GCM. 6. Send { encryptedIML, encryptedDEK, server.publicKey }. 7. Receive. Use app.privateKey and server.publicKey to derive sharedKey. 8. Decrypt DEK, decrypt IML.
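
The eight steps above can be sketched with the third-party `cryptography` package. This is an illustration under assumptions rather than the implementation from the talk (which references Themis): the P-256 curve, the HKDF step that turns the raw ECDH output into an AES key, the `info` label, the nonce-prefixed ciphertext layout, and all function names are choices made for this sketch.

```python
import os

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


def _shared_key(private_key, peer_public_key) -> bytes:
    """ECDH, then HKDF (an assumption of this sketch) to get a 32-byte AES key."""
    secret = private_key.exchange(ec.ECDH(), peer_public_key)
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"iml-dek-wrap").derive(secret)


def _seal(key: bytes, plaintext: bytes) -> bytes:
    """AES-256-GCM with a fresh random nonce, stored as nonce || ciphertext."""
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)


def _open(key: bytes, blob: bytes) -> bytes:
    """Split nonce || ciphertext and decrypt; raises if the tag does not verify."""
    return AESGCM(key).decrypt(blob[:12], blob[12:], None)


def server_encrypt_iml(iml: bytes, app_public_pem: bytes) -> dict:
    """Steps 2-6: ephemeral server keypair, shared key, random DEK, two encryptions."""
    app_public = serialization.load_pem_public_key(app_public_pem)
    server_private = ec.generate_private_key(ec.SECP256R1())
    shared = _shared_key(server_private, app_public)
    dek = AESGCM.generate_key(bit_length=256)
    return {
        "data": _seal(dek, iml),
        "key": _seal(shared, dek),
        "public_key": server_private.public_key().public_bytes(
            serialization.Encoding.PEM,
            serialization.PublicFormat.SubjectPublicKeyInfo,
        ),
    }


def app_decrypt_iml(message: dict, app_private) -> bytes:
    """Steps 7-8: derive the same shared key, unwrap the DEK, decrypt the IML."""
    server_public = serialization.load_pem_public_key(message["public_key"])
    shared = _shared_key(app_private, server_public)
    dek = _open(shared, message["key"])
    return _open(dek, message["data"])
```

Step 1 is the app generating its own keypair with `ec.generate_private_key(ec.SECP256R1())` and sending the PEM-encoded public key to the backend. The returned dictionary mirrors the `data` / `key` / `public_key` fields of the IML format, minus base64 encoding.
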
  66. @vixentael IML format { "data": base64_str(encrypted_IML), "key": base64_str(encrypted_DEK), "public_key": server_ephemeral_public_key,

    "version": MODEL_VERSION, "layers": { // additional ML layers encryption } } speakerdeck.com/vixentael/cryptographic-protection-of-ml-models
  67. @vixentael Hybrid Public Key Encryption (HPKE) datatracker.ietf.org/doc/draft-irtf-cfrg-hpke/ encrypt data with

    symmetric key using AEAD; encapsulate symmetric key with public key scheme. The RFC describes an approach that was already in use and moves it toward standardization.
  68. @vixentael Supporting security controls API native iOS app native Android

    app GCP, storing IMLs AuthN AuthN TTL crypto anti-RE AppSec crypto anti-RE AppSec API sec anti-fraud crypto crypto ACL logging monitoring AppSec GCE worker, TF
  69. @vixentael Cloud storage security 101 1. IMLs are stored for a minimal

    time – apps are expected to grab their IML quickly. 2. URL TTL (expire after minutes). 3. URL authentication & access control. 4. Clean up IML files (every hour). 5. Do not back up IMLs. 6. URLs are not logged. 7. Monitoring of access errors. (also see OWASP WSTG-CONF-11) AuthN TTL ACL
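
Items 2 and 3 (URL TTL plus URL authentication) are commonly combined as a signed, expiring URL. In practice a cloud provider's own signed-URL feature would normally do this job; the sketch below only illustrates the idea with an HMAC, and the query parameter names, the message layout, and the example host are assumptions.

```python
from __future__ import annotations

import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse


def _tag(key: bytes, path: str, user_id: str, expires: int) -> str:
    """HMAC-SHA256 over the path, the user, and the expiry timestamp."""
    message = f"{path}|{user_id}|{expires}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, user_id: str, key: bytes, ttl_seconds: int,
             now: float | None = None) -> str:
    """Return base_url with user, expiry, and signature query parameters."""
    issued = now if now is not None else time.time()
    expires = int(issued) + ttl_seconds
    sig = _tag(key, urlparse(base_url).path, user_id, expires)
    return f"{base_url}?{urlencode({'user': user_id, 'expires': expires, 'sig': sig})}"


def verify_url(url: str, key: bytes, now: float | None = None) -> bool:
    """Reject expired, unsigned, or modified URLs."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        user_id = params["user"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    expected = _tag(key, parsed.path, user_id, expires)
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

Usage: `sign_url("https://storage.example.com/iml/abc123", "user-1", key, ttl_seconds=300)` yields a link that stops verifying five minutes later or as soon as any signed part is changed.
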
  70. @vixentael API protection 101 1. User authN, IMLs are available

    only after successful authN. 2. API limits, requests throttling, firewalling. 3. IML request limits – after N model requests, server returns error. (also see OWASP ASVS :) ) API AuthN AppSec API sec
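
Item 3 (an error after N model requests) amounts to a per-user counter checked before a model is served. A minimal in-memory sketch follows; the class name is hypothetical, and a real service would keep the counters in shared storage and decide when, if ever, they reset.

```python
from collections import defaultdict


class ModelRequestLimiter:
    """Counts model requests per user and refuses once the quota is used up."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self._counts = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        """Return True and count the request, or False once N requests were served."""
        if self._counts[user_id] >= self.max_requests:
            return False
        self._counts[user_id] += 1
        return True
```

The API handler would call `allow(user_id)` after authN succeeds and return an error response when it comes back `False`.
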
  71. @vixentael Anti-fraud system 201 1. Limit access to IML based

    on user behaviour. 2. Gather events from mobile apps and from server side. 3. Calculate user scoring based on events (“stop-factors”, rules). 4. User scoring: OK, suspicious, malicious. 5. Block malicious, limit suspicious. anti-fraud API
  72. @vixentael Anti-fraud system 201 🛑 Stop factors: JB detected; same public key, different

    device; invalid app signature; remote device attestation failed. 🤨 Implicative rules: URL download failure; app reinstall; too many requests; keychain not accessible. Also listed: wrong API version; …; honey token deviceID; … User scoring: malicious, suspicious, OK.
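
The scoring described on these two slides can be sketched as a small rule engine. The event names below paraphrase the slide; the numeric weights, the thresholds, and the choice that a single stop factor immediately yields "malicious" are assumptions made for illustration.

```python
from __future__ import annotations

# Events treated as stop factors (names paraphrased from the slide).
STOP_FACTORS = {
    "jailbreak_detected",
    "same_public_key_different_device",
    "invalid_app_signature",
    "device_attestation_failed",
}

# Implicative rules with hypothetical weights.
RULE_WEIGHTS = {
    "url_download_failure": 1,
    "app_reinstall": 1,
    "too_many_requests": 2,
    "keychain_not_accessible": 2,
}

SUSPICIOUS_THRESHOLD = 2  # assumed
MALICIOUS_THRESHOLD = 5   # assumed


def score_user(events: list[str]) -> str:
    """Classify a user as OK, suspicious, or malicious from gathered events."""
    if any(event in STOP_FACTORS for event in events):
        return "malicious"
    total = sum(RULE_WEIGHTS.get(event, 0) for event in events)
    if total >= MALICIOUS_THRESHOLD:
        return "malicious"
    if total >= SUSPICIOUS_THRESHOLD:
        return "suspicious"
    return "OK"


def iml_access(verdict: str) -> str:
    """Block malicious users, limit suspicious ones, allow the rest."""
    return {"OK": "allow", "suspicious": "limit", "malicious": "block"}[verdict]
```

Events would come from both the mobile apps and the server side, as listed on the previous slide, and unknown event names simply score zero here.
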
  73. @vixentael Remote device attestation Apple DeviceCheck: developer.apple.com/documentation/devicecheck

    Android SafetyNet: developer.android.com/training/safetynet/attestation 1. Use as part of user authN. 2. Use as source for anti-fraud system. 3. Block apps not installed from stores.
  74. @vixentael Read the full story! speakerdeck.com/vixentael/cryptographic-protection-of-ml-models Cryptographic protection of ML

  75. CRDT & E2EE

  76. @vixentael Who we are and what we want CRDT-based mobile-first

    product. Users create shared spaces and collaborate on visuals and texts together. Encrypt users’ data but allow them to collaborate. Martin Kleppmann discussed other approaches: speakerdeck.com/ept/adapting-secure-group-messaging-for-encrypted-crdts [Diagram: sequences of log entries 1 2 3 4, 1 2 5, and 1 2 3 4 5.]
  77. @vixentael CRDT log encryption strategy 1. The main problem –

    how to reduce the problem to a typical one. 2. We selected document-based encryption, not chat-based. 3. Encrypt payload or action + payload. Trade-off: the more the server knows, the faster merges are; the less the server knows, the better the security. 4. Use the same encryption key for all entries of the document. 5. Tricky part: “invite” and “revoke” users: • give users access to the Document Key (“invite”) by encrypting it for each user. • define key rotation period. • pre-keying, double ratchet – overkill.
  78. log entries protection (e2ee) @vixentael

  79. passphrase encryption hint encryption zeroing secrets secure key sharing auto-locking

    timer failed attempts counter encrypted user settings log entries protection (e2ee) obfuscation anti-RE & anti-debugging good TLS @vixentael authZ / authN
  80. @vixentael *ASVS ALE, E2EE crypto + security controls

  81. Encryption is not that hard. Key management is a bit

  82. Encryption is not that hard. Key management is a bit

    harder. Crypto + key management + data flow control + security controls… Welcome to the real world :)
  83. @vixentael Don’t hesitate to talk to me if you have

    questions about data security and cryptography. Esp E2EE. vixentael.dev cossacklabs.com