
Data is a new security boundary

vixentael
November 15, 2021

We will discuss how companies use cryptography as an ultimate security control for data. When data is properly encrypted, it can’t be suddenly, unnoticeably decrypted.

End-to-end encryption flow for a NoCode platform? Sure. DRM-like protection with application-level encryption using an HPKE-like approach for protecting ML models? Yes. End-to-end encrypted message exchange for a CRDT-based real-time syncing app? Yep.

But cryptography requires a set of supporting security controls: API protection, an anti-fraud scoring system, mobile device attestation, root/jailbreak detection, authN-authZ, audit logging, and so on.

Let’s talk about how “strong cryptography” becomes a “real-world security boundary around sensitive data” and what it takes in different contexts.

Transcript

  1. @vixentael Head of customer solutions, Security software engineer at Cossack

    Labs. I’m focused on data security and applied cryptography, building e2ee protocols, and security controls around crypto. Core maintainer of Themis cryptolib. cossacklabs.com
  2. @vixentael Things we won’t talk about Exact ciphers, symmetric vs

    asymmetric encryption. TLS. Typical cryptographic mistakes developers make. Privacy, and evil corporations. Recent data incidents and breaches. FUD. What library / tool to use to encrypt your data the best way.
  3. @vixentael What we will talk about Data security 101: encryption,

    OWASP, regulations. Cases: observations of real apps that combine encryption with supporting security controls. Encryption ways: ALE, FLE, E2EE, ZKA/ZT.
  4. @vixentael Your app’s data is everywhere. (apps, public clouds, databases,

    backups, 3rd-party APIs, analytics, etc). No perimeters, no trusted zones anymore.
  5. @vixentael Your app’s data is everywhere. (apps, public clouds, databases,

    backups, 3rd-party APIs, analytics, etc). No perimeters, no trusted zones anymore. Data security measures become the security boundary for data. It’s not about “protect the data where it’s stored”. It’s “protect the data whenever it exists”.
  6. @vixentael Data security depends on a data flow gathering

    secure processing secure storage and backups secure disclosure data removal data migration Gathering Processing Output logging, analytics
  7. @vixentael Data security depends on a data flow gathering

    secure processing secure storage and backups secure disclosure data removal data migration Gathering Processing Output logging, analytics leakage, loss, disclosure never removed, disclosure, loss gathering without consent
  8. @vixentael Data security 101 1. Identify sensitive data, understand sensitive

    data lifecycle, classify data. 2. Identify risks to data. 3. Build trust models, understand risk impact. 4. Prioritize threat vectors. 5. Select and implement proper security controls for exploitable high risk vectors (to prevent risks and to identify leaks).
  10. @vixentael Encryption is an ultimate data security measure 1. When

    data is properly encrypted, it can’t be suddenly, unnoticeably decrypted. 2. Even if leaked, data is encrypted. Encryption is the best access control. 3. Protections against insiders & outsiders. 4. Properly configured encryption allows mistakes in other security controls. 5. Regulations, compliance.
  11. @vixentael Regulations, compliance gdpr-info.eu/issues/encryption/ GDPR art 32/35: responsibly store and

    process data according to risks. GDPR art 33/34: detect data leakage and alert users & controller.
  12. @vixentael OWASP Top10 2021 A01:2021-Broken Access Control. A02:2021-Cryptographic Failures. A03:2021-Injection.

    A04:2021-Insecure Design. A05:2021-Security Misconfiguration. A06:2021-Vulnerable and Outdated Components. A07:2021-Identification and Authentication Failures. A08:2021-Software and Data Integrity Failures. A09:2021-Security Logging and Monitoring Failures. A10:2021-Server-Side Request Forgery. owasp.org/Top10/
  13. @vixentael A02:2021-Cryptographic Failures. owasp.org/Top10/A02_2021-Cryptographic_Failures/ Focused mostly on crypto usage and

    implementation. Bad ciphers: old and insecure? Wrong AES modes? AES-CBC instead of AES-GCM? Asymmetric encryption where symmetric should be used? Bad keys: short, low entropy keys? Math.random? User password used as encryption keys without a proper KDF? Bad KDF choice / params? Unsuitable crypto-primitives choice: MD5 instead of Argon2id, AES-OFB instead of GCM. SHA-256 instead of HMAC-SHA256. Home-brewed crypto.
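
Two of the pitfalls above (a password used as a key without a proper KDF, and SHA-256 where HMAC-SHA256 is needed) can be illustrated with a minimal Python sketch that uses only the standard library. Assumptions: scrypt stands in for Argon2id, which the standard library does not ship; the cost parameters, the passphrase, and the function names are illustrative, not recommendations from the talk.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password with scrypt, a memory-hard KDF."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost (illustrative; tune for your hardware)
        r=8,
        p=1,
        dklen=32,
    )


def authenticate(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag; a bare SHA-256 digest involves no key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag in constant time."""
    return hmac.compare_digest(authenticate(key, message), tag)


salt = secrets.token_bytes(16)  # random per-password salt, stored next to the data
key = derive_key("correct horse battery staple", salt)
tag = authenticate(key, b"payload")
```

Anyone can recompute a plain SHA-256 digest of a modified message, while the HMAC tag can only be produced with the key.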
  14. @vixentael A04:2021-Insecure Design. owasp.org/Top10/A04_2021-Insecure_Design/ Focused on design, missing or wrong

    security controls. Bad key management: storing encryption keys in plaintext together with data. Lack of rotation, revocation, expiration? Components are trusted when they shouldn’t be? One encryption key for everything? Lack of PKI? Lack of encryption for sensitive assets. Home-brewed encryption protocols. Encryption is not supported by authN, access control, logging & monitoring.
  15. @vixentael Encryption Data stored encrypted locally – data-at-rest encryption; also

    FS/OS encryption, database encryption. Transport layer encryption – data-in-transit encryption (TLS, IPSec, SSH). [Diagram label: host OS / server app]
  16. @vixentael Application-level encryption (ALE) Encryption process happening within application context,

    triggered by an application. ALE could work together with data-at-rest encryption and data-in-transit encryption. ALE could be client-side, server-side, end-to-end, etc. infoq.com/articles/ale-software-architects/ Encryption is easy, key management is hard.
  17. @vixentael TLS (in transit) application-level encryption

    [Diagram labels: server 1, server 2, server 3; Alice, Carol, Bob; encrypted] infoq.com/articles/ale-software-architects/
  18. @vixentael Application-level encryption data encrypted by any app – application-

    level encryption (ALE) app ALE happens on a client side – client-side encryption client ALE happens on a server side – server-side encryption server proxy … proxy-side encryption infoq.com/articles/ale-software-architects/
  19. @vixentael Field-level encryption Only some data fields are encrypted

    – field-level encryption (FLE). { "name": base64_str(encrypted_name), "phone": base64_str(encrypted_phone), "passport": base64_str(encrypted_passport), "ID": user_ID, "last_activity_date": timestamp, ... }
  20. @vixentael End-to-end encryption App-side encryption when no keys/secrets/data

    are available to the intermediate infrastructure – end-to-end encryption. [Diagram: Alice, Bob] speakerdeck.com/vixentael/e2ee-equals-security-equals-privacy Encryption should work on all selected platforms. Key management is tricky – backend should work only as key discovery service without access to private/secret keys. Complicated to design, easy to maintain, hard to debug.
  21. @vixentael Encryption controls / events

    encryption controls / events | transit (TLS) | disk / FS | TDE / DB encryption | ALE | E2EE
    physical access to servers | ⛔ | ✅ | ✅ | ✅ | ✅
    MitM | ✅ | ⛔ | ⛔ | ✅ | ✅
    privileged DB access | ⛔ | ⛔ | Depends | ✅ | ✅
    privileged system access | ⛔ | ⛔ | ⛔ | Depends | ✅
    backups, logs, snapshots | ⛔ | ⛔ | Depends | ✅ | ✅
    infoq.com/articles/ale-software-architects/
  22. @vixentael If E2EE is so great, why don’t we use

    it everywhere? TLS FS/OS encr, TDE custom data-at-rest encryption ALE E2EE security efforts, tradeoffs key storage, key rotation, key revocation, data re-encryption, consistency, backups, tying keys w/ identity, search in encrypted data, logging, monitoring, and all the NIST SP 800-57, 800-53.
  23. @vixentael Zero Trust / Zero Trust Architecture – assumes there

    is no implicit trust granted to assets or user accounts based solely on their physical or network location. No asset is inherently trusted. nist.gov/publications/zero-trust-architecture ZT is more about access control and authN than encryption.
  24. @vixentael Zero Knowledge Architecture (ZKA) – system where no one

    has access to unencrypted data, except the user (node, service, person). Also known as “No Knowledge” Systems. Typically built on E2EE + strong authN + privacy-respectful design. See also: ZKP, SMP, PAKE, OPAQUE; FHE, searchable encryption. cossacklabs.com/solutions/e2ee-zero-trust/
  25. @vixentael Searchable encryption Perform queries on encrypted data without decryption.

    Different schemes are possible: SSE, PEKS, blind index, (F)HE, etc. cossacklabs.com/blog/secure-search-over-encrypted-data-acra-se/ eprint.iacr.org/2019/806.pdf Most realistic: keyword search (blind index).
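
A blind index for exact-match keyword search can be sketched in a few lines of Python. This is a minimal illustration, not the scheme of any particular product: the normalization rules, the 16-byte truncation, the row layout, and the placeholder ciphertexts are all assumptions.

```python
import hashlib
import hmac
import secrets
import unicodedata


def blind_index(index_key: bytes, value: str, length: int = 16) -> str:
    """Keyed, truncated HMAC-SHA256 of the normalized plaintext value."""
    normalized = unicodedata.normalize("NFKC", value).strip().casefold()
    digest = hmac.new(index_key, normalized.encode("utf-8"), hashlib.sha256).digest()
    return digest[:length].hex()


# The index key is separate from the data encryption key (assumption).
index_key = secrets.token_bytes(32)

# The database stores opaque ciphertext plus the blind index, never plaintext.
rows = [
    {"email_ciphertext": b"<opaque ciphertext 1>",
     "email_index": blind_index(index_key, "alice@example.com")},
    {"email_ciphertext": b"<opaque ciphertext 2>",
     "email_index": blind_index(index_key, "bob@example.com")},
]


def search(rows: list, index_key: bytes, query: str) -> list:
    """Find rows whose blind index equals the index of the query."""
    needle = blind_index(index_key, query)
    return [row for row in rows if hmac.compare_digest(row["email_index"], needle)]


matches = search(rows, index_key, " Alice@Example.com ")
```

The server compares index values only; equal plaintexts produce equal indexes, so this supports exact-match search and reveals which rows share a value.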
  26. @vixentael Other exciting crypto terms Privacy-enhancing cryptography: SMPC, PSI, PIR,

    FHE, PAKE, OPAQUE. ZK: ZKP, zk-SNARKs, zk-STARKs, zk-SNORKs :) nist.gov/blogs/cybersecurity-insights/privacy-enhancing-cryptography-complement-differential-privacy Crypto reinforced guarantees in data structures: blockchain, Merkle-tree. PQC. hackernoon.com/eli5-zero-knowledge-proof-78a276db9eff cossacklabs.com/blog/crypto-signed-audit-logs.html aumasson.jp/data/talks/quantum-poc-2021.pdf blog.cloudflare.com/opaque-oblivious-passwords/ blog.cloudflare.com/the-tls-post-quantum-experiment/
  27. AAA WAF honey pots IDS infra mngmt compartmentalization access logging

    jailbans monitoring data firewall SIEM HIDS DAST SAST HSM PKI TPM honey tokens dependency mngmt UEBA IAM TLS TDE @vixentael API protection obfuscation anti-RE csrc.nist.gov/publications/detail/sp/800-53/rev-5/final RTFM
  28. @vixentael Security controls to support crypto 1. Use encryption to

    protect sensitive data globally during the whole data flow. 2. Whatever the attack vector is, there is a defense layer. 3. For the most popular attack vectors, set up as many independent and overlapping defenses as possible. ✅ ✅ ✅
  29. @vixentael Who we are and what we want Huge B2B

    SaaS platforms. Protect from insiders, provide transparency, detect malicious users. Encrypt data per customer, using different keys, BYOK. Minimize the lifecycle of plaintext data – encrypt as close as possible to the data generation point.
  30. application @vixentael Client-side field-level encryption MongoDB MongoDB SDK MongoDB

    stores records with encrypted fields encryption / decryption TLS writes records with encrypted fields reads records with encrypted fields TLS docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide key vault, KMS
  31. @vixentael Pros & Cons docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide Extremely useful when

    you have MongoDB. Client apps share responsibility for en/decryption and key management due to exposure to key material. Deterministic & non-deterministic encryptions available. Support for different encryption keys (DEK).
  32. application @vixentael Proxy-side field-level encryption Acra github.com/cossacklabs/acra Database stores

    records with encrypted fields writes records with encrypted fields reads records with encrypted fields Acra proxy encryption / decryption TLS TLS TLS key vault, KMS
  33. @vixentael Key hierarchy Database field encrypted by DEK, DEK

    encrypted by KEK Key Vault KEK, encrypted by CMK KMS CMK millions dozens 1 github.com/cossacklabs/acra
  34. @vixentael Pros & Cons Neither the database nor the application knows

    that the data is encrypted. Proxy app is responsible for en/decryption. Easy to scale and build DAO-based architectures. Non-deterministic and searchable encryptions available. Support for different encryption keys, BYOK, key rotation, revocation, etc. Easy to armor a single encryption layer with specific controls: DLP, anomaly detection, firewalling, anonymisation, etc. github.com/cossacklabs/acra
  35. @vixentael ALE for NoCode platform API frontend database DAO encryption

    integration API frontend database DAO encryption integration customer’s pod NoCode platform tech logs, analytics ... ...
  36. @vixentael ALE for NoCode platform MongoDB key vault, KMS DAO

    + ALE encryption module API frontend fields are encrypted + TLS fields are plaintext + TLS
  37. @vixentael Crypto + supporting controls 1. Key management, separate key

    per customer (BYOK). 2. Full compartmentalization: customer’s data is located in different DBs, encrypted by different keys, each app uses its own DAO. 3. Full transparency – the platform doesn’t have access to customer’s data. 1. Logging, monitoring. 2. Alerting on suspicious activity, firewalling. 3. ASVS: API protection, anti throttling, anti-fraud, access control …
  38. @vixentael ALE for fintech platform Service1 PostgreSQL encryption /

    decryption layer Service2 BI Analytics ServiceN ... DAO1 DAON load balancing ... MySQL key vault, KMS ... fields are encrypted + TLS fields are plaintext + TLS
  39. @vixentael Crypto + supporting controls 1. Key management, separate keys

    per domain. 2. Decryption & anonymisation of data for BI software. 3. Isolation of sensitive data from non-sensitive. 1. Logging, monitoring. 2. PCI DSS audit logging + crypto-verifiable logging. 3. Alerting on suspicious activity. 4. AppSec measures for DAO.
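
One common way to make audit logs crypto-verifiable is to chain entries with a keyed MAC, so that editing or removing an entry in the middle breaks every later tag. The Python sketch below is a generic illustration under that assumption; it is not the format used by any specific tool, and the event fields and function names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets

GENESIS = b"\x00" * 32  # fixed starting value for the chain (assumption)


def _tag(key: bytes, prev_tag: bytes, event: dict) -> str:
    """HMAC-SHA256 over the previous tag plus the canonical JSON of the event."""
    body = json.dumps(event, sort_keys=True).encode("utf-8")
    return hmac.new(key, prev_tag + body, hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> None:
    """Append an event whose tag depends on the tag of the previous entry."""
    prev_tag = bytes.fromhex(log[-1]["tag"]) if log else GENESIS
    log.append({"event": event, "tag": _tag(key, prev_tag, event)})


def verify_log(log: list, key: bytes) -> bool:
    """Recompute the chain from the start and compare every stored tag."""
    prev_tag = GENESIS
    for entry in log:
        expected = _tag(key, prev_tag, entry["event"])
        if not hmac.compare_digest(expected, entry["tag"]):
            return False
        prev_tag = bytes.fromhex(entry["tag"])
    return True


log_key = secrets.token_bytes(32)  # kept away from the system that writes logs
audit_log = []
append_entry(audit_log, log_key, {"actor": "service1", "action": "decrypt", "record": 42})
append_entry(audit_log, log_key, {"actor": "bi", "action": "read_anonymised", "record": 42})
append_entry(audit_log, log_key, {"actor": "service2", "action": "encrypt", "record": 43})
```

This sketch detects modification, reordering, and removal of inner entries; detecting truncation of the tail needs an extra step, such as periodically anchoring the latest tag elsewhere.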
  40. @vixentael Who we are and what we want AI/ML-driven product

    with unique IP. Paid feature. Server-side generates ML models, mobile-side executes them. ML models are unique per user, per app, per request (IML). Protecting them is crucial. leakage of IP loss of IP, competitor advantage, investments into updating ML model. Losing 1 IML is not a problem, losing many IML is. broken apps, clones apps, API fraud abuse of infrastructure, revenue loss, abuse of IP, competitor advantage, reputation risks
  41. API @vixentael IML data flow user data GCE worker,

    TF native iOS app native Android app GCP, storing IMLs training servers main ML infra generating IMLs
  42. @vixentael Encryption layer API GCE worker, TF native iOS app

    native Android app GCP, storing IMLs encrypts each IML stores encrypted decrypts IML decrypts IML
  43. @vixentael Encryption scheme GCE worker, TF IML encryptedIML generation encryption

    storage transfer + TLS transfer + TLS decryption re-encryption & storage execution encryptedIML encryptedIML IML IML encryptedIML IMLs are encrypted after generation using ALE, unique keys per each encryption. Transmitted using TLS. Then re-encrypted on device. github.com/cossacklabs/themis
  44. @vixentael IML encryption & decryption GCE worker, TF 1. Generate

    keypair. Send app.publicKey to backend. 2. Generate keypair. Use server.privateKey and app.publicKey to derive sharedKey (ECDH). 3. Generate random DEK. 4. Encrypt IML using DEK, AES-256-GCM. 5. Encrypt DEK using sharedKey, AES-256-GCM. 6. Send { encryptedIML, encryptedDEK, server.publicKey }. 7. Receive. Use app.privateKey and server.publicKey to derive sharedKey. 8. Decrypt DEK, decrypt IML.
  45. @vixentael IML format { "data": base64_str(encrypted_IML), "key": base64_str(encrypted_DEK), "public_key": server_ephemeral_public_key,

    "version": MODEL_VERSION, "layers": { // additional ML layers encryption } } speakerdeck.com/vixentael/cryptographic-protection-of-ml-models
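
The container above can be packed and parsed with the standard library alone. The sketch below covers serialization only and treats all cryptographic material as opaque bytes; base64-encoding the public key, omitting the "layers" field, and the helper names are assumptions made for illustration.

```python
import base64
import json

REQUIRED_FIELDS = ("data", "key", "public_key", "version")


def _b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode("ascii")


def pack_iml(encrypted_iml: bytes, encrypted_dek: bytes,
             server_public_key: bytes, version: str) -> str:
    """Serialize already-encrypted material into the JSON container."""
    return json.dumps({
        "data": _b64(encrypted_iml),
        "key": _b64(encrypted_dek),
        "public_key": _b64(server_public_key),  # base64 here is an assumption
        "version": version,
    })


def unpack_iml(blob: str) -> dict:
    """Parse the container and reject it if a required field is missing."""
    doc = json.loads(blob)
    missing = [name for name in REQUIRED_FIELDS if name not in doc]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {
        "encrypted_iml": base64.b64decode(doc["data"], validate=True),
        "encrypted_dek": base64.b64decode(doc["key"], validate=True),
        "server_public_key": base64.b64decode(doc["public_key"], validate=True),
        "version": doc["version"],
    }


# Placeholder bytes stand in for real ciphertext and key material.
blob = pack_iml(b"\x01\x02\x03", b"\x04\x05", b"\x06" * 32, "1.0")
parsed = unpack_iml(blob)
```

Keeping a version field in the container lets the app reject formats it does not understand before attempting decryption.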
  46. @vixentael Hybrid Public Key Encryption (HPKE) datatracker.ietf.org/doc/draft-irtf-cfrg-hpke/ encrypt data with

    symmetric key using AEAD; encapsulate symmetric key with public key scheme. The RFC draft describes an approach that was already in use and aims to standardize it.
  47. @vixentael Supporting security controls API native iOS app native Android

    app GCP, storing IMLs AuthN AuthN TTL crypto anti-RE AppSec crypto anti-RE AppSec API sec anti-fraud crypto crypto ACL logging monitoring AppSec GCE worker, TF
  48. @vixentael Cloud storage security 101 1. IMLs are stored for a minimal

    time – apps are expected to grab their IML quickly. 2. URL TTL (expires after minutes). 3. URL authentication & access control. 4. Clean up IML files (every hour). 5. Do not back up IMLs. 6. URLs are not logged. 7. Monitoring of access errors. (also see OWASP WSTG-CONF-11) AuthN TTL ACL
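
URL TTL and URL authentication can be combined in a signed, expiring link. Cloud storage providers offer their own signed-URL mechanisms; the Python sketch below is a generic HMAC-based illustration rather than any provider's API, and the host name, parameter names, and TTL value are hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(secret: bytes, path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and the expiry timestamp."""
    message = f"{path}\n{expires}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, path: str, secret: bytes, ttl_seconds: int, now=None) -> str:
    """Build a download URL that is valid for ttl_seconds."""
    now = int(time.time()) if now is None else int(now)
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(secret, path, expires)})
    return f"{base_url}{path}?{query}"


def verify_url(url: str, secret: bytes, now=None) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    now = int(time.time()) if now is None else int(now)
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    expected = _signature(secret, parsed.path, expires)
    valid = hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
    return valid and now < expires


url_secret = b"server-side secret, never sent to clients"
url = sign_url("https://storage.example.com", "/iml/abc.bin", url_secret,
               ttl_seconds=300, now=1000)
```

Because the expiry is part of the signed message, a client cannot extend the lifetime of a link by editing the query string.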
  49. @vixentael API protection 101 1. User authN, IMLs are available

    only after successful authN. 2. API limits, requests throttling, firewalling. 3. IML request limits – after N model requests, server returns error. (also see OWASP ASVS :) ) API AuthN AppSec API sec
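
The IML request limit (after N model requests, the server returns an error) can be sketched as a per-user sliding-window counter. This is a minimal in-memory illustration; the class name, the window length, and the limit are assumptions, and a real deployment would keep the counters in shared storage.

```python
import time
from collections import defaultdict, deque


class ImlRequestLimiter:
    """Allow at most max_requests per user within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # user_id -> timestamps of allowed requests

    def allow(self, user_id: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        history = self._history[user_id]
        # Drop timestamps that have fallen out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False  # the caller should respond with an error
        history.append(now)
        return True


limiter = ImlRequestLimiter(max_requests=3, window_seconds=60)
decisions = [limiter.allow("user-1", now=t) for t in (0, 1, 2, 3)]
```

Rejected requests are a useful event source for the anti-fraud system, since repeated limit hits suggest automated abuse.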
  50. @vixentael Anti-fraud system 201 1. Limit access to IML based

    on user behaviour. 2. Gather events from mobile apps and from server side. 3. Calculate user scoring based on events (“stop-factors”, rules). 4. User scoring: OK, suspicious, malicious. 5. Block malicious, limit suspicious. anti-fraud API
  51. @vixentael Anti-fraud system 201

    🛑 Stop factors: JB detected; same public key, different device; invalid app signature; remote device attestation failed. 🤨 Implicative rules: URL download failure; app reinstall; too many requests; keychain not accessible. Also listed: wrong API version …, honey token deviceID … Scoring: malicious, suspicious, OK.
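
The scoring described on the two anti-fraud slides can be sketched as follows: any stop factor marks the user as malicious, and implicative rules add up until a threshold marks the user as suspicious. The event names, weights, and threshold below are assumptions chosen for illustration; the slides do not specify them.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    SUSPICIOUS = "suspicious"
    MALICIOUS = "malicious"


# A single stop factor is enough to classify the user as malicious.
STOP_FACTORS = {
    "jailbreak_detected",
    "same_public_key_different_device",
    "invalid_app_signature",
    "remote_attestation_failed",
}

# Implicative rules only add weight (the weights are illustrative assumptions).
IMPLICATIVE_WEIGHTS = {
    "url_download_failure": 1,
    "app_reinstall": 1,
    "too_many_requests": 2,
    "keychain_not_accessible": 2,
}

SUSPICIOUS_THRESHOLD = 3  # illustrative assumption


def score_user(events: list) -> Verdict:
    """Classify a user from events gathered on the mobile and server side."""
    if any(event in STOP_FACTORS for event in events):
        return Verdict.MALICIOUS
    score = sum(IMPLICATIVE_WEIGHTS.get(event, 0) for event in events)
    return Verdict.SUSPICIOUS if score >= SUSPICIOUS_THRESHOLD else Verdict.OK


def access_policy(verdict: Verdict) -> str:
    """Block malicious users, limit suspicious ones, allow the rest."""
    return {
        Verdict.MALICIOUS: "block",
        Verdict.SUSPICIOUS: "limit",
        Verdict.OK: "allow",
    }[verdict]
```

For example, `access_policy(score_user(["too_many_requests", "keychain_not_accessible"]))` yields `"limit"`, while a single `"jailbreak_detected"` event yields `"block"`.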
  52. @vixentael Remote device attestation developer.apple.com/documentation/devicecheck Apple DeviceCheck developer.android.com/training/safetynet/attestation

    Android SafetyNet 1. Use as part of user authN. 2. Use as source for anti-fraud system. 3. Block apps not installed from stores.
  53. @vixentael Who we are and what we want CRDT-based mobile-first

    product. Users create shared spaces and collaborate on visuals and texts together. Encrypt users’ data but allow them to collaborate. Martin Kleppmann discussed other approaches: speakerdeck.com/ept/adapting-secure-group-messaging-for-encrypted-crdts
  54. @vixentael CRDT log encryption strategy 1. The main problem –

    how to reduce the problem to a typical one. 2. We selected document-based encryption, not chat-based. 3. Encrypt payload or action + payload. Trade-off: the more the server knows, the faster the merges; the less the server knows, the better the security. 4. Use the same encryption key for all entries of the document. 5. Tricky part: “invite” and “revoke” users: • give users access to the Document Key (“invite”) by encrypting it for each user. • define key rotation period. • pre-keying, double ratchet – overkill.
  55. passphrase encryption hint encryption zeroing secrets secure key sharing auto-locking

    timer failed attempts counter encrypted user settings log entries protection (e2ee) obfuscation anti-RE & anti-debugging good TLS @vixentael authZ / authN
  56. Encryption is not that hard. Key management is a bit

    harder. Crypto + key management + data flow control + security controls… Welcome to the real world :)
  57. @vixentael Don’t hesitate to talk to me if you have

    questions about data security and cryptography. Esp E2EE. vixentael.dev cossacklabs.com