Slide 1

Slide 1 text

Data is a new security boundary @vixentael

Slide 2

Slide 2 text

@vixentael Head of customer solutions, Security software engineer at Cossack Labs. I’m focused on data security and applied cryptography, building e2ee protocols, and security controls around crypto. Core maintainer of Themis cryptolib. cossacklabs.com

Slide 3

Slide 3 text

@vixentael Things we won’t talk about: Exact ciphers, symmetric vs asymmetric encryption. TLS. Typical cryptographic mistakes developers make. Privacy, and evil corporations. Recent data incidents and breaches. FUD. What library / tool to use to encrypt your data the best way.

Slide 4

Slide 4 text

@vixentael What we will talk about Data security 101: encryption, OWASP, regulations. Cases: observations of real apps that combine encryption with supporting security controls. Encryption ways: ALE, FLE, E2EE, ZKA/ZT.

Slide 5

Slide 5 text

Data security 101

Slide 6

Slide 6 text

@vixentael Your app’s data is everywhere (apps, public clouds, databases, backups, 3rd-party APIs, analytics, etc.). No perimeters, no trusted zones anymore.

Slide 7

Slide 7 text

@vixentael

Slide 8

Slide 8 text

@vixentael Your app’s data is everywhere (apps, public clouds, databases, backups, 3rd-party APIs, analytics, etc.). No perimeters, no trusted zones anymore. Data security measures become the security boundary for data. It’s not about “protect the data where it’s stored”. It’s “protect the data whenever it exists”.

Slide 9

Slide 9 text

@vixentael Data security depends on a data flow: gathering, secure processing, secure storage and backups, secure disclosure, data removal, data migration. (Diagram labels: Gathering, Processing, Output; logging, analytics.)

Slide 10

Slide 10 text

@vixentael Data security depends on a data flow: gathering, secure processing, secure storage and backups, secure disclosure, data removal, data migration. (Diagram labels: Gathering, Processing, Output; logging, analytics. Risk labels: leakage, loss, disclosure; never removed, disclosure, loss; gathering without consent.)

Slide 11

Slide 11 text

@vixentael Data security 101 1. Identify sensitive data, understand sensitive data lifecycle, classify data. 2. Identify risks to data. 3. Build trust models, understand risk impact. 4. Prioritize threat vectors. 5. Select and implement proper security controls for exploitable high risk vectors (to prevent risks and to identify leaks).

Slide 12

Slide 12 text

@vixentael Data security 101 1. Identify sensitive data, understand sensitive data lifecycle, classify data. 2. Identify risks to data. 3. Build trust models, understand risk impact. 4. Prioritize threat vectors. 5. Select and implement proper security controls for exploitable high risk vectors (to prevent risks and to identify leaks).

Slide 13

Slide 13 text

Encryption

Slide 14

Slide 14 text

@vixentael Encryption is the ultimate data security measure 1. When data is properly encrypted, it can’t be suddenly, unnoticeably decrypted. 2. Even if leaked, data stays encrypted. Encryption is the best access control. 3. Protection against insiders & outsiders. 4. Properly configured encryption tolerates mistakes in other security controls. 5. Regulations, compliance.

Slide 15

Slide 15 text

@vixentael Regulations, compliance media.defense.gov/2018/Apr/22/2001906836/-1/-1/0/DEFENSEINNOVATIONBOARD_TEN_COMMANDMENTS_OF_SOFTWARE_2018.04.20.PDF

Slide 16

Slide 16 text

@vixentael Regulations, compliance gdpr-info.eu/issues/encryption/ GDPR art 32/35: responsibly store and process data according to risks. GDPR art 33/34: detect data leakage and alert users & controller.

Slide 17

Slide 17 text

@vixentael Regulations, compliance cossacklabs.com/blog/what-we-need-to-encrypt-cheatsheet.html also, HIPAA, FISMA, FIPS, FedRAMP, CCPA, PCI DSS, FERPA, and many more

Slide 18

Slide 18 text

@vixentael OWASP Top10 2021 A01:2021-Broken Access Control. A02:2021-Cryptographic Failures. A03:2021-Injection. A04:2021-Insecure Design. A05:2021-Security Misconfiguration. A06:2021-Vulnerable and Outdated Components. A07:2021-Identification and Authentication Failures. A08:2021-Software and Data Integrity Failures. A09:2021-Security Logging and Monitoring Failures. A10:2021-Server-Side Request Forgery. owasp.org/Top10/

Slide 19

Slide 19 text

@vixentael A02:2021-Cryptographic Failures. owasp.org/Top10/A02_2021-Cryptographic_Failures/ Focused mostly on crypto usage and implementation. Bad ciphers: old and insecure? Wrong AES modes? AES-CBC instead of AES-GCM? Asymmetric encryption where symmetric should be used? Bad keys: short, low-entropy keys? Math.random? User password used as an encryption key without a proper KDF? Bad KDF choice / params? Unsuitable crypto-primitives choice: MD5 instead of Argon2id, AES-OFB instead of GCM, SHA-256 instead of HMAC-SHA256. Home-brewed crypto.
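
Two of the pitfalls above (a password used as a key without a KDF, and a bare hash used where a keyed MAC is needed) can be illustrated with a minimal sketch. It uses only the Python standard library, so scrypt stands in for Argon2id here (Argon2 is not in the standard library); the function names and cost parameters are illustrative assumptions, not recommendations from the talk.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password with scrypt, a memory-hard KDF."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,  # illustrative cost parameters, tune them for real hardware
        r=8,
        p=1,
        dklen=32,
    )


def tag_message(key: bytes, message: bytes) -> bytes:
    """Authenticate a message with HMAC-SHA256 instead of a bare SHA-256 hash."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(tag_message(key, message), tag)


# The salt is random and stored next to the derived output; it is not secret.
salt = secrets.token_bytes(16)
key = derive_key("correct horse battery staple", salt)
tag = tag_message(key, b"hello")
```

A bare SHA-256 digest can be recomputed by anyone who can modify the message, while the HMAC tag cannot be forged without the key.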

Slide 20

Slide 20 text

@vixentael A04:2021-Insecure Design. owasp.org/Top10/A04_2021-Insecure_Design/ Focused on design, missing or wrong security controls. Bad key management: storing encryption keys in plaintext together with data. Lack of rotation, revocation, expiration? Components are trusted when they shouldn’t be? One encryption key for everything? Lack of PKI? Lack of encryption for sensitive assets. Home-brewed encryption protocols. Encryption is not supported by authN, access control, logging & monitoring.

Slide 21

Slide 21 text

@vixentael OWASP ASVS github.com/OWASP/ASVS V6: Stored Cryptography, V8: Data Protection, V9: Communications

Slide 22

Slide 22 text

@vixentael OWASP MASVS github.com/OWASP/owasp-masvs/ V3: Cryptography, V2: Data Storage, V5: Network Communication

Slide 23

Slide 23 text

@vixentael *ASVS

Slide 24

Slide 24 text

Encryption models and ways

Slide 25

Slide 25 text

@vixentael Encryption Data stored encrypted locally – data-at-rest encryption; also FS/OS encryption, database encryption. Transport layer encryption – data-in-transit encryption (TLS, IPsec, SSH). (Diagram labels: host OS / server app.)

Slide 26

Slide 26 text

@vixentael Application-level encryption (ALE) Encryption process happening within application context, triggered by an application. ALE could work together with data-at-rest encryption and data-in-transit encryption. ALE could be client-side, server-side, end-to-end, etc. infoq.com/articles/ale-software-architects/ Encryption is easy, key management is hard.

Slide 27

Slide 27 text

@vixentael TLS (in transit) vs application-level encryption. (Diagram, shown once for each: Alice, Carol, and Bob connected through server 1, server 2, server 3, with the encrypted parts marked.) infoq.com/articles/ale-software-architects/

Slide 28

Slide 28 text

@vixentael Application-level encryption Data encrypted by any app – application-level encryption (ALE). ALE happens on a client side – client-side encryption. ALE happens on a server side – server-side encryption. ALE happens on a proxy – proxy-side encryption. infoq.com/articles/ale-software-architects/

Slide 29

Slide 29 text

@vixentael Field-level encryption Only some data fields are encrypted – field-level encryption (FLE).
{
  "name": base64_str(encrypted_name),
  "phone": base64_str(encrypted_phone),
  "passport": base64_str(encrypted_passport),
  "ID": user_ID,
  "last_activity_date": timestamp,
  ...
}

Slide 30

Slide 30 text

@vixentael End-to-end encryption (Alice, Bob) App-side encryption when no keys/secrets/data are available to the intermediate infrastructure – end-to-end encryption. speakerdeck.com/vixentael/e2ee-equals-security-equals-privacy Encryption should work on all selected platforms. Key management is tricky – backend should work only as a key discovery service without access to private/secret keys. Complicated to design, easy to maintain, hard to debug.

Slide 31

Slide 31 text

@vixentael TLS and ALE have different threat models, it’s unfair to compare them, but we will :)

Slide 32

Slide 32 text

@vixentael Encryption controls vs events
events | transit (TLS) | disk / FS | TDE / DB encryption | ALE | E2EE
physical access to servers | ⛔ | ✅ | ✅ | ✅ | ✅
MitM | ✅ | ⛔ | ⛔ | ✅ | ✅
privileged DB access | ⛔ | ⛔ | Depends | ✅ | ✅
privileged system access | ⛔ | ⛔ | ⛔ | Depends | ✅
backups, logs, snapshots | ⛔ | ⛔ | Depends | ✅ | ✅
infoq.com/articles/ale-software-architects/

Slide 33

Slide 33 text

@vixentael If E2EE is so great, why don’t we use it everywhere? (Diagram: TLS; FS/OS encryption, TDE; custom data-at-rest encryption; ALE; E2EE, plotted against security efforts and tradeoffs.) Key storage, key rotation, key revocation, data re-encryption, consistency, backups, tying keys w/ identity, search in encrypted data, logging, monitoring, and all the NIST SP 800-57, 800-53.

Slide 34

Slide 34 text

@vixentael Zero Trust / Zero Trust Architecture – assumes there is no implicit trust granted to assets or user accounts based solely on their physical or network location. No asset is inherently trusted. nist.gov/publications/zero-trust-architecture ZT is more about access control and authN than encryption.

Slide 35

Slide 35 text

@vixentael Zero Knowledge Architecture (ZKA) – system where no one has access to unencrypted data, except the user (node, service, person). Also known as “No Knowledge” Systems. Typically built on E2EE + strong authN + privacy-respectful design. See also: ZKP, SMP, PAKE, OPAQUE; FHE, searchable encryption. cossacklabs.com/solutions/e2ee-zero-trust/

Slide 36

Slide 36 text

@vixentael Searchable encryption Perform queries on encrypted data without decryption. Different schemes are possible: SSE, PEKS, blind index, (F)HE, etc. cossacklabs.com/blog/secure-search-over-encrypted-data-acra-se/ eprint.iacr.org/2019/806.pdf Most realistic: keyword search (blind index).
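
The blind index idea named above can be sketched in a few lines: alongside each encrypted field, store a keyed hash of the normalized plaintext and search by recomputing that hash. This is a simplified illustration under stated assumptions, not the scheme used by Acra; the function names, the normalization rule, the truncation length, and the placeholder ciphertexts are all hypothetical.

```python
import hashlib
import hmac
import secrets


def blind_index(index_key: bytes, value: str, length: int = 16) -> str:
    """Keyed, truncated hash of a normalized value, stored next to the ciphertext."""
    normalized = value.strip().lower()
    digest = hmac.new(index_key, normalized.encode("utf-8"), hashlib.sha256).digest()
    return digest[:length].hex()


# The index key is separate from the data encryption key (assumption of this sketch).
index_key = secrets.token_bytes(32)

# Ciphertexts are opaque placeholders; only the index column takes part in search.
rows = [
    {"email_ct": b"<ciphertext 1>", "email_idx": blind_index(index_key, "alice@example.com")},
    {"email_ct": b"<ciphertext 2>", "email_idx": blind_index(index_key, "bob@example.com")},
]


def find_by_email(rows: list, index_key: bytes, email: str) -> list:
    """Exact-match search: recompute the index for the query and compare."""
    wanted = blind_index(index_key, email)
    return [row for row in rows if hmac.compare_digest(row["email_idx"], wanted)]


matches = find_by_email(rows, index_key, "  Alice@Example.com ")
```

The trade-off is that equal plaintexts produce equal index values, so the index reveals equality, and only exact keyword matches are supported.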

Slide 37

Slide 37 text

@vixentael *ASVS ALE, E2EE

Slide 38

Slide 38 text

@vixentael Other exciting crypto terms Privacy-enhancing cryptography: SMPC, PSI, PIR, FHE, PAKE, OPAQUE. ZK: ZKP, zk-SNARKs, zk-STARKs, zk-SNORKs :) nist.gov/blogs/cybersecurity-insights/privacy-enhancing-cryptography-complement-differential-privacy Crypto reinforced guarantees in data structures: blockchain, Merkle-tree. PQC. hackernoon.com/eli5-zero-knowledge-proof-78a276db9eff cossacklabs.com/blog/crypto-signed-audit-logs.html aumasson.jp/data/talks/quantum-poc-2021.pdf blog.cloudflare.com/opaque-oblivious-passwords/ blog.cloudflare.com/the-tls-post-quantum-experiment/

Slide 39

Slide 39 text

@vixentael SMPC, PIR, zk-SNARKs, PQC

Slide 40

Slide 40 text

Crypto is more useful when integrated with traditional security controls.

Slide 41

Slide 41 text

@vixentael data encryption; access control, authN; transport encryption; access logging; honeypots, SIEMs

Slide 42

Slide 42 text

@vixentael

Slide 43

Slide 43 text

AAA, WAF, honey pots, IDS, infra mngmt, compartmentalization, access logging, jailbans, monitoring, data firewall, SIEM, HIDS, DAST, SAST, HSM, PKI, TPM, honey tokens, dependency mngmt, UEBA, IAM, TLS, TDE, API protection, obfuscation, anti-RE. @vixentael csrc.nist.gov/publications/detail/sp/800-53/rev-5/final RTFM

Slide 44

Slide 44 text

@vixentael Security controls to support crypto 1. Use encryption to protect sensitive data globally during the whole data flow. 2. Whatever the attack vector, there is a defense layer. 3. For the most popular attack vectors, set up as many independent and overlapping defenses as possible. ✅ ✅ ✅

Slide 45

Slide 45 text

@vixentael *ASVS ALE, E2EE crypto + security controls

Slide 46

Slide 46 text

Let’s see some real-world cases

Slide 47

Slide 47 text

Field-level data encryption for SaaS platforms

Slide 48

Slide 48 text

@vixentael Who we are and what we want Huge B2B SaaS platforms. Protect from insiders, provide transparency, detect malicious users. Encrypt data per customer, using different keys, BYOK. Minimize the lifetime of plaintext data – encrypt as close as possible to the data generation point.

Slide 49

Slide 49 text

application @vixentael Client-side field-level encryption (MongoDB) Diagram labels: application with MongoDB SDK (encryption / decryption); key vault, KMS; writes records with encrypted fields (TLS); reads records with encrypted fields (TLS); MongoDB stores records with encrypted fields. docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide

Slide 50

Slide 50 text

@vixentael Key hierarchy docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide MongoDB: field, encrypted by DEK (millions). Key Vault: DEK, encrypted by CMK (dozens). KMS: CMK (1).

Slide 51

Slide 51 text

@vixentael Pros & Cons docs.mongodb.com/drivers/security/client-side-field-level-encryption-guide Extremely useful when you have MongoDB. Client apps share responsibility for en/decryption and key management due to exposure to key material. Deterministic & non-deterministic encryption available. Support for different encryption keys (DEK).

Slide 52

Slide 52 text

application @vixentael Proxy-side field-level encryption (Acra) github.com/cossacklabs/acra Diagram labels: application; Acra proxy (encryption / decryption); key vault, KMS; writes records with encrypted fields; reads records with encrypted fields; database stores records with encrypted fields; TLS on three links.

Slide 53

Slide 53 text

@vixentael Key hierarchy Database: field encrypted by DEK, DEK encrypted by KEK (millions). Key Vault: KEK, encrypted by CMK (dozens). KMS: CMK (1). github.com/cossacklabs/acra

Slide 54

Slide 54 text

@vixentael Pros & Cons Neither the database nor the application knows that the data is encrypted. Proxy app is responsible for en/decryption. Easy to scale and build DAO-based architectures. Non-deterministic and searchable encryption available. Support for different encryption keys, BYOK, key rotation, revocation, etc. Easy to armor a single encryption layer with specific controls: DLP, anomaly detection, firewalling, anonymisation, etc. github.com/cossacklabs/acra

Slide 55

Slide 55 text

@vixentael ALE for NoCode platform API frontend database DAO encryption integration API frontend database DAO encryption integration customer’s pod NoCode platform tech logs, analytics ... ...

Slide 56

Slide 56 text

@vixentael ALE for NoCode platform Diagram labels: MongoDB; key vault, KMS; DAO + ALE encryption module; API; frontend; fields are encrypted + TLS; fields are plaintext + TLS.

Slide 57

Slide 57 text

@vixentael Crypto + supporting controls Crypto: 1. Key management, separate key per customer (BYOK). 2. Full compartmentalization: customers’ data is located in different DBs, encrypted by different keys, each app uses its own DAO. 3. Full transparency – the platform doesn’t have access to customers’ data. Supporting controls: 1. Logging, monitoring. 2. Alerting on suspicious activity, firewalling. 3. ASVS: API protection, anti throttling, anti-fraud, access control …

Slide 58

Slide 58 text

@vixentael ALE for fintech platform Diagram labels: Service1, Service2, ... ServiceN; load balancing; DAO1 ... DAON; encryption / decryption layer; key vault, KMS; PostgreSQL, MySQL, ...; BI, Analytics; links are marked either “fields are encrypted + TLS” or “fields are plaintext + TLS”.

Slide 59

Slide 59 text

@vixentael Crypto + supporting controls Crypto: 1. Key management, separate keys per domain. 2. Decryption & anonymisation of data for BI software. 3. Isolation of sensitive data from non-sensitive. Supporting controls: 1. Logging, monitoring. 2. PCI DSS audit logging + crypto-verifiable logging. 3. Alerting on suspicious activity. 4. AppSec measures for DAO.
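
One common way to make audit logs crypto-verifiable is to chain entries with a keyed MAC, so that editing or removing an entry breaks every later tag. The sketch below shows that general idea only; it is not the logging scheme of the platform described in the talk, and the function names, entry layout, and key handling are assumptions for illustration.

```python
import hashlib
import hmac
import json

GENESIS_TAG = "00" * 32  # fixed starting value for the chain (assumption)


def _chain_tag(key: bytes, prev_tag: str, payload: str) -> str:
    """HMAC-SHA256 over the previous tag plus the serialized event."""
    message = bytes.fromhex(prev_tag) + payload.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> None:
    """Append an event, binding it to the tag of the previous entry."""
    prev_tag = log[-1]["tag"] if log else GENESIS_TAG
    payload = json.dumps(event, sort_keys=True)
    log.append({"event": payload, "tag": _chain_tag(key, prev_tag, payload)})


def verify_log(log: list, key: bytes) -> bool:
    """Recompute the chain from the start; any edit or removal in the middle fails."""
    prev_tag = GENESIS_TAG
    for entry in log:
        expected = _chain_tag(key, prev_tag, entry["event"])
        if not hmac.compare_digest(expected, entry["tag"]):
            return False
        prev_tag = entry["tag"]
    return True


log_key = b"demo-key-kept-outside-the-logging-host"  # hypothetical key
audit_log = []
append_entry(audit_log, log_key, {"actor": "dao1", "action": "decrypt", "field": "pan"})
append_entry(audit_log, log_key, {"actor": "bi", "action": "read", "field": "pan_masked"})
append_entry(audit_log, log_key, {"actor": "dao2", "action": "encrypt", "field": "iban"})
```

Note that a plain chain like this does not detect truncation of the most recent entries; that needs an external anchor, such as periodically publishing the latest tag elsewhere.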

Slide 60

Slide 60 text

Crypto-based ML models protection

Slide 61

Slide 61 text

@vixentael Who we are and what we want AI/ML-driven product with unique IP. Paid feature. Server-side generates ML models, mobile-side executes them. ML models are unique per user, per app, per request (IML). Protecting them is crucial. Risk: leakage of IP. Impact: loss of IP, competitor advantage, investments into updating ML model. Losing 1 IML is not a problem, losing many IMLs is. Risk: broken apps, cloned apps, API fraud. Impact: abuse of infrastructure, revenue loss, abuse of IP, competitor advantage, reputation risks.

Slide 62

Slide 62 text

API @vixentael IML data flow Diagram labels: user data; training servers, main ML infra; GCE worker, TF, generating IMLs; GCP, storing IMLs; native iOS app; native Android app.

Slide 63

Slide 63 text

@vixentael Encryption layer API GCE worker, TF native iOS app native Android app GCP, storing IMLs encrypts each IML stores encrypted decrypts IML decrypts IML

Slide 64

Slide 64 text

@vixentael Encryption scheme (GCE worker, TF) Stages: generation, encryption, storage, transfer + TLS, transfer + TLS, decryption, re-encryption & storage, execution; each stage is labelled with the form the model has there (IML or encryptedIML). IMLs are encrypted after generation using ALE, with a unique key per encryption. Transmitted using TLS. Then re-encrypted on device. github.com/cossacklabs/themis

Slide 65

Slide 65 text

@vixentael IML encryption & decryption (GCE worker, TF) App: 1. Generate keypair. Send app.publicKey to backend. Server: 2. Generate keypair. Use server.privateKey and app.publicKey to derive sharedKey (ECDH). 3. Generate random DEK. 4. Encrypt IML using DEK, AES-256-GCM. 5. Encrypt DEK using sharedKey, AES-256-GCM. 6. Send { encryptedIML, encryptedDEK, server.publicKey }. App: 7. Receive. Use app.privateKey and server.publicKey to derive sharedKey. 8. Decrypt DEK, decrypt IML.

Slide 66

Slide 66 text

@vixentael IML format
{
  "data": base64_str(encrypted_IML),
  "key": base64_str(encrypted_DEK),
  "public_key": server_ephemeral_public_key,
  "version": MODEL_VERSION,
  "layers": {
    // additional ML layers encryption
  }
}
speakerdeck.com/vixentael/cryptographic-protection-of-ml-models

Slide 67

Slide 67 text

@vixentael Hybrid Public Key Encryption (HPKE) datatracker.ietf.org/doc/draft-irtf-cfrg-hpke/ Encrypt data with a symmetric key using AEAD; encapsulate the symmetric key with a public key scheme. The RFC describes an approach that was already in use and implies standardization.

Slide 68

Slide 68 text

@vixentael Supporting security controls Diagram labels: API (AuthN, AppSec, API sec, anti-fraud, crypto); GCP, storing IMLs (AuthN, TTL, ACL, crypto); native iOS app (crypto, anti-RE, AppSec); native Android app (crypto, anti-RE, AppSec); GCE worker, TF (crypto, logging, monitoring, AppSec).

Slide 69

Slide 69 text

@vixentael Cloud storage security 101 1. IMLs are stored for a minimal time – apps are expected to grab their IML quickly. 2. URL TTL (expires after minutes). 3. URL authentication & access control. 4. Clean up IML files (every hour). 5. Do not back up IMLs. 6. URLs are not logged. 7. Monitoring of access errors. (also see OWASP WSTG-CONF-11) AuthN TTL ACL
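
Points 2 and 3 above (URL TTL plus URL authentication) can be sketched as a download URL signed with HMAC and carrying an expiry timestamp. This is a generic illustration, not the signed-URL format of GCP or any other cloud provider; the parameter names, the host, and the message layout are assumptions.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(key: bytes, path: str, user_id: str, expires: int) -> str:
    """HMAC-SHA256 over the resource path, the user, and the expiry time."""
    message = f"{path}|{user_id}|{expires}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, user_id: str, key: bytes, ttl_seconds: int, now=None) -> str:
    """Return a URL that is valid for one user and only for ttl_seconds."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    sig = _signature(key, urlparse(base_url).path, user_id, expires)
    query = urlencode({"user": user_id, "expires": expires, "sig": sig})
    return f"{base_url}?{query}"


def verify_url(url: str, key: bytes, now=None) -> bool:
    """Accept the URL only if the signature matches and the TTL has not passed."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        user_id = params["user"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(key, parsed.path, user_id, expires)
    return hmac.compare_digest(expected.encode(), sig.encode()) and now <= expires


url_key = b"hypothetical-url-signing-key"
signed = sign_url("https://storage.example.com/iml/abc123.bin", "user-42", url_key, 300, now=1000)
```

Because the expiry and the user are part of the signed message, changing either one in the query string invalidates the URL.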

Slide 70

Slide 70 text

@vixentael API protection 101 1. User authN, IMLs are available only after successful authN. 2. API limits, requests throttling, firewalling. 3. IML request limits – after N model requests, server returns error. (also see OWASP ASVS :) ) API AuthN AppSec API sec
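
Points 2 and 3 above can be sketched as a small in-memory guard that combines a sliding-window throttle with a total per-user quota of model requests. The class name, the return values, and the limits are hypothetical; a production service would keep these counters in shared storage rather than in process memory.

```python
from collections import defaultdict, deque


class ModelRequestGuard:
    """Per-user throttle (requests per window) plus a total IML request quota."""

    def __init__(self, per_window: int, window_seconds: float, total_quota: int):
        self.per_window = per_window
        self.window_seconds = window_seconds
        self.total_quota = total_quota
        self._recent = defaultdict(deque)  # user_id -> timestamps inside the window
        self._total = defaultdict(int)     # user_id -> models served so far

    def check(self, user_id: str, now: float) -> str:
        """Return "ok", "throttled", or "quota_exceeded" for one model request."""
        if self._total[user_id] >= self.total_quota:
            return "quota_exceeded"  # after N model requests the server returns an error
        recent = self._recent[user_id]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()  # forget requests that left the window
        if len(recent) >= self.per_window:
            return "throttled"
        recent.append(now)
        self._total[user_id] += 1
        return "ok"


guard = ModelRequestGuard(per_window=2, window_seconds=60, total_quota=3)
results = [guard.check("user-1", t) for t in (0, 1, 2, 61, 200)]
```

Throttled requests are not counted against the quota in this sketch; whether they should be is a design choice.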

Slide 71

Slide 71 text

@vixentael Anti-fraud system 201 1. Limit access to IML based on user behaviour. 2. Gather events from mobile apps and from server side. 3. Calculate user scoring based on events (“stop-factors”, rules). 4. User scoring: OK, suspicious, malicious. 5. Block malicious, limit suspicious. anti-fraud API

Slide 72

Slide 72 text

@vixentael Anti-fraud system 201 🛑 Stop factors: JB detected; same public key, different device; invalid app signature; remote device attestation failed. 🤨 Implicative rules: URL download failure; app reinstall; too many requests; keychain not accessible. Other events shown: wrong API version, …, honey token, deviceID, … Scoring: malicious, suspicious, OK.
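
The scoring described on the two anti-fraud slides can be sketched as a simple rule evaluation: any stop factor marks the user as malicious, and enough implicative events mark the user as suspicious. The event identifiers and the threshold of two implicative events are assumptions made for this illustration; the talk does not state how implicative rules are weighted.

```python
# Event names are hypothetical identifiers for the factors listed on the slide.
STOP_FACTORS = {
    "jailbreak_detected",
    "same_public_key_different_device",
    "invalid_app_signature",
    "device_attestation_failed",
}
IMPLICATIVE_RULES = {
    "url_download_failure",
    "app_reinstall",
    "too_many_requests",
    "keychain_not_accessible",
}


def score_user(events: list, suspicious_threshold: int = 2) -> str:
    """Classify a user as "ok", "suspicious", or "malicious" from observed events."""
    if any(event in STOP_FACTORS for event in events):
        return "malicious"  # a single stop factor is enough
    hits = sum(1 for event in events if event in IMPLICATIVE_RULES)
    # The threshold is an assumption of this sketch, not a value from the talk.
    return "suspicious" if hits >= suspicious_threshold else "ok"


def access_decision(score: str) -> str:
    """Block malicious users, limit suspicious ones, allow the rest."""
    return {"ok": "allow", "suspicious": "limit", "malicious": "block"}[score]


decision = access_decision(score_user(["app_reinstall", "too_many_requests"]))
```

Events from the mobile apps and from the server side would feed the same `events` list, so one scoring function covers both sources.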

Slide 73

Slide 73 text

@vixentael Remote device attestation Apple DeviceCheck: developer.apple.com/documentation/devicecheck Android SafetyNet: developer.android.com/training/safetynet/attestation 1. Use as part of user authN. 2. Use as source for anti-fraud system. 3. Block apps not installed from stores.

Slide 74

Slide 74 text

@vixentael Read the full story! speakerdeck.com/vixentael/cryptographic-protection-of-ml-models Cryptographic protection of ML models

Slide 75

Slide 75 text

CRDT & E2EE

Slide 76

Slide 76 text

@vixentael Who we are and what we want CRDT-based mobile-first product. Users create shared spaces and collaborate on visuals and texts together. Encrypt users’ data but allow them to collaborate. Martin Kleppmann discussed other approaches: speakerdeck.com/ept/adapting-secure-group-messaging-for-encrypted-crdts

Slide 77

Slide 77 text

@vixentael CRDT log encryption strategy 1. The main problem – how to reduce the problem to a typical one. 2. We selected document-based encryption, not chat-based. 3. Encrypt payload or action + payload. Trade-off: the more the server knows, the faster the merges; the less the server knows, the better the security. 4. Use the same encryption key for all entries of the document. 5. Tricky part: “invite” and “revoke” users: • give users access to the Document Key (“invite”) by encrypting it for each user. • define key rotation period. • pre-keying, double ratchet – overkill.

Slide 78

Slide 78 text

log entries protection (e2ee) @vixentael

Slide 79

Slide 79 text

passphrase encryption, hint encryption, zeroing secrets, secure key sharing, auto-locking timer, failed attempts counter, encrypted user settings, log entries protection (e2ee), obfuscation, anti-RE & anti-debugging, good TLS, authZ / authN @vixentael

Slide 80

Slide 80 text

@vixentael *ASVS ALE, E2EE crypto + security controls

Slide 81

Slide 81 text

Encryption is not that hard. Key management is a bit harder.

Slide 82

Slide 82 text

Encryption is not that hard. Key management is a bit harder. Crypto + key management + data flow control + security controls… Welcome to the real world :)

Slide 83

Slide 83 text

@vixentael Don’t hesitate to talk to me if you have questions about data security and cryptography. Esp E2EE. vixentael.dev cossacklabs.com