Slide 1

Slide 1 text

Protect sensitive data in pipeline with Tink
Burasakorn Sabyeying (Mils)
Data Engineer, CJ Express. Women Techmakers Ambassador

Slide 2

Slide 2 text

Why

Slide 3

Slide 3 text

What is data encryption?
https://info.townsendsecurity.com/definitive-guide-to-encryption-key-management-fundamentals
Encryption at rest
Encryption in transit

Slide 4

Slide 4 text

Our Data Pipeline

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Team Requirement
1. Encrypt sensitive data before sending it to storage
2. Allow only specific roles/people to decrypt the data in BigQuery

Slide 7

Slide 7 text

Before After

Slide 8

Slide 8 text

How can we apply encryption to data pipeline?

Slide 9

Slide 9 text

Cloud Storage always encrypts your data on the server side, before it is written to disk, at no additional charge (default way)

Slide 10

Slide 10 text

Two additional encryption options:
1. Server-side encryption: encryption that occurs after Cloud Storage receives your data, but before the data is written to disk and stored. E.g. CMEK (customer-managed encryption keys), CSEK (customer-supplied encryption keys)
2. Client-side encryption: encryption that occurs before Cloud Storage receives your data

Slide 11

Slide 11 text

You can use Google's open source cryptographic SDK, Tink, to perform client-side encryption, then protect your keys with Cloud Key Management Service.

Slide 12

Slide 12 text

What is Tink ?

Slide 13

Slide 13 text

What is Tink?
An open-source cryptography library written by cryptographers and security engineers at Google
1. It makes it unnecessary for every team at Google to independently develop their own cryptography.
2. Tink also supports encrypting or storing keys in Amazon KMS, Google Cloud KMS, Android Keystore, and iOS Keychain.
3. You can build Tink from source or use language-specific packages, e.g. C++, Go, Java, ObjC, Python.

Slide 14

Slide 14 text

A key generated by Tink using AES256-GCM, then wrapped by a key from Cloud KMS

Slide 15

Slide 15 text

Cloud KMS = create and manage encryption keys for use in compatible Google Cloud services and in your own applications.
Key URI format: gcp-kms://projects//locations//keyRings//cryptoKeys//cryptoKeyVersions/
Example: "gcp-kms://projects/mils-project-2023/locations/asia-southeast1/keyRings/key-ring-1/cryptoKeys/key-name"
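The key URI on this slide can be assembled and checked with a small helper. This is a minimal sketch using only the standard library; the function names `build_key_uri` and `parse_key_uri` are hypothetical and not part of Tink or Cloud KMS, and the optional version segment is assumed from the URI pattern shown on the slide.

```python
import re
from typing import Dict, Optional

# Pattern assumed from the slide: the cryptoKeyVersions segment is optional.
_KEY_URI = re.compile(
    r"^gcp-kms://projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)"
    r"/cryptoKeys/(?P<key>[^/]+)"
    r"(?:/cryptoKeyVersions/(?P<version>[^/]+))?$"
)


def build_key_uri(project: str, location: str, key_ring: str, key: str,
                  version: Optional[str] = None) -> str:
    """Assemble a gcp-kms:// key URI from its parts (hypothetical helper)."""
    uri = (
        f"gcp-kms://projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )
    if version is not None:
        uri += f"/cryptoKeyVersions/{version}"
    return uri


def parse_key_uri(uri: str) -> Dict[str, str]:
    """Split a gcp-kms:// key URI into its parts, or raise ValueError."""
    match = _KEY_URI.match(uri)
    if match is None:
        raise ValueError(f"not a Cloud KMS key URI: {uri!r}")
    # Drop the version entry when the URI does not carry one.
    return {name: value for name, value in match.groupdict().items() if value is not None}
```

Validating the URI before handing it to a KMS client catches typos (a missing keyRings segment, for example) early, before any network call is made.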

Slide 16

Slide 16 text

Envelope Encryption

Slide 17

Slide 17 text

Envelope encryption = the process of encrypting a key with another key
1. Data encryption key (DEK): the key used to encrypt the data itself
2. Key encryption key (KEK): the key that encrypts (wraps) the DEK
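The DEK/KEK relationship can be sketched in a few lines. This is an illustration of the flow only, under loud assumptions: `toy_encrypt` and `toy_decrypt` are hypothetical stand-ins built from the standard library so the sketch runs anywhere, they are NOT AES-GCM and NOT secure, and the KEK is a local byte string rather than a key held in Cloud KMS. In a real pipeline both roles would be played by Tink AEAD primitives and Cloud KMS.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy stand-in, NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Stand-in for an AEAD encrypt call: nonce + body + 32-byte HMAC tag."""
    nonce = secrets.token_bytes(12)
    body = _keystream_xor(key, nonce, plaintext)
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Stand-in for an AEAD decrypt call: verify the tag, then decrypt."""
    nonce, body, tag = ciphertext[:12], ciphertext[12:-32], ciphertext[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return _keystream_xor(key, nonce, body)


def envelope_encrypt(kek: bytes, plaintext: bytes) -> Tuple[bytes, bytes]:
    """Encrypt data with a fresh DEK, then wrap the DEK with the KEK."""
    dek = secrets.token_bytes(32)          # new DEK for every write
    ciphertext = toy_encrypt(dek, plaintext)
    wrapped_dek = toy_encrypt(kek, dek)    # only the wrapped DEK is stored
    return wrapped_dek, ciphertext


def envelope_decrypt(kek: bytes, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    """Unwrap the DEK with the KEK, then decrypt the data with the DEK."""
    dek = toy_decrypt(kek, wrapped_dek)
    return toy_decrypt(dek, ciphertext)
```

The wrapped DEK travels with the ciphertext, while the KEK never leaves the key manager; whoever cannot use the KEK cannot recover the DEK, and therefore cannot read the data.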

Slide 18

Slide 18 text

Encryption + Decryption

Slide 19

Slide 19 text

Some Best Practices
1. Store the DEK near the data that it encrypts.
2. Generate a new DEK every time you write the data, so the DEK itself does not need rotating; rotate the KEK regularly.
3. Do not use the same DEK to encrypt data from two different users.
4. Use a strong algorithm such as 256-bit Advanced Encryption Standard (AES) in Galois Counter Mode (GCM).
5. Store KEKs centrally.

Slide 20

Slide 20 text

Benefits
1. You need both keys to decrypt data
2. Symmetric key algorithms work faster
3. Cloud KMS was designed to manage KEKs
4. A single KEK can wrap all DEKs, so individual data objects can have their own DEK without massively increasing the volume of keys stored in a central KMS
Role: Cloud KMS CryptoKey Encrypter/Decrypter
Via Delegation

Slide 21

Slide 21 text

Can we use only Cloud KMS?

Slide 22

Slide 22 text

We want to decrypt data directly in BigQuery, using its AEAD encryption functions
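As a rough illustration of what decrypting in BigQuery could look like, the sketch below builds a query that combines `AEAD.DECRYPT_STRING` with `KEYS.KEYSET_CHAIN`, which takes a Cloud KMS key URI and a KMS-wrapped Tink keyset. The helper names, the table and column names, and the `_plain` output suffix are all hypothetical; the wrapped keyset is assumed to be the bytes produced when Tink wrote the keyset encrypted under the KMS key.

```python
import re

_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TABLE = re.compile(r"^[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+){1,2}$")
_UNSAFE = re.compile(r"['\\\n]")  # characters that would break a quoted literal


def bytes_literal(data: bytes) -> str:
    """Render bytes as a BigQuery bytes literal using \\x hex escapes."""
    return "b'" + "".join(f"\\x{byte:02x}" for byte in data) + "'"


def build_decrypt_query(table: str, column: str, kms_key_uri: str,
                        wrapped_keyset: bytes, additional_data: str = "") -> str:
    """Build a SELECT that decrypts one column (hypothetical helper)."""
    if not _TABLE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not _COLUMN.match(column):
        raise ValueError(f"unexpected column name: {column!r}")
    if not kms_key_uri.startswith("gcp-kms://") or _UNSAFE.search(kms_key_uri):
        raise ValueError(f"unexpected KMS key URI: {kms_key_uri!r}")
    if _UNSAFE.search(additional_data):
        raise ValueError("additional_data contains unsupported characters")
    keyset = f"KEYS.KEYSET_CHAIN('{kms_key_uri}', {bytes_literal(wrapped_keyset)})"
    return (
        f"SELECT AEAD.DECRYPT_STRING({keyset}, {column}, '{additional_data}') "
        f"AS {column}_plain FROM `{table}`"
    )
```

Because the keyset is wrapped by the KMS key, only principals allowed to use that key can run the query successfully, which is how access can be limited to specific roles.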

Slide 23

Slide 23 text

Protect sensitive data in pipeline with Tink (and Cloud KMS as Envelope Encryption)

Slide 24

Slide 24 text

Demo

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Key Takeaway
1. Envelope encryption with Tink and Cloud KMS / client-side encryption with Tink and Cloud KMS
2. Tink = Python, open source, easy to use
3. Able to work with BigQuery AEAD encryption functions

Slide 27

Slide 27 text

Further Reading
1. Client-side encryption with Tink and Cloud KMS https://cloud.google.com/kms/docs/client-side-encryption
2. Data encryption with Cloud KMS + Tink (and using it together with BigQuery Encryption Functions) by Jamie https://medium.com/cj-express-tech-tildi/81590305b314
3. Encrypting data securely with Envelope Encryption by Manatsawin https://life.wongnai.com/envelope-encryption-f93837e5309f

Slide 28

Slide 28 text

More tech blogs!
Mesodiar.com (Medium)
Mils’ Blog (FB page)

Slide 29

Slide 29 text

Join the Women Techmakers Members community today!
+ Access exclusive members-only content
+ Sneak peeks at upcoming events
+ Connect with other women like yourself
bit.ly/wtmmembership