Slide 1

Slide 1 text

Role Based Access Control in Real-Time Streaming Data: What, Why and How
Hojjat Jafarpour, Founder & CEO @ DeltaStream, Inc.
@hojjat

Slide 2

Slide 2 text

Streaming Data is Everywhere
IoT, InfoSec, Machine Generated Data, Gaming, Mobile, ML/AI, …

Slide 3

Slide 3 text

Streaming Storage
● Apache Kafka, AWS Kinesis, Apache Pulsar, ...
● Backbone of streaming data
● Decouple producers and consumers
● Make data available in real-time (low latency)
[Diagram: Streaming Storage (Kafka, Kinesis, …)]

Slide 4

Slide 4 text

Access Control
● Access Control Lists (ACLs)
  ○ Good for smaller scale

Slide 5

Slide 5 text

Access Control
● Access Control Lists (ACLs)
  ○ Good for smaller scale
● Role-based Access Control (RBAC)
  ○ Access control privileges
  ○ Roles
  ○ Resources
  ○ Privileges determine which role can access and perform operations on specific resources.
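The RBAC idea on this slide can be sketched in a few lines of Python. This is only an illustration of the concept, not any vendor's implementation; the role name, operation names, and resource names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A named role holding (operation, resource) privileges."""
    name: str
    privileges: set = field(default_factory=set)

    def grant(self, operation: str, resource: str) -> None:
        """Attach a privilege to this role."""
        self.privileges.add((operation, resource))


def is_allowed(user_roles, operation: str, resource: str) -> bool:
    """Return True if any of the user's roles holds the privilege."""
    return any((operation, resource) in role.privileges for role in user_roles)


# Hypothetical role and resource names, for illustration only.
reader = Role("orders_reader")
reader.grant("read", "topic:orders")

print(is_allowed([reader], "read", "topic:orders"))   # True
print(is_allowed([reader], "write", "topic:orders"))  # False
```

The point of the indirection is that users never hold privileges directly: changing what a role can do changes access for every user assigned to it.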

Slide 6

Slide 6 text

Example 1: Confluent Cloud
● Allows access control to
  ○ Organization, environment, cluster, granular Kafka resources (topics, consumer groups, and transactional IDs), Schema Registry (SR) and ksqlDB resources
https://docs.confluent.io/cloud/current/access-management/access-control/cloud-rbac.html

Slide 7

Slide 7 text

Example 1: Confluent Cloud
● Allows access control to
  ○ Organization, environment, cluster, granular Kafka resources (topics, consumer groups, and transactional IDs), Schema Registry (SR) and ksqlDB resources
● Roles are predefined
  ○ As of now there are 13 available roles
    ■ OrganizationAdmin, EnvironmentAdmin, CloudClusterAdmin, Operator, ...
https://docs.confluent.io/cloud/current/access-management/access-control/cloud-rbac.html

Slide 8

Slide 8 text

Example 1: Confluent Cloud
● Allows access control to
  ○ Organization, environment, cluster, granular Kafka resources (topics, consumer groups, and transactional IDs), Schema Registry (SR) and ksqlDB resources
● Roles are predefined
  ○ As of now there are 13 available roles
    ■ OrganizationAdmin, EnvironmentAdmin, CloudClusterAdmin, Operator, ...
● Each role has View Scope and Admin Scope
https://docs.confluent.io/cloud/current/access-management/access-control/cloud-rbac.html

Slide 9

Slide 9 text

Example 1: Confluent Cloud
● Role bindings
  ○ Permissions for principals
    ■ Which roles are assigned to a given user
    ■ e.g., grant permissions to a new user

Slide 10

Slide 10 text

Example 1: Confluent Cloud
● Role bindings
  ○ Permissions for principals
    ■ Which roles are assigned to a given user
    ■ e.g., grant permissions to a new user
  ○ Permissions on resources
    ■ Which roles can access a given resource
    ■ e.g., grant roles permissions on a resource such as a new cluster
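The two views of role bindings described above (by principal and by resource) can be sketched as two lookups over the same list of bindings. This is a conceptual sketch and does not call any Confluent API; the principals and resource identifiers are hypothetical placeholders, while the role names come from the slides.

```python
# Each binding ties a principal to a role on a resource.
# Principals and resource identifiers are hypothetical placeholders.
bindings = [
    ("User:alice", "CloudClusterAdmin", "cluster:example-cluster"),
    ("User:bob", "ResourceOwner", "topic:orders"),
    ("User:alice", "ResourceOwner", "topic:orders"),
]


def roles_for_principal(principal):
    """View 1: which roles (and on which resources) a principal holds."""
    return sorted((role, res) for p, role, res in bindings if p == principal)


def roles_on_resource(resource):
    """View 2: which roles are bound on a given resource."""
    return sorted({role for _, role, res in bindings if res == resource})


print(roles_for_principal("User:alice"))
print(roles_on_resource("topic:orders"))
```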

Slide 11

Slide 11 text

Example 1: Confluent Cloud
● Example role bindings

    confluent iam rbac role-binding create --principal User:u-a03bcd --role CloudClusterAdmin --environment env-nx5jd --cloud-cluster lkc-xyxmz

    +--------------+-------------------+
    | Principal    | User:u-a03bcd     |
    | Role         | CloudClusterAdmin |
    | ResourceType | Cluster           |
    +--------------+-------------------+

Slide 12

Slide 12 text

Example 1: Confluent Cloud
● Example role bindings

    confluent iam rbac role-binding create --principal User:u-e03vqq --role ResourceOwner \
        --environment env-nx5jd --cloud-cluster lkc-xyxmz --kafka-cluster-id lkc-xyxmz \
        --resource Topic:connect-config

    +----------------+----------------+
    | Principal      | User:u-e03vqq  |
    | Email          |                |
    | Role           | ResourceOwner  |
    | Environment    |                |
    | CloudCluster   |                |
    | ClusterType    |                |
    | LogicalCluster |                |
    | ResourceType   | Topic          |
    | Name           | connect-config |
    | PatternType    | LITERAL        |
    +----------------+----------------+

Slide 13

Slide 13 text

Example 1: Confluent Cloud
● Need to use Confluent Cloud Console, CLI or API
● Confluent-only commands
● Limited roles available

Slide 14

Slide 14 text

Example 2: AWS MSK
● AWS MSK uses AWS Identity and Access Management (IAM) for authentication and authorization

Slide 15

Slide 15 text

Example 2: AWS MSK
● AWS MSK uses AWS Identity and Access Management (IAM) for authentication and authorization
● Uses Authorization Policies
  ○ Specifies which actions to allow or deny on a resource for a role

Slide 16

Slide 16 text

Example 2: AWS MSK
● There is a list of available actions
  ○ kafka-cluster:Connect, kafka-cluster:DescribeCluster, ...
  ○ Some actions depend on others
    ■ A policy that grants an action should also grant the actions it depends on

Slide 17

Slide 17 text

Example 2: AWS MSK
● There is a list of available actions
  ○ kafka-cluster:Connect, kafka-cluster:DescribeCluster, ...
  ○ Some actions depend on others
    ■ A policy that grants an action should also grant the actions it depends on
● Four types of resources that can be used in an authorization policy
  ○ Cluster, Topic, Group and Transactional ID
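The dependency rule above lends itself to a small pre-flight check on a policy's action list. The sketch below is a hypothetical helper, not part of any AWS SDK, and the dependency map covers only the three actions that appear in this talk; treat it as illustrative and consult the AWS MSK documentation for the authoritative list.

```python
# Illustrative dependency map for the three actions used in this talk.
# Assumption: the authoritative list lives in the AWS MSK documentation.
DEPENDS_ON = {
    "kafka-cluster:Connect": set(),
    "kafka-cluster:DescribeCluster": {"kafka-cluster:Connect"},
    "kafka-cluster:AlterCluster": {
        "kafka-cluster:Connect",
        "kafka-cluster:DescribeCluster",
    },
}


def missing_dependencies(actions):
    """Return actions required (directly or transitively) but not listed."""
    granted = set(actions)
    required = set()
    stack = list(granted)
    while stack:
        action = stack.pop()
        for dep in DEPENDS_ON.get(action, set()):
            if dep not in required:
                required.add(dep)
                stack.append(dep)
    return required - granted


print(sorted(missing_dependencies(["kafka-cluster:AlterCluster"])))
```

Running a check like this before attaching a policy catches the case where an action is granted but cannot be used because one of its prerequisites was left out.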

Slide 18

Slide 18 text

Example 2: AWS MSK

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "kafka-cluster:Connect",
                    "kafka-cluster:AlterCluster",
                    "kafka-cluster:DescribeCluster"
                ],
                "Resource": [
                    "arn:aws:kafka:us-east-1:0123456789012:cluster/MyTestCluster/abcd1234-0123-abcd-5678-1234abcd-1"
                ]
            }
        ]
    }
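To make the shape of such a policy concrete, here is a deliberately simplified evaluator in Python. It is a sketch under stated assumptions, not how IAM actually evaluates requests: real evaluation also involves principals, conditions, multiple policy types, and case-insensitive action matching. The account ID, cluster name, and the `is_permitted` helper are hypothetical. The one rule it does mirror is that an explicit Deny overrides any Allow.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical policy; the account ID and cluster name are placeholders.
POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["kafka-cluster:Connect", "kafka-cluster:DescribeCluster"],
      "Resource": ["arn:aws:kafka:us-east-1:111122223333:cluster/ExampleCluster/*"]
    }
  ]
}
""")


def is_permitted(policy, action, resource):
    """Simplified check: explicit Deny wins, otherwise any matching Allow."""
    allowed = False
    for stmt in policy["Statement"]:
        matches = any(fnmatchcase(action, a) for a in stmt["Action"]) and any(
            fnmatchcase(resource, r) for r in stmt["Resource"]
        )
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


arn = "arn:aws:kafka:us-east-1:111122223333:cluster/ExampleCluster/abc"
print(is_permitted(POLICY, "kafka-cluster:Connect", arn))       # True
print(is_permitted(POLICY, "kafka-cluster:AlterCluster", arn))  # False
```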

Slide 19

Slide 19 text

Example 2: AWS MSK
● Need to use AWS IAM
● AWS IAM requests only
● Can be used through AWS Management Console, the API, or the AWS CLI

Slide 20

Slide 20 text

Is there an easier way to have RBAC for streaming data?

Slide 21

Slide 21 text

RBAC in Relational Systems

Slide 22

Slide 22 text

RBAC in Relational Systems
● The most familiar for data at rest

Slide 23

Slide 23 text

RBAC in Relational Systems
● The most familiar for data at rest
● Widely used and large user base
  ○ From open source relational databases such as Postgres to commercial cloud data warehouses

Slide 24

Slide 24 text

RBAC in Relational Systems
● The most familiar for data at rest
● Widely used and large user base
  ○ From open source relational databases such as Postgres to commercial cloud data warehouses
● Why not use the same model for data in motion (streaming data)?

Slide 25

Slide 25 text

Relational Model for Streams

Slide 26

Slide 26 text

Relational Model for Streams
Relational model on streaming data

Slide 27

Slide 27 text

Relational Model for Streams
Relational model on streaming data
Organize, Process, Secure

Slide 28

Slide 28 text

Example
● Two Kafka clusters, one on Confluent Cloud (CC_1) and one on AWS (MSK_1)
● Two topics, orders and customers
[Diagram: CC_1, MSK_1; orders, customers]

Slide 29

Slide 29 text

Hierarchical Namespacing
● Same as relational systems
  ○ Relation: static or dynamic set of tuples
    ■ STREAM
    ■ CHANGELOG
    ■ MATERIALIZED VIEW
    ■ TABLE

Slide 30

Slide 30 text

Hierarchical Namespacing
● Same as relational systems
  ○ Relation: static or dynamic set of tuples
    ■ STREAM
    ■ CHANGELOG
    ■ MATERIALIZED VIEW
    ■ TABLE
  ○ Schema: a logical grouping of relational objects such as Streams, Changelogs, Materialized Views, and Tables.

Slide 31

Slide 31 text

Hierarchical Namespacing
● Same as relational systems
  ○ Relation: static or dynamic set of tuples
    ■ STREAM
    ■ CHANGELOG
    ■ MATERIALIZED VIEW
    ■ TABLE
  ○ Schema: a logical grouping of relational objects such as Streams, Changelogs, Materialized Views, and Tables.
  ○ Database: a logical grouping of schemas.

Slide 32

Slide 32 text

Example
online_store (Database)
  public (Schema)
    orders (Stream)
    customers (Changelog)
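The database / schema / relation hierarchy maps naturally onto dotted names such as online_store.public.orders. The Python sketch below shows one way such names could be resolved. The `parse_name` helper and its defaulting behavior (falling back to a current database and schema) are assumptions borrowed from how relational systems commonly work, not something stated in the talk.

```python
from typing import NamedTuple


class QualifiedName(NamedTuple):
    """A fully qualified relation name: database.schema.relation."""
    database: str
    schema: str
    relation: str


def parse_name(name, default_database="online_store", default_schema="public"):
    """Resolve a 1-, 2-, or 3-part dotted name (hypothetical helper)."""
    parts = name.split(".")
    if not all(parts) or len(parts) > 3:
        raise ValueError(f"invalid relation name: {name!r}")
    if len(parts) == 3:
        return QualifiedName(*parts)
    if len(parts) == 2:
        return QualifiedName(default_database, parts[0], parts[1])
    return QualifiedName(default_database, default_schema, parts[0])


print(parse_name("orders"))
print(parse_name("online_store.public.customers"))
```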

Slide 33

Slide 33 text

Example
● Run continuous queries and build materialized views
  ○ Create enriched_orders by joining orders with customers.
  ○ Build a materialized view to compute daily order value per customer
[Diagram: CC_1, MSK_1; orders, customers, enriched_orders]

Slide 34

Slide 34 text

Example
online_store (Database)
  public (Schema)
    orders (Stream)
    customers (Changelog)
    enriched_orders (Stream)
    daily_spend (Materialized View)

Slide 35

Slide 35 text

Key Concepts
● User: represents an authenticated identity.

Slide 36

Slide 36 text

Key Concepts
● User: represents an authenticated identity.
● Object (Securable Object): represents an entity to which access can be granted.

Slide 37

Slide 37 text

Key Concepts
● User: represents an authenticated identity.
● Object (Securable Object): represents an entity to which access can be granted.
● Privilege: a defined level of access to an object.

Slide 38

Slide 38 text

Key Concepts
● User: represents an authenticated identity.
● Object (Securable Object): represents an entity to which access can be granted.
● Privilege: a defined level of access to an object.
● Role: an entity to which privileges can be granted.

Slide 39

Slide 39 text

RBAC for Streaming Data
● Use the same SQL syntax as the relational systems
  ○ CREATE/DROP/ALTER
  ○ GRANT/REVOKE

Slide 40

Slide 40 text

Example
● Create a new role to access the enriched_orders stream and daily_spend materialized view

    CREATE ROLE test_role_1;
    GRANT ROLE test_role_1 TO USER user_1;
    GRANT SELECT ON RELATION online_store.public.enriched_orders TO ROLE test_role_1;
    GRANT SELECT ON RELATION online_store.public.daily_spend TO ROLE test_role_1;

Slide 41

Slide 41 text

Example
● Create a new role to access all objects in online_store database

    CREATE ROLE test_role_2;
    GRANT ROLE test_role_2 TO USER user_2;
    GRANT USAGE ON DATABASE online_store TO ROLE test_role_2;
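The effect of the two GRANT examples can be modeled with a small in-memory sketch. It assumes, as the slide suggests, that a grant on a database covers every object beneath it in the namespace; how a real system scopes each privilege may differ. The `AccessControl` class and its method names are hypothetical, and distinct privilege types (SELECT, USAGE) are collapsed into plain "access" for brevity.

```python
class AccessControl:
    """Toy model of roles, grants, and hierarchical object names."""

    def __init__(self):
        self.role_grants = {}  # role name -> set of granted object paths
        self.user_roles = {}   # user name -> set of role names

    def create_role(self, role):
        self.role_grants.setdefault(role, set())

    def grant_role(self, role, user):
        self.user_roles.setdefault(user, set()).add(role)

    def grant_on(self, role, obj):
        self.role_grants[role].add(obj)

    def revoke_on(self, role, obj):
        self.role_grants[role].discard(obj)

    def can_access(self, user, obj):
        """True if a role of the user is granted on obj or any ancestor."""
        parts = obj.split(".")
        ancestors = {".".join(parts[:i]) for i in range(1, len(parts) + 1)}
        return any(
            self.role_grants.get(role, set()) & ancestors
            for role in self.user_roles.get(user, ())
        )


ac = AccessControl()
ac.create_role("test_role_1")
ac.grant_role("test_role_1", "user_1")
ac.grant_on("test_role_1", "online_store.public.enriched_orders")
ac.create_role("test_role_2")
ac.grant_role("test_role_2", "user_2")
ac.grant_on("test_role_2", "online_store")

print(ac.can_access("user_1", "online_store.public.orders"))  # False
print(ac.can_access("user_2", "online_store.public.orders"))  # True
```

Because the check walks up the dotted name, a single database-level grant replaces one grant per relation, which is the practical benefit of hierarchical namespacing for access control.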

Slide 42

Slide 42 text

Relational RBAC for Streaming Data
● Works across streaming stores
  ○ Apache Kafka, AWS Kinesis, …

Slide 43

Slide 43 text

Relational RBAC for Streaming Data
● Works across streaming stores
  ○ Apache Kafka, AWS Kinesis, …
● Familiar syntax
  ○ No need to learn new syntax/concepts

Slide 44

Slide 44 text

Relational RBAC for Streaming Data
● Works across streaming stores
  ○ Apache Kafka, AWS Kinesis, …
● Familiar syntax
  ○ No need to learn new syntax/concepts
● Hierarchical namespacing

Slide 45

Slide 45 text

Relational RBAC for Streaming Data
● Works across streaming stores
  ○ Apache Kafka, AWS Kinesis, …
● Familiar syntax
  ○ No need to learn new syntax/concepts
● Hierarchical namespacing
● Unified view of all streaming data by abstracting the streaming stores

Slide 46

Slide 46 text

Relational RBAC for Streaming Data
● Works across streaming stores
  ○ Apache Kafka, AWS Kinesis, …
● Familiar syntax
  ○ No need to learn new syntax/concepts
● Hierarchical namespacing
● Unified view of all streaming data by abstracting the streaming stores
● No limitations on roles
  ○ Can build as many custom roles as needed

Slide 47

Slide 47 text

We have built exactly this at DeltaStream.
Try all of these and much more for free at www.deltastream.io