Risk Driven Fault Injection

Slide 1

Slide 1 text

Risk Driven Fault Injection: Security Chaos Engineering for the Fast & Furious
Kennedy A. Torkura

Slide 2

Slide 2 text

Security Chaos Engineering
● What is Security Chaos Engineering
  ○ How it differs from Chaos Engineering
● Why it is important / why we are talking about it
  ○ Complexity
  ○ Increasing attacks against cloud native infrastructure
  ○ Inefficient security countermeasures
● Cloud Native Security
  ○ What it is
  ○ Challenges
● Risk-Driven Fault Injection

Slide 3

Slide 3 text

Security Chaos Engineering
"Security Chaos Engineering is the identification of security control failures through proactive experimentation to build confidence in the system's ability to defend against malicious conditions in production."
- Aaron Rinehart, Co-Founder & CTO, Verica

Slide 4

Slide 4 text

Security Chaos Engineering
Chaos Engineering
● Addresses availability problems
● Resiliency patterns
  ○ Timeouts
  ○ Bulkheads
  ○ Circuit breaker
Security Chaos Engineering
● Addresses
  ○ Availability
  ○ Integrity
  ○ Confidentiality
● Verify security patterns/controls
  ○ Preventive controls, e.g. firewalls
  ○ Detective controls, e.g. IDS
  ○ Corrective controls, e.g. incident response systems
● Aim: detect security blind spots

Slide 5

Slide 5 text

Complexity
"Complexity is the worst enemy of security" - Bruce Schneier

Slide 6

Slide 6 text

Increasing Cloud Attacks
Cloud Native Threat Report 2020 - Aqua Security Team

Slide 7

Slide 7 text

Evolving Security Challenges
99% of cloud security incidents are caused by users - Gartner
Why?
● Knowledge gap
● Insufficient tooling support

Slide 8

Slide 8 text

Evolving Security Challenges
Traditional Security
● Digital transformation
● DevOps
● CI/CD

Slide 9

Slide 9 text

Evolving Security Challenges
Modern Security
● Digital transformation
● DevOps
● CI/CD

Slide 10

Slide 10 text

Cloud Native Security
Cloud Native Security is about securing cloud native infrastructure.
The 4C’s of Cloud Native Security
● Defence-in-depth
https://kubernetes.io/docs/concepts/security/overview/#the-4c-s-of-cloud-native-security

Slide 11

Slide 11 text

Cloud Attack Paths

Slide 12

Slide 12 text

Cloud Attack Paths
(Diagram labels: container, code, cloud, cluster)

Slide 13

Slide 13 text

Cloud Native Security Platforms
● Cloud Security Posture Management
● Cloud Access Security Brokers
● Cloud Workload Protection Platforms
● SCE

Slide 14

Slide 14 text

Risk Driven Fault Injection
SCE Feedback Loop
● Adapted from the MAPE-K feedback loop used in autonomic computing systems
● PLAN: Apply the outcome of the analysis to improve security. Design and plan future security hypotheses.
● ANALYZE: Collect and analyze observations. Vulnerabilities can be ranked and prioritized.
● MONITOR: Observe and monitor the execution of security perturbations. Intervene when necessary to ensure safety.
● EXECUTE: Inject security faults based on crafted hypotheses.
● KNOWLEDGE: Security insights and information, including security fault models, detected vulnerabilities and analytical outcomes.

Slide 15

Slide 15 text

Risk Driven Fault Injection
Execute
● 100% security is a dream
● Risk driven security
  ○ Quantitative risk assessments
  ○ Data driven
● Communicate security information/analysis to management and other teams
● Measure progress

Slide 16

Slide 16 text

SCE Feedback Loop
Execute
● The aim of the experiment
● Craft a suitable hypothesis
● Determine the scope: scale, depth and intensity
● Perform sanity check
  ○ Coordinating with responsible teams (admin & social aspects)
  ○ Recoverability (IaC, Git, State Management)

Slide 17

Slide 17 text

Implementation
■ Modes of operation:
  □ Low - 30%
  □ Medium - 60%
  □ High - 90%
■ Attack scenario: chaining of multiple attack actions
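One possible reading of the modes of operation is that each percentage caps the share of candidate resources an experiment may perturb. The slide does not state what the percentages measure, so that reading is an assumption, and `MODE_PERCENT`, `select_targets` and the bucket names below are hypothetical. A minimal sketch:

```python
import math
import random

# Assumption: each mode is the share (in percent) of resources to perturb.
MODE_PERCENT = {"low": 30, "medium": 60, "high": 90}


def select_targets(resources, mode, rng=None):
    """Pick a random subset of resources sized by the mode of operation."""
    rng = rng or random.Random()
    resources = list(resources)
    if not resources:
        return []
    # Round up so that even the low mode touches at least one resource.
    count = max(1, math.ceil(len(resources) * MODE_PERCENT[mode] / 100))
    return rng.sample(resources, count)


buckets = [f"bucket-{i}" for i in range(10)]
for mode in ("low", "medium", "high"):
    print(mode, len(select_targets(buckets, mode, random.Random(7))))
```

Passing a seeded `random.Random` keeps a run reproducible, which helps when an experiment has to be repeated after remediation.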

Slide 18

Slide 18 text

SCE Attack Scenario
An example of an experiment hypothesis: cloud buckets are secure
start → create user Bob → get cloud buckets → select random bucket → create malicious policy → assign policy to Bob & bucket → end
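The chained steps of this scenario can be sketched against an in-memory stand-in for a cloud account. Nothing here calls a real cloud API: `FakeCloud`, `run_bucket_scenario`, the policy layout and the bucket names are all hypothetical and serve only to illustrate how attack actions are chained.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """Hypothetical in-memory stand-in for a cloud account (no real API calls)."""
    buckets: list
    users: set = field(default_factory=set)
    policies: dict = field(default_factory=dict)     # policy name -> document
    attachments: list = field(default_factory=list)  # (user, bucket, policy name)


def run_bucket_scenario(cloud, rng):
    """Chain the attack actions from the slide and return a trace of the steps."""
    trace = []
    user = "Bob"
    cloud.users.add(user)                         # create user Bob
    trace.append(("create_user", user))
    buckets = list(cloud.buckets)                 # get cloud buckets
    trace.append(("list_buckets", len(buckets)))
    target = rng.choice(buckets)                  # select random bucket
    trace.append(("select_bucket", target))
    policy_name = f"sce-overly-broad-{target}"
    cloud.policies[policy_name] = {               # create malicious policy
        "Effect": "Allow", "Action": "*", "Resource": target,
    }
    trace.append(("create_policy", policy_name))
    # assign policy to Bob & bucket
    cloud.attachments.append((user, target, policy_name))
    trace.append(("attach_policy", user, target))
    return trace


cloud = FakeCloud(buckets=["logs", "backups", "assets"])
trace = run_bucket_scenario(cloud, random.Random(1))
for step in trace:
    print(step)
```

Under this sketch, the hypothesis "cloud buckets are secure" would hold only if some preventive or detective control blocks or reports the final attachment step.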

Slide 19

Slide 19 text

SCE Feedback Loop
Monitor
● Observe the progress of the experiments
  ○ Logging
  ○ Observability
  ○ Tracing
● Intervene if necessary
  ○ Stop experiment
  ○ Recover to good state
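The observe-and-intervene behaviour can be sketched as a small control loop. This is an illustration under assumptions, not the speaker's implementation: `run_with_monitoring`, the safety check and the toy `state` dictionary are hypothetical.

```python
def run_with_monitoring(steps, is_safe, rollback):
    """Run experiment steps one by one; stop and recover if a safety check fails.

    steps    -- list of (name, action) pairs; action is a no-argument callable
    is_safe  -- callable returning False when the experiment must be stopped
    rollback -- callable that restores the last known good state
    """
    log = []
    for name, action in steps:
        action()
        log.append(f"executed {name}")           # observe progress
        if not is_safe():
            log.append(f"stopped after {name}")  # intervene: stop experiment
            rollback()
            log.append("recovered to good state")
            return False, log
    return True, log


# Hypothetical experiment state, used only to demonstrate the control flow.
state = {"open_ports": 0}


def open_port():
    state["open_ports"] += 1


def reset():
    state["open_ports"] = 0


completed, log = run_with_monitoring(
    steps=[("open port 1", open_port), ("open port 2", open_port),
           ("open port 3", open_port)],
    is_safe=lambda: state["open_ports"] < 2,
    rollback=reset,
)
print(completed)
print("\n".join(log))
```

In this run the safety check fails after the second step, so the third step never executes and the state is reset.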

Slide 20

Slide 20 text

SCE Feedback Loop
Analyze
● Failed: the experiment had to be stopped. Identify the reasons and figure out how to improve in the future.
● Success: it is critical to derive answers to the questions posed at the planning stage.

Slide 21

Slide 21 text

Analyze SCE Results Using Risk-Driven Methodologies
OWASP Risk Rating Methodology
https://owasp.org/www-project-top-ten/2017/Application_Security_Risks.html
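As a rough illustration of how a finding could be ranked, the sketch below follows the general shape of the OWASP Risk Rating Methodology: factor scores from 0 to 9 are averaged into a likelihood and an impact, each is mapped to LOW, MEDIUM or HIGH, and the pair is looked up in a severity matrix. The function names and the sample scores are hypothetical, and how the speaker's tooling applies the methodology is not stated in the slides.

```python
from statistics import mean


def level(score):
    """Map a 0-9 score to a likelihood/impact level."""
    if score < 3:
        return "LOW"
    if score < 6:
        return "MEDIUM"
    return "HIGH"


# Overall severity, indexed as SEVERITY[impact level][likelihood level].
SEVERITY = {
    "HIGH":   {"LOW": "Medium", "MEDIUM": "High",   "HIGH": "Critical"},
    "MEDIUM": {"LOW": "Low",    "MEDIUM": "Medium", "HIGH": "High"},
    "LOW":    {"LOW": "Note",   "MEDIUM": "Low",    "HIGH": "Medium"},
}


def rate(likelihood_factors, impact_factors):
    """Average the factor scores and look up the overall severity."""
    likelihood = mean(likelihood_factors)
    impact = mean(impact_factors)
    return SEVERITY[level(impact)][level(likelihood)]


# Hypothetical scores for a finding from a bucket-policy experiment.
likelihood_scores = [5, 2, 7, 1, 3, 6, 9, 2]  # threat agent + vulnerability factors
impact_scores = [9, 7, 5, 8]                  # technical impact factors
print(rate(likelihood_scores, impact_scores))
```

Ranking findings this way gives the plan stage a consistent ordering for the remediation backlog.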

Slide 22

Slide 22 text

SCE Feedback Loop
Plan
● Create backlogs
  ○ Vulnerability management (patching)
  ○ Security operations
  ○ Development teams
  ○ Threat modelling
  ○ Awareness training
● Next steps
  ○ Remediate
  ○ Construct hypothesis for the next iteration

Slide 23

Slide 23 text

SCE Feedback Loop
Knowledge-base
● Security automation
  ○ Create CloudWatch rules to trigger alarms for specific events
  ○ Create audit rules for CSPM
  ○ Flag policies with broad permissions
● Security analytics
● Security correlation
● Machine learning
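Flagging policies with broad permissions can be sketched as a check over policy documents. The example assumes AWS IAM-style JSON policies (`Statement`, `Effect`, `Action`, `Resource`) loaded as Python dictionaries. The wildcard heuristic and the function names are illustrative assumptions, not a complete audit rule.

```python
def _as_list(value):
    """IAM-style fields may hold a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def has_broad_permissions(policy):
    """Return True if any Allow statement grants a wildcard action or resource."""
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # "*" and service-wide wildcards such as "s3:*" count as broad here.
        if any(a == "*" or a.endswith(":*") for a in actions):
            return True
        if "*" in resources:
            return True
    return False


# Hypothetical policy documents for demonstration.
narrow = {"Statement": [{"Effect": "Allow", "Action": "s3:GetObject",
                         "Resource": "arn:aws:s3:::reports/*"}]}
broad = {"Statement": {"Effect": "Allow", "Action": ["s3:*"],
                       "Resource": "arn:aws:s3:::reports"}}

print(has_broad_permissions(narrow))
print(has_broad_permissions(broad))
```

A check of this kind could run after each experiment so that policies created by an attack scenario, or similar ones already present, are surfaced for review.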

Slide 24

Slide 24 text

Security Knowledgebase
(Diagram components)
● SIEM
● Data Collection
● Analysis, Visualization & Automation
● Unified Query & Storage
● Threat Intelligence Source
● Extended Detection & Response
● Security Chaos Engineering
● Security Orchestration, Automation & Response
● Compliance Automation
● Extract, Transform & Load
● Security Data Lake