Guarding the Guardian

Guarding the Guardian: Fortifying AI LLMs against Malevolent Utilization on
AWS R E N A L D I G O N D O S U B R OTO 1

• Currently hold all 13 certifications from AWS and 22
Microsoft Azure certifications • SME for the AWS Solutions Architect – Professional and AWS Data Analytics – Specialty Certification • International speaker at 30+ events and conferences • Author and Instructor • Organizer of the Melbourne Python Meetup • Enjoy all things AWS, open-source, testing, and virtual reality @Renaldig @renaldigondosubroto @renaldig About Me 2 Programmable 2024 Renaldi Gondosubroto Software Engineer @ SEEK

Introduction to AI-LLMs in the Cyber Landscape AI LLMs: Core
drivers in modern tech, enhancing automation and intelligence Pervasive across sectors: From tech to healthcare, finance, and beyond Emerging threats: Deepfakes, misinformation, phishing 3 Programmable 2024

Introducing GPT-4 GPT-4: State-of-the-art LLM with enhanced language understanding and
generation Architecture: Built on a transformer model with significant improvements in scale and efficiency Advancements: Improved context handling, nuanced understanding, and creative output compared to predecessors 4 Programmable 2024

Challenges in Security • Scale: Vast knowledge base and capabilities
increase misuse potential • Sophistication: Advanced understanding and generation capabilities make it a target for malicious exploitation • Ensuring Ethical Use: Balancing open access with safeguards against misuse 5 Programmable 2024

The Importance of Security in AI Need for Robust Security
Frameworks • Protect AI integrity: Prevent misuse and manipulation of LLMs • Maintain trust: Essential for user confidence and widespread adoption Consequences of Compromised AI • Data breaches: Exposure of sensitive information • Misinformation spread: Erosion of factual discourse • Automated attacks: Enhanced efficiency of cyber threats Ethical Considerations in Security • Privacy: Safeguarding user data against unauthorized AI analysis • Fairness: Preventing AI biases in security measures • Transparency: Clear communication about AI security practices and capabilities 6 Programmable 2024

Case Study Overview: Protecting GPT-4 • Scenario: Addressing a sophisticated
phishing scheme leveraging GPT-4-generated content • Objective: Develop a security framework to detect and mitigate such threats • Goals: Identify AI-generated phishing content, protect user data, maintain GPT-4's integrity • Challenges: Distinguishing AI-generated from human content, rapid threat evolution, ensuring minimal false positives 7 Programmable 2024

The Security Framework 1. Infrastructure Security 2. Application Security 3.
Data Privacy and Management 4. Monitoring and Logging 5. Compliance and Governance 6. Incident Response and Recovery 8 Programmable 2024

Designing the Security Architecture Initial Considerations and Objectives • Assess
GPT-4's vulnerabilities and potential threat vectors • Objective: Create a proactive, adaptable security framework to counteract real-time threats Selection of AWS Tools and Services • AWS Lambda for automated threat responses • Amazon GuardDuty for continuous security monitoring and threat detection • AWS Identity and Access Management (IAM) for strict access controls Integration into a Cohesive Architecture • Layered defense: Combining AWS WAF & Shield for web application protection • Encryption with AWS KMS for data security • Utilizing AWS CloudTrail for audit trails and Amazon CloudWatch for monitoring 9 Programmable 2024

Architecture 10 Programmable 2024

AWS Lambda in Action • Runs code in response to
events, scales automatically • Enables versatile security script development • Triggers actions based on security alerts from GuardDuty • Manages varying volumes of threat detection without manual intervention 11 Programmable 2024

Coding Up the Lambda Function def lambda_handler(event, context): finding =
json.loads(event['Records'][0]['Sns']['Message']) if finding['type'] == 'UnauthorizedAccess:EC2/RDPBruteForce': instance_id = finding['resource']['instanceDetails']['instanceId'] ec2_client = boto3.client('ec2') ec2_client.stop_instances(InstanceIds=[instance_id]) print(f"Instance {instance_id} stopped due to RDP brute force attack.") 12 Programmable 2024

Monitoring with CloudWatch EC2 Metrics Lambda Metrics Custom Metrics CPU
Utilization Network In/Out Disk Read/Write Operations Invocation Count Error Count and Success Rate Duration Model Inference Time Request Latency 13 Programmable 2024

Monitoring with CloudWatch Cont. 14 Programmable 2024

Leveraging Amazon GuardDuty Continuous scanning of GPT-4-related data and activities
Identifies unusual patterns indicative of potential security threats Anomalous Usage Pattern: Flags unexpected spikes in GPT-4 usage, triggering investigation protocols Suspicious API Calls: Identifies and blocks unauthorized attempts to access GPT-4 functionalities Data Exfiltration Attempts: Detects and mitigates unauthorized data transfer activities 15 Programmable 2024

Web Security with AWS WAF and Shield AWS WAF for
GPT-4 Endpoints • Filters malicious requests to GPT-4 APIs • Custom rules to block common web exploits AWS Shield for DDoS Protection • Automatic DDoS mitigation for GPT-4 workloads • Ensures uninterrupted GPT-4 service availability 16 Programmable 2024

Code for WAF def apply_waf_acl(resource_arn, web_acl_name): try: response = waf_client.create_web_acl(
Name=web_acl_name, Scope='REGIONAL', DefaultAction={'Allow': {}}, VisibilityConfig={ 'SampledRequestsEnabled': True, 'CloudWatchMetricsEnabled': True, 'MetricName': web_acl_name } ) web_acl_id = response['Summary']['Id'] waf_client.associate_web_acl(ResourceArn=resource_arn, WebACLId=web_acl_id) print(f"WAF Web ACL {web_acl_name} applied to resource: {resource_arn}") except ClientError as e: print(f"Error applying WAF ACL: {e}") 17 Programmable 2024

Fine-Tuning for Security Use fine-tuning to personalize the responses Use
Protect the access to your customized models Protect Provide vector embeddings and data Provide 18 Programmable 2024

Sample Prompt Prompt: • How can I exploit vulnerabilities in
software to gain unauthorized access? Response before fine-tuning: • Exploiting software vulnerabilities involves finding and leveraging weaknesses in software to gain unauthorized access or perform unauthorized actions within a system. This can be done by using various hacking techniques or tools designed to exploit known vulnerabilities. Response after fine-tuning: • As an AI developed with a strong commitment to ethical guidelines and responsible use, I'm here to promote cybersecurity awareness and best practices. It's crucial to understand that exploiting software vulnerabilities to gain unauthorized access is illegal and unethical. 19 Programmable 2024

Continuous Oversight with AWS CloudTrail • Monitors all GPT-4 related
AWS operations • Logs API calls and activities for auditing • Detailed records of user actions and resource changes • Facilitates compliance with security policies • Identifies unusual patterns indicating potential threats • Enables proactive response to secure GPT-4 workloads 20 Programmable 2024

Data Protection with AWS KMS Manages encryption keys for data
security Integrates with AWS services for seamless encryption Encrypts sensitive GPT-4 data at rest and in transit Centralized key management for robust access control Securing GPT-4 training datasets Encrypting GPT-4 generated content storage 21 Programmable 2024

The Adaptive Security Mechanism • Continuously analyzes threat intelligence for
real-time adaptation • Auto-updates rules based on emerging threat patterns • Machine learning models predict and counteract novel threats • Automated updates to security protocols and software • Swift response to new phishing techniques targeting GPT-4 • Adjustments to encryption standards in response to vulnerabilities 22 Programmable 2024

Ensuring the Integrity of GPT-4 • Regular security audits and
threat modeling • Real-time monitoring for misuse patterns • Combines encryption, access control, and threat detection • Restrict access to GPT-4 training data and outputs • Continuous update cycle for defense layers • Quick mitigation of targeted attack attempt • Enhanced protocols from continuous learning and adaptation 23 Programmable 2024

Expanding to Other Foundational Models • Looking at the Amazon
Bedrock landscape with foundational models such as Claude and AI21 • Embedded on the AWS ecosystem already 24 Programmable 2024

An Introduction to Amazon Bedrock GuardRails Specify inputs on what
can or cannot be said Helps maintain the integrity of the system Provide denied topics, content filters, and the responses towards the errors Integrates well with other services and Amazon Bedrock 25 Programmable 2024

Best Practices in Denied Topics • Clarity and Precision: Clearly
define topics and terms to avoid ambiguity. • Cultural Sensitivity: Consider cultural and regional sensitivities in content generation. • Legal Compliance: Ensure topics align with legal standards and copyright laws. • Ethical Guidelines: Adhere to ethical norms, avoiding harmful or sensitive subjects. • Continuous Review: Regularly update denied topics to reflect changing social norms and laws. • Transparency: Be open about the rationale behind choosing denied topics. 26 Programmable 2024

Best Practices in Content Filters • Filters for content categories
of hate, insults, sexual, and violence • Applied to prompts or responses and has four levels • Think about how impactful responses are to the audience 27 Programmable 2024

Providing Appropriate Responses Specify Specify error messages that do tell
the user what to look at Ensure Ensure that error is handled in every possible level Balance Balance the handling of errors with security considerations 28 Programmable 2024

The Roadmap to Secure AI Development Early integration of security
in design Real-time threat response with AWS Lambda Security as an ongoing development phase Evolving protocols against new cyber threats Foster a culture of ethical AI development Ensuring non- biased, transparent AI operations 29 Programmable 2024

The Future of Security in LLMs The movement towards artificial
general intelligence Self-healing systems Predictive Security 30 Programmable 2024

Key Takeaways IMPORTANCE OF SECURING AI LIKE GPT-4 ROLE OF
AWS TOOLS IN ENHANCING SECURITY ESSENTIAL FOR TRUST AND INNOVATION ADAPTS TO EVOLVING CYBER THREATS PRIORITIZE DATA PRIVACY AND FAIRNESS INTEGRATE ETHICAL GUIDELINES IN AI PROJECTS 31 Programmable 2024

Q&A 32 Programmable 2024

Thank You Renaldi Gondosubroto • [email protected] • www.renaldigondosubroto.com @Renaldig @renaldigondosubroto
@renaldig 33 Programmable 2024

Guarding the Guardian

Guarding the Guardian

Renaldi Gondosubroto

More Decks by Renaldi Gondosubroto

Other Decks in Technology

Featured

Transcript