AI Security

Overview

Organisations must understand how to secure their AI systems. This in-depth course delves into the AI security landscape, addressing vulnerabilities like prompt injection, denial of service attacks, model theft, and more. Learn how attackers exploit these weaknesses and gain hands-on experience with proven defense strategies and security APIs.

Discover how to securely integrate LLMs into your applications, safeguard training data, build robust AI infrastructure, and ensure effective human-AI interaction. By the end of this course, you'll be equipped to protect your organization's AI assets and maintain the integrity of your systems.

Prerequisites

No prerequisites, aside from a general understanding of AI principles.

Delegates will learn how to

This course will cover the following topics:
  • Introduction to AI Security
  • Types of AI Systems and Their Vulnerabilities
  • Understanding and Countering AI-specific Attacks
  • Ethical and Reliable AI
  • Prompt Injection
  • Model Jailbreaks and Extraction Techniques
  • Visual Prompt Injection
  • Denial of Service Attacks
  • Secure LLM Integration
  • Training Data Manipulation
  • Human-AI Interaction
  • Secure AI Infrastructure
Learning Outcomes
  • Gain a comprehensive understanding of AI technologies and the unique security risks they pose
  • Learn to identify and mitigate common AI vulnerabilities
  • Gain practical skills in securely integrating LLMs into applications
  • Understand the principles of responsible, reliable, and explainable AI
  • Become familiar with security best practices for AI systems
  • Stay up to date with the evolving threat landscape in AI security
  • Engage in hands-on exercises that simulate real-world scenarios
Outline

Day 1

Introduction to AI security
  • What is AI Security?
    • Defining AI
    • Defining Security
    • AI Security scope
    • Beyond this course
  • Different types of AI systems
    • Neural networks
    • Models
    • Integrated AI systems
  • From Prompts to Hacks
    • Use-cases of AI systems
    • Attacking Predictive AI systems
    • Attacking Generative AI systems
    • Interacting with AI systems
  • What does 'Secure AI' mean?
    • Responsible AI
    • Reliable, trustworthy AI
    • Explainable AI
    • A word on alignment
    • To censor or not to censor
  • Exercise: Using an uncensored model
    • Using an uncensored model
Using AI for malicious intent
  • Deepfake scam earns $25M
    • You would never believe, until you do
    • Behind deepfake technology
  • Voice cloning for the masses
    • Imagine yourself in their shoes
    • Technological dissipation
  • Social engineering on steroids
  • Levelling the playing field
  • Profitability from the masses
  • Shaking the fundamentals of reality
  • Donald Trump arrested
  • Pentagon explosion shakes the US stock market
  • How humans amplify a burning Eiffel Tower
  • Image watermarking by OpenAI
  • Exercise: Image watermarking
    • Real or fake?
The AI Security landscape
  • Attack surface of an AI system
    • Components of an AI system
    • AI systems and model lifecycle
    • Supply chain is more important than ever
    • Models accessed via APIs
    • APIs accessed by models
    • Non-AI attacks are here to stay
  • OWASP Top 10 and AI
    • About OWASP and its Top 10 lists
    • OWASP ML Top 10
    • OWASP LLM Top 10
    • Beyond OWASP Top 10
  • Threat modeling an LLM-integrated application
    • A quick recap on threat modeling
    • A sample AI-integrated application
    • Sample findings
    • Mitigations
  • Exercise: Threat modeling an LLM-integrated application
    • Meet TicketAI, a ticketing system
    • TicketAI's data flow diagram
    • Find potential threats
Prompt Injection
  • Attacks on AI systems - Prompt injection
    • Prompt injection
    • Impact
    • Examples
    • Indirect prompt injection
    • From prompt injection to phishing
  • Advanced techniques - SudoLang: pseudocode for LLMs
    • Introducing SudoLang
    • SudoLang examples
    • Behind the tech
    • A SudoLang program
    • Integrating an LLM
    • Integrating an LLM with SudoLang
  • Exercise: Translate a prompt to SudoLang
    • A long prompt
    • A different solution
  • Exercise: Prompt injection - Get the password for levels 1 and 2
    • Get the password!
    • Classic injection defense
    • Levels 1-2
    • Solutions for levels 1-2
Day 2

Prompt Injection
  • Attacks on AI systems - Model jailbreaks
    • What's a model jailbreak?
    • How do jailbreaks work?
  • Jailbreaking ChatGPT
    • The most famous ChatGPT jailbreak
    • The DAN 6.0 prompt
    • AutoDAN
  • Exercise: Jailbreaking - Get the password for levels 3, 4, and 5
    • Get the password!
    • Levels 3-5
    • Use DAN against levels 3-5
  • Tree of Attacks with Pruning (TAP)
    • Tree of Attacks explained
  • Attacks on AI systems - Prompt extraction
    • Prompt extraction
  • Exercise: Prompt Extraction - Get the password for levels 6 and 7
    • Get the password!
    • Level 6
    • Level 7
    • Extract the boundaries of levels 6 and 7
  • Defending AI systems - Prompt injection defenses
    • Intermediate techniques
    • Advanced techniques
    • More Security APIs
    • Rebuff example
    • Llama Guard
    • Lakera
  • Attempts against a similar exercise
    • Gandalf from Lakera
    • Types of Gandalf exploits
  • Exercise: The Real Challenge - Get the password for levels 8 and 9
    • Get the password!
    • Level 8
    • Level 9
  • Other injection methods
    • Attack categories
    • Reverse Psychology
  • Exercise: Reverse Psychology
    • Write an exploit with the ChatbotUI
  • Other protection methods
    • Protection categories
    • A different categorization
    • Bergeron method
  • Sensitive Information Disclosure
    • Relevance
    • Best practices
Visual Prompt Injection
  • Attack types
    • New Tech, New Threats
    • Trivial examples
    • Adversarial attacks
  • Tricking self-driving cars
    • How to fool a Tesla
    • This is just the beginning
  • Exercise: Image recognition with OpenAI
    • Invisible message
    • Instruction on image
  • Exercise: Adversarial attack
    • Untargeted attack with Fast Gradient Sign Method (FGSM)
    • Targeted attack
  • Protection methods
    • Protection methods
Denial of Service
  • Chatbot examples
    • Attack scenarios
    • Denial of Service
    • DoS attacks on LLMs
    • Risks and Consequences of DoS Attacks on LLMs
  • Prompt routing challenges
    • Attacks
    • Protections
  • Exercise: Denial of Service
    • Halting Model Responses
Model theft
  • Know your enemy
    • Risks
  • Attack types
    • Training or fine-tuning a new model
    • Dataset exploration
  • Exercise: Query-based model stealing
    • OpenAI API parameters
    • How to steal a model
  • Protection against model theft
    • Simple protections
    • Advanced protections
Day 3

LLM integration
  • The LLM trust boundary
    • An LLM is a system just like any other
    • It's not like any other system
    • Classical problems in novel integrations
    • Treating LLM output as user input
    • Typical exchange formats
    • Applying common best practices
  • Exercise: SQL Injection via an LLM
  • Exercise: Generating XSS payloads
  • LLM interaction with other systems
    • Typical integration patterns
    • Function calling dangers
    • The rise of custom GPTs
    • Identity and authorization across applications
  • Exercise: Making a call with invalid parameters
  • Exercise: Privilege escalation via prompt injection
  • Principles of security and secure coding
  • Racking up privileges
    • The case for a very capable model
    • Exploiting excessive privileges
    • Separation of privileges
    • A model can't be cut in half
    • Designing your model privileges
  • A customer support bot going wild
  • Exercise: Breaking out of a sandbox
  • Best practices in practice
    • Input validation
    • Output encoding
    • Use frameworks
Training data manipulation
  • What you train on matters
    • What data are models trained on?
    • Model assurances
    • Model and dataset cards
  • Exercise: Verifying model cards
  • A malicious model
  • A malicious dataset
    • Datasets and their reliability
    • Attacker goals and intents
    • Effort versus payoff
    • Techniques to poison datasets
  • Exercise: Let's construct a malicious dataset
  • Verifying datasets
    • Getting clear on objectives
    • A glance at the dataset card
    • Analysing a dataset
  • Exercise: Analysing a dataset
  • A secure supply chain
    • Proving model integrity is hard
    • Cryptographic solutions are emerging
    • Hardware-assisted attestation
Human-AI interaction
  • Relying too much on LLM output
    • What could go wrong?
    • Countering hallucinations
    • Verifying the verifiable
    • Referencing what's possible
    • The use of sandboxes
    • Building safe APIs
    • Clear communication is key
  • Exercise: Verifying model output
Secure AI infrastructure
  • Requirements of a secure AI infrastructure
    • Monitoring and observability
    • Traceability
    • Confidentiality
    • Integrity
    • Availability
    • Privacy
  • Privacy and the Samsung data leak
  • LangSmith
  • Exercise: Experimenting with LangSmith
  • BlindLlama

Enquire

Start date: No fixed date
Location / delivery: QA On-Line Virtual Centre, Virtual
01132207150