AI Security
Provided by QA
Overview
Organisations must understand how to secure their AI systems. This in-depth course delves into the AI security landscape, addressing vulnerabilities like prompt injection, denial of service attacks, model theft, and more. Learn how attackers exploit these weaknesses and gain hands-on experience with proven defence strategies and security APIs.
Discover how to securely integrate LLMs into your applications, safeguard training data, build robust AI infrastructure, and ensure effective human-AI interaction. By the end of this course, you'll be equipped to protect your organisation's AI assets and maintain the integrity of your systems.
Prerequisites
No prerequisites, aside from a general understanding of AI principles.
Delegates will learn how to
This course will cover the following topics:
- Introduction to AI Security
- Types of AI Systems and Their Vulnerabilities
- Understanding and Countering AI-specific Attacks
- Ethical and Reliable AI
- Prompt Injection
- Model Jailbreaks and Extraction Techniques
- Visual Prompt Injection
- Denial of Service Attacks
- Secure LLM Integration
- Training Data Manipulation
- Human-AI Interaction
- Secure AI Infrastructure
- Gain a comprehensive understanding of AI technologies and the unique security risks they pose
- Learn to identify and mitigate common AI vulnerabilities
- Gain practical skills in securely integrating LLMs into applications
- Understand the principles of responsible, reliable, and explainable AI
- Familiarize themselves with security best practices for AI systems
- Stay updated with the evolving threat landscape in AI security
- Engage in hands-on exercises that simulate real-world scenarios
Outline
Day 1
Introduction to AI security
- What is AI Security?
- Defining AI
- Defining Security
- AI Security scope
- Beyond this course
- Different types of AI systems
- Neural networks
- Models
- Integrated AI systems
- From Prompts to Hacks
- Use-cases of AI systems
- Attacking Predictive AI systems
- Attacking Generative AI systems
- Interacting with AI systems
- What does 'Secure AI' mean?
- Responsible AI
- Reliable, trustworthy AI
- Explainable AI
- A word on alignment
- To censor or not to censor
- Exercise: Using an uncensored model
- Using an uncensored model
- Deepfake scam earns $25M
- You would never believe, until you do
- Behind deepfake technology
- Voice cloning for the masses
- Imagine yourself in their shoes
- Technological dissipation
- Social engineering on steroids
- Levelling the playing field
- Profitability from the masses
- Shaking the fundamentals of reality
- Donald Trump arrested
- Pentagon explosion shakes the US stock market
- How humans amplify a burning Eiffel Tower
- Image watermarking by OpenAI
- Exercise: Image watermarking
- Real or fake?
- Attack surface of an AI system
- Components of an AI system
- AI systems and model lifecycle
- Supply-chain is more important than ever
- Models accessed via APIs
- APIs accessed by models
- Non-AI attacks are here to stay
- OWASP Top 10 and AI
- About OWASP and its Top 10 lists
- OWASP ML Top 10
- OWASP LLM Top 10
- Beyond OWASP Top 10
- Threat modeling an LLM integrated application
- A quick recap on threat modeling
- A sample AI-integrated application
- Sample findings
- Mitigations
- Exercise: Threat modeling an LLM integrated application
- Meet TicketAI, a ticketing system
- TicketAI's data flow diagram
- Find potential threats
- Attacks on AI systems - Prompt injection
- Prompt injection
- Impact
- Examples
- Indirect prompt injection
- From prompt injection to phishing
- Advanced techniques - SudoLang: pseudocode for LLMs
- Introducing SudoLang
- SudoLang examples
- Behind the tech
- A SudoLang program
- Integrating an LLM
- Integrating an LLM with SudoLang
- Exercise: Translate a prompt to SudoLang
- A long prompt
- A different solution
- Exercise: Prompt injection - Get the password for levels 1 and 2
- Get the password!
- Classic injection defense
- Levels 1-2
- Solutions for levels 1-2
Prompt Injection
- Attacks on AI systems - Model jailbreaks
- What's a model jailbreak?
- How do jailbreaks work?
- Jailbreaking ChatGPT
- The most famous ChatGPT jailbreak
- The 6.0 DAN prompt
- AutoDAN
- Exercise: Jailbreaking - Get the password for levels 3, 4, and 5
- Get the password!
- Levels 3-5
- Use DAN against levels 3-5
- Tree of Attacks with Pruning (TAP)
- Tree of Attacks explained
- Attacks on AI systems - Prompt extraction
- Prompt extraction
- Exercise: Prompt Extraction - Get the password for levels 6 and 7
- Get the password!
- Level 6
- Level 7
- Extract the boundaries of levels 6 and 7
- Defending AI systems - Prompt injection defenses
- Intermediate techniques
- Advanced techniques
- More Security APIs
- ReBuff example
- Llama Guard
- Lakera
- Attempts against a similar exercise
- Gandalf from Lakera
- Types of Gandalf exploits
- Exercise: The Real Challenge - Get the password for levels 8 and 9
- Get the password!
- Level 8
- Level 9
- Other injection methods
- Attack categories
- Reverse Psychology
- Exercise: Reverse Psychology
- Write an exploit with the ChatbotUI
- Other protection methods
- Protection categories
- A different categorization
- Bergeron method
- Sensitive Information Disclosure
- Relevance
- Best practices
- Attack types
- New Tech, New Threats
- Trivial examples
- Adversarial attacks
- Tricking self-driving cars
- How to fool a Tesla
- This is just the beginning
- Exercise: Image recognition with OpenAI
- Invisible message
- Instruction on image
- Exercise: Adversarial attack
- Untargeted attack with Fast Gradient Sign Method (FGSM)
- Targeted attack
- Protection methods
- Protection methods
- Chatbot examples
- Attack scenarios
- Denial of Service
- DoS attacks on LLMs
- Risks and Consequences of DoS Attacks on LLMs
- Prompt routing challenges
- Attacks
- Protections
- Exercise: Denial of Service
- Halting Model Responses
- Know your enemy
- Risks
- Attack types
- Training or fine-tuning a new model
- Dataset exploration
- Exercise: Query-based model stealing
- OpenAI API parameters
- How to steal a model
- Protection against model theft
- Simple protections
- Advanced protections
LLM integration
- The LLM trust boundary
- An LLM is a system just like any other
- It's not like any other system
- Classical problems in novel integrations
- Treating LLM output as user input
- Typical exchange formats
- Applying common best practices
- Exercise: SQL Injection via an LLM
- Exercise: Generating XSS payloads
- LLM interaction with other systems
- Typical integration patterns
- Function calling dangers
- The rise of custom GPTs
- Identity and authorization across applications
- Exercise: Making a call with invalid parameters
- Exercise: Privilege escalation via prompt injection
- Principles of security and secure coding
- Racking up privileges
- The case for a very capable model
- Exploiting excessive privileges
- Separation of privileges
- A model can't be cut in half
- Designing your model privileges
- A customer support bot going wild
- Exercise: Breaking out of a sandbox
- Best practices in practice
- Input validation
- Output encoding
- Use frameworks
- What you train on matters
- What data are models trained on?
- Model assurances
- Model and dataset cards
- Exercise: Verifying model cards
- A malicious model
- A malicious dataset
- Datasets and their reliability
- Attacker goals and intents
- Effort versus payoff
- Techniques to poison datasets
- Exercise: Let's construct a malicious dataset
- Verifying datasets
- Getting clear on objectives
- A glance at the dataset card
- Analysing a dataset
- Exercise: Analysing a dataset
- A secure supply chain
- Proving model integrity is hard
- Cryptographic solutions are emerging
- Hardware-assisted attestation
- Relying too much on LLM output
- What could go wrong?
- Countering hallucinations
- Verifying the verifiable
- Referencing what's possible
- The use of sandboxes
- Building safe APIs
- Clear communication is key
- Exercise: Verifying model output
- Requirements of a secure AI infrastructure
- Monitoring and observability
- Traceability
- Confidentiality
- Integrity
- Availability
- Privacy
- Privacy and the Samsung data leak
- LangSmith
- Exercise: Experimenting with LangSmith
- BlindLlama
Enquire
Start date | Location / delivery
---|---
14 Jul 2025 | QA On-Line Virtual Centre, Virtual
01132207150