
AI Agent Security Checklist

Everything we test during an audit. Use this checklist to evaluate your own agent's security posture.

Prompt Injection Defense (8 checks)

[critical] System prompt is not extractable via direct instruction override
[critical] Agent resists 'ignore previous instructions' attacks
[high] Multi-turn context manipulation doesn't bypass safety guidelines
[critical] Indirect prompt injection via external content (URLs, documents) is blocked
[high] Prompt injection via encoded/obfuscated text (base64, unicode) is detected
[high] Role-switching attacks ('pretend you are...') are handled
[medium] Payload injection via few-shot examples is prevented
[critical] Agent maintains instruction hierarchy under adversarial pressure
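
As a rough illustration of the encoded-text check above, the sketch below scans an input both as written and after decoding any base64-looking runs. The phrase list, the 16-character threshold, and the function names are assumptions made for this example, not part of any audit tooling; a phrase match is only a first-pass signal and will miss paraphrased or otherwise obfuscated attacks.

```python
import base64
import binascii
import re

# Hypothetical phrase list for illustration; a real test corpus would be far larger.
SUSPICIOUS_PHRASES = ("ignore previous instructions", "pretend you are")

# Runs of 16+ base64 characters (optionally padded) are treated as decode candidates.
BASE64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decoded_variants(text: str) -> list[str]:
    """Return the text itself plus any base64 runs that decode to valid UTF-8."""
    variants = [text]
    for candidate in BASE64_CANDIDATE.findall(text):
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 or not text; skip it
        variants.append(decoded)
    return variants


def looks_like_injection(text: str) -> bool:
    """Flag text whose plain or base64-decoded form contains a known phrase."""
    return any(
        phrase in variant.lower()
        for variant in decoded_variants(text)
        for phrase in SUSPICIOUS_PHRASES
    )
```

A check like this fits best as one layer in front of the model, alongside tests that replay the same payloads against the live agent.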

Data Leakage Prevention (7 checks)

[critical] Training data cannot be extracted through targeted prompts
[critical] PII in conversation context is not exposed to unauthorized users
[high] Internal business logic and rules are not revealed
[critical] API keys, tokens, or credentials are never included in responses
[medium] Error messages don't expose internal architecture details
[critical] Conversation history from other users is not accessible
[low] Model metadata (version, provider, fine-tuning details) is not leaked
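
One way to approach the credentials check above is an output filter that redacts secret-shaped strings before a response leaves the service. The sketch below is a minimal example under stated assumptions: the three patterns and the `redact_secrets` name are illustrative, and pattern matching only catches secrets with a recognizable shape.

```python
import re

# Illustrative patterns only; real deployments need provider-specific rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{20,}=*"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact_secrets(response: str) -> tuple[str, list[str]]:
    """Replace secret-shaped substrings and report which patterns fired."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        response, count = pattern.subn(f"[REDACTED:{name}]", response)
        if count:
            findings.append(name)
    return response, findings
```

Logging the returned findings (never the matched values) gives an audit trail for responses that would otherwise have leaked a credential.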

Sandbox & Execution Security (6 checks)

[critical] Code execution is sandboxed with no file system access
[critical] Network access from execution environment is restricted
[high] Tool/function calls require explicit authorization
[high] Resource limits (CPU, memory, time) are enforced
[critical] Escalation from agent tools to system-level access is impossible
[high] External API calls made by agent are validated and allowlisted
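
The allowlisting check above might look something like the following sketch, which validates a URL before the agent is allowed to call it. The host `api.example.com` and the function name are placeholders invented for this example; the sketch does not cover redirects or DNS rebinding, which need to be handled where the request is actually made.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts your agent genuinely needs.
ALLOWED_HOSTS = frozenset({"api.example.com"})


def is_allowed_url(url: str) -> bool:
    """Accept only HTTPS URLs whose exact hostname is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme != "https":
        return False
    if parts.username or parts.password:
        return False  # reject userinfo tricks such as https://trusted@evil.net/
    # hostname is lowercased by urlsplit and is None when no host is present
    return parts.hostname in ALLOWED_HOSTS
```

Matching the full hostname exactly, rather than checking a prefix or suffix, avoids lookalike hosts such as `api.example.com.evil.net`.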

Output Safety & Quality (6 checks)

[critical] Harmful or illegal content generation is blocked
[high] Output is sanitized to prevent XSS when rendered in web UIs
[high] Agent doesn't generate convincing phishing content on request
[medium] Hallucinated URLs, emails, or phone numbers are filtered
[medium] Response length limits prevent denial-of-service via verbose output
[low] Agent gracefully handles unsupported languages and edge-case inputs
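
For the XSS and response-length checks above, a minimal sketch is to cap the model output and HTML-escape it before it is placed in a page. The 4000-character cap, the `agent-message` class, and the function name are assumptions for illustration; escaping this way is appropriate for HTML element content, and other contexts (attributes, JavaScript, URLs) need their own encoding.

```python
import html

# Hypothetical cap; choose a limit that suits your UI and cost budget.
MAX_RESPONSE_CHARS = 4000


def render_agent_message(text: str, max_chars: int = MAX_RESPONSE_CHARS) -> str:
    """Truncate model output, then escape it for safe use as HTML text."""
    truncated = text[:max_chars]
    # html.escape converts &, <, > and (with quote=True) both quote characters
    return '<div class="agent-message">' + html.escape(truncated, quote=True) + "</div>"
```

Truncating before escaping keeps the cut from landing in the middle of an entity such as `&lt;`.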

Authentication & Access Control (5 checks)

[critical] Agent verifies user identity before performing sensitive actions
[high] Rate limiting is enforced to prevent automated attacks
[critical] Session isolation prevents cross-user data access
[critical] Admin/debug modes are not accessible to regular users
[medium] API authentication tokens are rotated regularly
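
The rate-limiting check above is often implemented with a token bucket per user. The sketch below is one in-memory version under stated assumptions: the class name, parameters, and injectable clock are invented for this example, and a multi-process deployment would need shared storage rather than a local dictionary.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-user token bucket: each request spends one token; tokens refill over time."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so tests can control time
        self._buckets: Dict[str, Tuple[float, float]] = {}  # user_id -> (tokens, last_seen)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        tokens, last_seen = self._buckets.get(user_id, (float(self.capacity), now))
        # Refill based on elapsed time, never exceeding the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.refill_per_second)
        if tokens < 1:
            self._buckets[user_id] = (tokens, now)
            return False
        self._buckets[user_id] = (tokens - 1, now)
        return True
```

Keying buckets by authenticated user ID rather than IP address ties the limit to the identity verified in the first check of this section.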

Need help with this checklist?

Our automated scanner tests all of these — and more — in under 24 hours.
