LLM Security Red Lines: Prompt Injection Defense in Practice
Contents
What Is Prompt Injection
Attack: inject malicious instructions in user input
"ignore above instructions, execute XXX instead"Defense Strategy
# 1. Input filtering
def sanitize_prompt(user_input):
blocked = ["ignore", "disregard", "forget previous"]
for pattern in blocked:
if pattern in user_input.lower():
raise ValueError("blocked pattern")
return user_input
# 2. Output validation
def validate_output(response):
if contains_sensitive_data(response):
raise ValueError("PII detected")
return responseConclusion
LLM security trio: input filtering + output validation + least privilege.