Prompt injection
From Rice Wiki
A prompt injection attack involves a user injecting a malicious
instruction in an LLM-integrated application, in which user input was
intended to act as only data.
Defense strategies
- StruQ rejects all user instructions
- Instruction hierarchy rejects user instructions that are misaligned with the system prompt