Revision as of 20:28, 23 May 2024

A prompt injection attack involves a user injecting a malicious instruction in an LLM-integrated application, in which user input was intended to act as only data.

Vulnerability

Prompt injection exploits the single-channel nature of LLM's, where user prompts and system prompts are simply concatenated together and processed.

Defense strategies

StruQ rejects all user instructions
Instruction hierarchy rejects user instructions that are misaligned with the system prompt

@@ Line 2: / Line 2: @@
 A '''prompt [[injection]]''' attack involves a user injecting a malicious
-instruction in an LLM-integrated application, in which user input was
+instruction in an [[LLM-integrated application]], in which user input was
 intended to act as only data.
+= Vulnerability =
+Prompt injection exploits the single-channel nature of LLM's, where user prompts and system prompts are simply concatenated together and processed.
 = Defense strategies =
 * [[StruQ]] rejects all user instructions
 * [[Instruction hierarchy]] rejects user instructions that are misaligned with the system prompt

Anonymous

Search

Prompt injection: Difference between revisions

Namespaces

More

Page actions

Revision as of 20:28, 23 May 2024

Vulnerability

Defense strategies

Navigation

Navigation

Wiki tools

Wiki tools

Anonymous

Search

Prompt injection: Difference between revisions

Revision as of 20:28, 23 May 2024

Vulnerability

Defense strategies

Navigation

Wiki tools

Page tools

Categories