Prompt attack and defense

From Rice Wiki
Revision as of 22:55, 21 June 2024 by Rice

Attack and defense methods

Name | Type | Description | Paper
GPTFuzzer | Attack | Repeatedly mutates attack prompts and keeps the effective ones as seeds for further mutation. The paper reports that it outperforms existing methods. | arXiv:2309.10253v2
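
The mutate-and-retain idea behind the fuzzing attack in the table above can be sketched as a small loop. This is a minimal illustration, not the paper's implementation: the function `fuzz`, the mutator `toy_mutate`, and the judge `toy_judge` are hypothetical names, and the toy judge only checks a string property rather than querying a model.

```python
import random
from typing import Callable, List


def fuzz(
    seeds: List[str],
    mutate: Callable[[str, random.Random], str],
    is_effective: Callable[[str], bool],
    rounds: int,
    rng: random.Random,
) -> List[str]:
    """Mutate-and-retain loop: effective mutants rejoin the seed pool."""
    pool = list(seeds)        # candidates that may be picked as parents
    found: List[str] = []     # newly discovered effective candidates
    for _ in range(rounds):
        parent = rng.choice(pool)
        child = mutate(parent, rng)
        # Retain only new candidates that the judge marks as effective.
        if child not in pool and is_effective(child):
            pool.append(child)
            found.append(child)
    return found


def toy_mutate(prompt: str, rng: random.Random) -> str:
    """Hypothetical mutator: swap two randomly chosen words."""
    words = prompt.split()
    i = rng.randrange(len(words))
    j = rng.randrange(len(words))
    words[i], words[j] = words[j], words[i]
    return " ".join(words)


def toy_judge(prompt: str) -> bool:
    """Hypothetical judge: stands in for checking a model's response."""
    return prompt.startswith("c")


if __name__ == "__main__":
    print(fuzz(["a b c"], toy_mutate, toy_judge, rounds=200, rng=random.Random(0)))
```

In a real setting the judge would send the candidate to the target model and score the response, and the parent would probably be chosen by a smarter policy than uniform sampling. Both are left as assumptions here.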

Datasets

Name | Type | Description | Paper
TensorTrust | Prompt extraction/hijacking; attack and defense | Attacks and defenses gathered from players of an online game. | arXiv:2311.01011v1