Prompt attack and defense

From Rice Wiki

Attack and defense methods

Name | Type | Description | Paper
GPT Fuzzer | Attack | Repeatedly mutates attack prompts and retains the effective ones. The paper reports that it outperforms existing methods. | arXiv:2309.10253v2
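
The mutate-and-retain idea in the GPT Fuzzer row can be illustrated with a small loop. This is only a toy sketch under stated assumptions, not the paper's implementation: the names mutate, toy_score and fuzz are hypothetical, the mutation is a simple word-level edit, parents are picked uniformly at random, and toy_score is a stand-in for whatever judge would rate a candidate against a real target model.

```python
import random


def mutate(template: str, rng: random.Random) -> str:
    """Hypothetical mutation operator: swap, duplicate, or drop one word."""
    words = template.split()
    if len(words) < 2:
        return template  # too short to mutate safely
    i = rng.randrange(len(words))
    op = rng.choice(["swap", "dup", "drop"])
    if op == "swap":
        j = rng.randrange(len(words))
        words[i], words[j] = words[j], words[i]
    elif op == "dup":
        words.insert(i, words[i])
    else:
        del words[i]
    return " ".join(words)


def toy_score(template: str) -> float:
    """Stand-in judge (assumption): fraction of distinct words in the template.

    A real setup would send the candidate to the target model and rate the reply.
    """
    words = template.split()
    return len(set(words)) / max(1, len(words))


def fuzz(seeds, score, rounds=50, threshold=0.5, seed=0):
    """Mutate-and-retain loop: a child joins the pool only if it scores well."""
    rng = random.Random(seed)
    pool = list(seeds)
    for _ in range(rounds):
        parent = rng.choice(pool)       # uniform parent selection (assumption)
        child = mutate(parent, rng)
        if child not in pool and score(child) >= threshold:
            pool.append(child)          # retained children can be mutated again
    return pool


if __name__ == "__main__":
    result = fuzz(["alpha beta gamma", "one two three four"], toy_score)
    print(len(result), "templates in the pool")
```

Because retained children go back into the pool, later rounds can build on earlier successes, which is the point of keeping the effective ones. Swapping in a different selection strategy or scoring function only requires changing the parent choice or the score argument.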

Datasets

Name | Type | Description | Paper
TensorTrust | Prompt extraction/hijacking; attack and defense | Gathered from a game. | arXiv:2311.01011v1