Jailbreak

From Rice Wiki
Revision as of 20:45, 23 May 2024 by Rice

Jailbreaking is a class of attacks that attempt to defeat the safety tuning a model provider applies to an LLM, which usually exists to keep the model from producing inappropriate output.