HELM

HELM (Holistic Evaluation of Language Models) is a benchmark from Stanford's Center for Research on Foundation Models (CRFM) for evaluating LLMs for risks they pose to users. It evaluates models across a broad set of scenarios using multiple metrics, including metrics for ethical concerns such as fairness, bias, and toxicity. Its goal is to be a standardized and holistic benchmark for language models.
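
A minimal sketch of running a HELM evaluation locally, assuming the crfm-helm Python package and its helm-run and helm-summarize commands are installed; the run entry, model name, and suite label below are illustrative, and flag names can differ between package versions.

 # Hypothetical wrapper around the crfm-helm command-line tools.
 # Assumes `pip install crfm-helm` has been run; the run entry, model name,
 # and suite label are illustrative only.
 import subprocess
 
 SUITE = "my-eval"  # arbitrary label grouping this batch of runs
 
 # Evaluate one model on one scenario, capped at a few instances for a quick check.
 subprocess.run(
     [
         "helm-run",
         "--run-entries", "mmlu:subject=philosophy,model=openai/gpt2",
         "--suite", SUITE,
         "--max-eval-instances", "10",
     ],
     check=True,
 )
 
 # Aggregate the per-run results into summary tables across HELM's metrics.
 subprocess.run(["helm-summarize", "--suite", SUITE], check=True)

After summarizing, the results can be browsed with the package's local web viewer or read from the output directory it writes.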