JailGuard: a universal detection framework for prompt-based attacks on LLM systems
Systems and software powered by Large Language Models (LLMs) and Multi-Modal LLMs (MLLMs) play a critical role in numerous scenarios. However, current LLM systems are vulnerable to prompt-based attacks: jailbreaking attacks enable an LLM system to generate harmful content, while h...
Format: Article
Language: English
Published: 2025
Online access: https://hdl.handle.net/10356/184567