JailGuard: a universal detection framework for prompt-based attacks on LLM systems
The systems and software powered by Large Language Models (LLMs) and Multi-Modal LLMs (MLLMs) play a critical role in numerous scenarios. However, current LLM systems are vulnerable to prompt-based attacks: jailbreaking attacks can induce the LLM system to generate harmful content, while hijacking attacks can manipulate it into performing attacker-desired tasks.
Format: Article
Language: English
Published: 2025
Online Access: https://hdl.handle.net/10356/184567