JailGuard: a universal detection framework for prompt-based attacks on LLM systems
The systems and software powered by Large Language Models (LLMs) and Multi-Modal LLMs (MLLMs) play a critical role in numerous scenarios. However, current LLM systems are vulnerable to prompt-based attacks: jailbreaking attacks can induce the LLM system to generate harmful content, while hijacking attacks can manipulate it into performing attacker-desired tasks.
Format: Article
Language: English
Published: 2025
Online Access: https://hdl.handle.net/10356/184567