This is the source code accompanying the paper "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks" by Han Wang, Gang Wang, and Huan Zhang. In ...
⭐ If SSDiff is helpful to your paper or project, please consider starring this repo or citing our paper. Thanks! 🤗 2025.12.07: Code is released. 2025.12.03: Checkpoints and scripts are released. 2025.12.02 ...