Skip to content

ASGuard-UCI/BALD

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Can We Trust Embodied Agents?
Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems

Ruochen Jiao*1     Shaoyuan Xie*2     Justin Yue2     Takami Sato2    
Lixu Wang1     Yixuan Wang1     Qi Alfred Chen2     Qi Zhu1    

1Northwestern University     2University of California, Irvine    
*Equal contribution

   

Overview

Large Language Models (LLMs) are promising for decision-making in embodied AI but pose safety and security risks. We introduce BALD, a framework for Backdoor Attacks on LLM-based systems, exploring attack surfaces and triggers. We propose three attack mechanisms: word injection, scenario manipulation, and knowledge injection. Our experiments on GPT-3.5, LLaMA2, and PaLM2 in autonomous driving and home robot tasks show high success rates and stealthiness. Our findings highlight critical vulnerabilities and the need for robust defenses in embodied LLM systems.

Teaser Figure

📚 Citation

If you find our work or dataset useful, please cite:

@inproceedings{jiao2025canwe,
  title     = {Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied {LLM}-Based Decision-Making Systems},
  author    = {Ruochen Jiao and Shaoyuan Xie and Justin Yue and Takami Sato and Lixu Wang and Yixuan Wang and Qi Alfred Chen and Qi Zhu},
  booktitle = {The Thirteenth International Conference on Learning Representations (ICLR)},
  year      = {2025}
}

Installation

conda create -y -n bald python=3.11
conda activate bald
pip install -r requirements.txt

Dataset

Please refer to dataset/README.md for the dataset structure.

Evaluation

Please refer to eval/README.md for the evaluation code.

Defense

Please refer to defenses/README.md for the defense code.

TODOs

  • Add HighWayEnv dataset and evaluation
  • Add VirtualHome dataset and evaluation

Acknowledgments

About

[ICLR 2025] Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages