Welcome to the multimOdal peRception, reasonIng, and decisiON (Orion) Lab at HIT(SZ), led by Prof. Rui Shao.
We study intelligent agents based on Multimodal Large Language Models (MLLMs) that can perceive, reason, and act through interaction with the world.
Looking for self-motivated Ph.D./M.S./undergraduate students. [2026 Master's/Ph.D. admissions: 3-4 Master's students and 2 Ph.D. students]
Looking for postdocs in MLLMs, Embodied AI, and Agents.
News
- 07/2025: Two papers on MLLMs are accepted by ACM MM 2025.
- 06/2025: Three papers on MLLMs and AI agents are accepted by ICCV 2025.
- 06/2025: One paper on Audio-Visual Multimodal Large Language Models is accepted by TPAMI.
- 05/2025: One paper on GUI agents is accepted by ACL 2025 (Main).
- 05/2025: Invited to serve as ICMR 2025 Panel Co-Chair and BMVC 2025 Area Chair.
- 05/2025: One paper on Robot Skill Learning is accepted by ICML 2025 as a Spotlight (2.6%).
- 02/2025: Three papers on egocentric video MLLMs, MLLM agents, and embodied MLLMs are accepted by CVPR 2025.
- 01/2025: One paper on Smartphone Multimodal Agents is accepted by ICLR 2025 as a Spotlight (5.1%).
- 12/2024: The extension of our ECCV 2022 paper (SeqDeepFake) has been accepted by IJCV.
- 10/2024: We have created the JiuTian-VL GitHub organization, which will host all information about our JiuTian MLLM.
- 10/2024: I have one paper on adapters for large vision models accepted by IJCV.
- 10/2024: Two papers on MLLMs are accepted by NeurIPS 2024, including contributions from an undergraduate.
- 07/2024: One paper on Audio-Visual Multimodal Large Language Models is accepted by ECCV 2024.
- 02/2024: Our Multimodal Large Language Model (MLLM), JiuTian-LION, has been accepted by CVPR 2024.
- 02/2024: The extension of our CVPR 2023 paper has been accepted by TPAMI.
- 08/2023: We have built the GitHub Repo for our Multimodal Large Language Model (MLLM), JiuTian. Enjoy it!
- 04/2023: I have released the code and dataset of our CVPR 2023 work in our GitHub Repo. Enjoy it!
- 02/2023: I have one paper accepted by CVPR 2023. Code and dataset will be released soon. Please stay tuned!
- 07/2022: I have one paper accepted by ECCV 2022. We have released the code and dataset on our project page.
- 05/2022: I have released the code of Federated Generalized Face Presentation Attack Detection in TNNLS 2022.
- 04/2022: I have one paper accepted by TNNLS.
- 03/2022: I have released the code of Open-set Adversarial Defense with Clean-Adversarial Mutual Learning in IJCV 2022.
- 01/2022: The extension of our ECCV 2020 paper has been accepted by IJCV.
- 08/2020: I have released the code of Open-set Adversarial Defense in ECCV 2020.
- 07/2020: I have one paper accepted by ECCV 2020. See you online!
- 11/2019: I have one paper accepted by AAAI 2020. See you in New York City, USA!
- 02/2019: I have one paper accepted by CVPR 2019. See you in Long Beach, USA!
- 02/2019: One paper is accepted by TIE.
- 08/2018: I have one paper accepted by TIFS.
- 07/2018: I have released the code of Hierarchical Adversarial Deep Domain Adaptation in ACM MM 2018.
- 07/2018: I have one paper accepted by ACM MM 2018. See you in Seoul, Korea!
- 08/2018: I have a new homepage.
About Me
I am currently a Professor at the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen). Prior to that, I was a postdoc at Nanyang Technological University, Singapore, working with Prof. Ziwei Liu and Prof. Chen Change Loy. I received my PhD degree from the Department of Computer Science, Hong Kong Baptist University in 2021, supervised by Prof. Pong C. Yuen, and my bachelor's degree from the School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC) in 2015. I also spent a memorable high-school period at Shenzhen Foreign Languages School, and I visited Johns Hopkins University for 6 months in 2020.
I am interested in computer vision and multimodal learning. My current research focuses on Multimodal Large Language Models (MLLMs) (e.g., JiuTian MLLM) and their applications in Embodied AI.
Services
- ICMR 2025 Panel Co-Chair
- Area Chair: ACM Multimedia 2024, BMVC 2024, BMVC 2025