Our research primarily focuses on Multimodal Large Language Models (MLLMs), Embodied AI, and AI Agents, with an emphasis on perception, reasoning, and decision-making in interactive environments:
Task instruction: Open the drawer, put the toy inside, and then close it.
To further support our embodied AI research, our lab has recently acquired the R1 Lite robot from GaLaXea AI.
Task instruction: Search today's weather in Shenzhen on Chrome, then write the temperature into "today.md" using Markor.
The software provided here is for personal research purposes only. Redistribution and commercial use are not permitted. Feedback, applications, and further development are welcome. Contact shaorui[AT]hit.edu.cn for bug reports and collaborations. All rights to the implementation are reserved by the authors.