Shijie Wang (王世杰)

World models, physical AI, and multimodal foundation models.

Email: wang98thu [AT] gmail [DOT] com

About Me

I am a Research Scientist at NVIDIA Cosmos Lab, where I work on world foundation models and physical AI. I received my Ph.D. in Computer Science from Brown University, advised by Prof. Chen Sun, and my bachelor's degree in Software Engineering from Tsinghua University.

My research focuses on building physically grounded, reasoning-capable multimodal foundation models and exploring how they can be integrated into the physical world. During my Ph.D., I worked as a research intern at Salesforce AI Research, Meta GenAI, Google DeepMind, and Google Research. Feel free to contact me about collaborations or casual chats.

Selected Publications

Cosmos 3: Omnimodal World Models for Physical AI [paper] [website] [code]
NVIDIA Cosmos Team
Technical Report
Evidence-Backed Video Question Answering [paper] [website]
Shijie Wang, Honglu Zhou, Ziyang Wang, Ran Xu, Caiming Xiong, Silvio Savarese, Chen Sun, and Juan Carlos Niebles
ECCV 2026
Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding [paper] [website]
Ziyang Wang, Honglu Zhou, Shijie Wang, Junnan Li, Caiming Xiong, Silvio Savarese, Mohit Bansal, Michael S. Ryoo, and Juan Carlos Niebles
CVPR Findings 2026
MotiF: Making Text Count in Image Animation with Motion Focal Loss [paper] [website] [benchmark]
Shijie Wang, Samaneh Azadi, Rohit Girdhar, Sai Saketh Rambhatla, Chen Sun, and Xi Yin
CVPR 2025
How Can Objects Help Video-Language Understanding? [paper]
Zitian Tang, Shijie Wang, Junho Cho, Jaewook Yoo, and Chen Sun
ICCV 2025
Learning Visual Grounding from Generative Vision and Language Model [paper]
Shijie Wang, Dahun Kim, Ali Taalimi, Chen Sun, and Weicheng Kuo
WACV 2025
Vamos: Versatile Action Models for Video Understanding [paper] [website] [code]
Shijie Wang, Qi Zhao, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, and Chen Sun
ECCV 2024
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos? [paper] [website] [code]
Qi Zhao*, Shijie Wang*, Ce Zhang, Changcheng Fu, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, and Chen Sun
ICLR 2024

Experience

04/2026 - Present Research Scientist at NVIDIA Cosmos Lab.

06/2025 - 04/2026 Research Intern at Salesforce AI Research with Dr. Juan Carlos Niebles and Dr. Honglu Zhou.

05/2024 - 11/2024 Research Scientist Intern at Meta GenAI with Dr. Xi Yin.

09/2023 - 03/2024 Student Researcher at Google DeepMind with Dr. Weicheng Kuo.

05/2022 - 12/2022 Student Researcher at Google Research with Dr. Yin Cui.

Education

09/2021 - 04/2026 Ph.D. in Computer Science, Brown University

08/2016 - 06/2021 B.S. in Software Engineering, Tsinghua University (Outstanding Undergrad)

Service

Reviewer: TPAMI, IJCV, ICLR, ICML, NeurIPS, CVPR, ICCV, ECCV, AAAI, and WACV.