Zhouxia Wang

I am currently a Tenure-Track Associate Professor with the HCP at the School of Computer Science and Engineering, Sun Yat-sen University. Previously, I was a Research Fellow at MMLab@NTU, Nanyang Technological University, Singapore, where I worked with Prof. Chen Change Loy. I received my Ph.D. from MMLab@HKU at The University of Hong Kong, advised by Prof. Ping Luo and Prof. Wenping Wang. I obtained both my master's and bachelor's degrees from Sun Yat-sen University under the supervision of Prof. Liang Lin.

Open Positions

We welcome applicants interested in AI research:

Postdoctoral researchers
Ph.D. and graduate students
Outstanding research assistants
Outstanding undergraduate students

Research Areas

Controllable image and video generation
Image and video restoration
Embodied AI and active perception
Visual understanding

Featured Work

Selected projects and representative directions

A quick entry point into recent work on camera control, motion control, restoration, and robot manipulation.

IJCV 2026

ObjCtrl-2.5D

Training-free object control with explicit camera pose manipulation for flexible image generation.

Paper Project Code

SIGGRAPH 2024

MotionCtrl

A unified and flexible motion controller for video generation across multiple control signals.

Paper Project Code

CVPR 2022/TPAMI 2023/TIP 2024

RestoreFormer/++

Blind face restoration from undegraded key-value pairs, designed for challenging real-world scenarios.

Paper Code Demo

CVPR 2026

Learning to See and Act

Task-aware virtual view exploration for robotic manipulation with embodied visual perception.

Project Paper Benchmark Code

Publications

Selected publications

Representative papers across controllable generation, restoration, and visual understanding. See the full publication list on Google Scholar.

2026

Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation

Yongjie Bai*, Zhouxia Wang*, Yang Liu, Kaijun Luo, Yifan Wen, Mingtong Dai, Weixing Chen, Ziliang Chen, Lingbo Liu, Guanbin Li, Liang Lin

CVPR, 2026

Project Paper Benchmark Code

* indicates equal contribution.

BibTeX

@inproceedings{bai2026learning,
title={Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation},
author={Bai, Yongjie and Wang, Zhouxia and Liu, Yang and Luo, Kaijun and Wen, Yifan and Dai, Mingtong and Chen, Weixing and Chen, Ziliang and Liu, Lingbo and Li, Guanbin and Lin, Liang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}

Next Visual Granularity Generation

Yikai Wang, Zhouxia Wang, Zhonghua Wu, Qingyi Tao, Kang Liao, Chen Change Loy

ICLR, 2026

Project Paper Code

BibTeX

@inproceedings{wang2026nvg,
title={Next Visual Granularity Generation},
author={Wang, Yikai and Wang, Zhouxia and Wu, Zhonghua and Tao, Qingyi and Liao, Kang and Loy, Chen Change},
booktitle={International Conference on Learning Representations (ICLR)},
year={2026}
}

ObjCtrl-2.5D: Training-free Object Control with Camera Poses

Zhouxia Wang, Yushi Lan, Shangchen Zhou, and Chen Change Loy

International Journal of Computer Vision (IJCV), 2026

PDF Project Code Demo

BibTeX

@article{objctrl2.5d,
title={{ObjCtrl-2.5D}: Training-free Object Control with Camera Poses},
author={Wang, Zhouxia and Lan, Yushi and Zhou, Shangchen and Loy, Chen Change},
journal={International Journal of Computer Vision},
year={2026}
}

FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models

Haonan Qiu, Zhaoxi Chen, Zhouxia Wang, Yingqing He, Menghan Xia, Ziwei Liu

International Journal of Computer Vision (IJCV), 2026

PDF Project Code

BibTeX

@article{qiu2024freetraj,
title={FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models},
author={Haonan Qiu and Zhaoxi Chen and Zhouxia Wang and Yingqing He and Menghan Xia and Ziwei Liu},
journal={International Journal of Computer Vision},
year={2026}
}

ObjectClear: Precise Object and Effect Removal with Adaptive Target-Aware Attention

Jixin Zhao, Zhouxia Wang, Peiqing Yang, Shangchen Zhou

CVPR, 2026

PDF Project Code

BibTeX

@inproceedings{zhao2025ObjectClear,
title={{ObjectClear}: Precise Object and Effect Removal with Adaptive Target-Aware Attention},
author={Zhao, Jixin and Wang, Zhouxia and Yang, Peiqing and Zhou, Shangchen},
booktitle={CVPR},
year={2026}
}

2025

Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration

Kang Liao, Zongsheng Yue, Zhouxia Wang, Chen Change Loy

ICLR, 2025

PDF Project Code

BibTeX

@inproceedings{liao2024denoising,
title={Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration},
author={Liao, Kang and Yue, Zongsheng and Wang, Zhouxia and Loy, Chen Change},
booktitle={International Conference on Learning Representations (ICLR)},
year={2025}
}

Image Conductor: Precision Control for Interactive Video Synthesis

Yaowei Li, Xintao Wang, Zhaoyang Zhang, Zhouxia Wang, Ziyang Yuan, Liangbin Xie, Yuexian Zou, and Ying Shan

AAAI, 2025

PDF Project Code

BibTeX

@inproceedings{li2025image,
title={{Image Conductor}: Precision Control for Interactive Video Synthesis},
author={Li, Yaowei and Wang, Xintao and Zhang, Zhaoyang and Wang, Zhouxia and Yuan, Ziyang and Xie, Liangbin and Shan, Ying and Zou, Yuexian},
booktitle={AAAI},
year={2025}
}

2024

StyleAdapter: A Unified Stylized Image Generation Model

Zhouxia Wang, Xintao Wang, Liangbin Xie, Zhongang Qi, Ying Shan, Wenping Wang, and Ping Luo

IJCV, 2024

PDF

BibTeX

@article{wang2024styleadapter,
title={StyleAdapter: A Unified Stylized Image Generation Model},
author={Wang, Zhouxia and Wang, Xintao and Xie, Liangbin and Qi, Zhongang and Shan, Ying and Wang, Wenping and Luo, Ping},
journal={International Journal of Computer Vision},
pages={1--18},
year={2024},
publisher={Springer}
}

Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos

Zhouxia Wang, Jiawei Zhang, Xintao Wang, Tianshui Chen, Ying Shan, Wenping Wang, and Ping Luo

IEEE Transactions on Image Processing (TIP), 2024

PDF Project

BibTeX

@ARTICLE{wang2024analysis,
author={Wang, Zhouxia and Zhang, Jiawei and Wang, Xintao and Chen, Tianshui and Shan, Ying and Wang, Wenping and Luo, Ping},
journal={IEEE Transactions on Image Processing},
title={Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos},
year={2024},
volume={33},
pages={5676--5687},
doi={10.1109/TIP.2024.3463414}
}

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation

Zhouxia Wang, Ziyang Yuan, Xintao Wang, Yaowei Li, Tianshui Chen, Menghan Xia, Ping Luo, and Ying Shan

SIGGRAPH Conference, 2024

PDF Project Code Demo 1 Demo 2

BibTeX

@inproceedings{wang2024motionctrl,
title={{MotionCtrl}: A Unified and Flexible Motion Controller for Video Generation},
author={Wang, Zhouxia and Yuan, Ziyang and Wang, Xintao and Li, Yaowei and Chen, Tianshui and Xia, Menghan and Luo, Ping and Shan, Ying},
booktitle={ACM SIGGRAPH 2024 Conference Papers},
pages={1--11},
year={2024}
}

Diffusion-based Blind Text Image Super-Resolution

Yuzhe Zhang, Jiawei Zhang, Hao Li, Zhouxia Wang, Luwei Hou, Dongqing Zou, and Liheng Bian

CVPR, 2024

PDF Code

BibTeX

@inproceedings{zhang2024diffusion,
title={Diffusion-based Blind Text Image Super-Resolution},
author={Zhang, Yuzhe and Zhang, Jiawei and Li, Hao and Wang, Zhouxia and Hou, Luwei and Zou, Dongqing and Bian, Liheng},
booktitle={CVPR},
year={2024}
}

2023–2017

RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs

Zhouxia Wang, Jiawei Zhang, Tianshui Chen, Wenping Wang, and Ping Luo

TPAMI, 2023

PDF Code Demo

BibTeX

@article{wang2023restoreformer++,
title={RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs},
author={Wang, Zhouxia and Zhang, Jiawei and Chen, Tianshui and Wang, Wenping and Luo, Ping},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2023}
}

RestoreFormer: High-Quality Blind Face Restoration from Undegraded Key-Value Pairs

Zhouxia Wang, Jiawei Zhang, Runjian Chen, Wenping Wang, and Ping Luo

CVPR, 2022

PDF Code Demo

BibTeX

@inproceedings{wang2022restoreformer,
title={RestoreFormer: High-Quality Blind Face Restoration from Undegraded Key-Value Pairs},
author={Wang, Zhouxia and Zhang, Jiawei and Chen, Runjian and Wang, Wenping and Luo, Ping},
booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}
}

Learning a Reinforced Agent for Flexible Exposure Bracketing Selection

Zhouxia Wang, Jiawei Zhang, Mude Lin, Jiong Wang, Ping Luo, and Jimmy Ren

CVPR, 2020

PDF Code

BibTeX

@inproceedings{Wang2020Learning,
title={Learning a Reinforced Agent for Flexible Exposure Bracketing Selection},
author={Zhouxia Wang and Jiawei Zhang and Mude Lin and Jiong Wang and Ping Luo and Jimmy Ren},
booktitle={CVPR},
year={2020}
}

LSTM Pose Machines

Yue Luo, Jimmy Ren, Zhouxia Wang, Wenxiu Sun, Jinshan Pan, Jianbo Liu, Jiahao Pang, and Liang Lin

CVPR, 2018

PDF Code

BibTeX

@inproceedings{Luo2018LSTMPose,
title={LSTM Pose Machines},
author={Yue Luo and Jimmy Ren and Zhouxia Wang and Wenxiu Sun and Jinshan Pan and Jianbo Liu and Jiahao Pang and Liang Lin},
booktitle={CVPR},
year={2018}
}

Deep Reasoning with Knowledge Graph for Social Relationship Understanding

Zhouxia Wang*, Tianshui Chen*, Jimmy Ren, Weihao Yu, Hui Cheng, and Liang Lin

IJCAI, 2018

PDF Code

BibTeX

@inproceedings{Wang2018Deep,
title={Deep Reasoning with Knowledge Graph for Social Relationship Understanding},
author={Zhouxia Wang and Tianshui Chen and Jimmy Ren and Weihao Yu and Hui Cheng and Liang Lin},
booktitle={International Joint Conference on Artificial Intelligence},
year={2018}
}

Recurrent Attentional Reinforcement Learning for Multi-label Image Recognition

Tianshui Chen, Zhouxia Wang, Guanbin Li, and Liang Lin

AAAI, 2018

PDF

BibTeX

@inproceedings{chen2018recurrent,
title={Recurrent attentional reinforcement learning for multi-label image recognition},
author={Chen, Tianshui and Wang, Zhouxia and Li, Guanbin and Lin, Liang},
booktitle={Proceedings of the AAAI conference on artificial intelligence},
volume={32},
number={1},
year={2018}
}

Multi-label Image Recognition by Recurrently Discovering Attentional Regions

Zhouxia Wang*, Tianshui Chen*, Guanbin Li, Ruijia Xu, and Liang Lin

ICCV, 2017

PDF

BibTeX

@inproceedings{wang2017multi,
title={Multi-label image recognition by recurrently discovering attentional regions},
author={Wang, Zhouxia and Chen, Tianshui and Li, Guanbin and Xu, Ruijia and Lin, Liang},
booktitle={Proceedings of the IEEE international conference on computer vision},
pages={464--472},
year={2017}
}

Academic Service

Conference Reviewer

CVPR, ICCV, NeurIPS, and related venues in computer vision and machine learning.

Journal Reviewer

TPAMI, TIP, IJCV, PR, and other journals in visual computing and pattern recognition.