Zhouxia Wang

I am currently a Tenure-Track Associate Professor with the HCP at the School of Computer Science and Engineering, Sun Yat-sen University. Previously, I was a Research Fellow at MMLab@NTU, Nanyang Technological University, Singapore, where I worked with Prof. Chen Change Loy. I received my Ph.D. from MMLab@HKU at The University of Hong Kong, advised by Prof. Ping Luo and Prof. Wenping Wang. I obtained both my master's and bachelor's degrees from Sun Yat-sen University under the supervision of Prof. Liang Lin.

Research Areas

  • Controllable image and video generation
  • Image and video restoration
  • Embodied AI and active perception
  • Visual understanding

Selected projects and representative directions

A quick entry point into recent work on camera control, motion control, restoration, and robot manipulation.

IJCV 2026

ObjCtrl-2.5D

Training-free object control with explicit camera pose manipulation for flexible image generation.

SIGGRAPH 2024

MotionCtrl

A unified and flexible motion controller for video generation across multiple control signals.

CVPR 2022/TPAMI 2023/TIP 2024

RestoreFormer/++

Blind face restoration from undegraded key-value pairs, designed for challenging real-world scenarios.

CVPR 2026

Learning to See and Act

Task-aware virtual view exploration for robotic manipulation with embodied visual perception.

Selected publications

Representative papers across controllable generation, restoration, and visual understanding. See the full publication list on Google Scholar.

2026

Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation

Yongjie Bai*, Zhouxia Wang*, Yang Liu, Kaijun Luo, Yifan Wen, Mingtong Dai, Weixing Chen, Ziliang Chen, Lingbo Liu, Guanbin Li, Liang Lin

CVPR, 2026

* indicates equal contribution.

BibTeX
@inproceedings{bai2026learning,
title={Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation},
author={Bai, Yongjie and Wang, Zhouxia and Liu, Yang and Luo, Kaijun and Wen, Yifan and Dai, Mingtong and Chen, Weixing and Chen, Ziliang and Liu, Lingbo and Li, Guanbin and Lin, Liang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}

Next Visual Granularity Generation

Yikai Wang, Zhouxia Wang, Zhonghua Wu, Qingyi Tao, Kang Liao, Chen Change Loy

ICLR, 2026

BibTeX
@inproceedings{wang2026nvg,
title={Next Visual Granularity Generation},
author={Wang, Yikai and Wang, Zhouxia and Wu, Zhonghua and Tao, Qingyi and Liao, Kang and Loy, Chen Change},
booktitle={International Conference on Learning Representations (ICLR)},
year={2026}
}

ObjCtrl-2.5D: Training-free Object Control with Camera Poses

Zhouxia Wang, Yushi Lan, Shangchen Zhou, and Chen Change Loy

International Journal of Computer Vision (IJCV), 2026

BibTeX
@article{objctrl2.5d,
title={{ObjCtrl-2.5D}: Training-free Object Control with Camera Poses},
author={Wang, Zhouxia and Lan, Yushi and Zhou, Shangchen and Loy, Chen Change},
journal={International Journal of Computer Vision},
year={2026}
}

FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models

Haonan Qiu, Zhaoxi Chen, Zhouxia Wang, Yingqing He, Menghan Xia, Ziwei Liu

International Journal of Computer Vision (IJCV), 2026

BibTeX
@article{qiu2024freetraj,
title={FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models},
author={Haonan Qiu and Zhaoxi Chen and Zhouxia Wang and Yingqing He and Menghan Xia and Ziwei Liu},
journal={International Journal of Computer Vision},
year={2026}
}

ObjectClear: Precise Object and Effect Removal with Adaptive Target-Aware Attention

Jixin Zhao, Zhouxia Wang, Peiqing Yang, Shangchen Zhou

CVPR, 2026

BibTeX
@inproceedings{zhao2025ObjectClear,
title={{ObjectClear}: Complete Object Removal via Object-Effect Attention},
author={Zhao, Jixin and Zhou, Shangchen and Wang, Zhouxia and Yang, Peiqing and Loy, Chen Change},
booktitle={CVPR},
year={2026}
}

2025

Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration

Kang Liao, Zongsheng Yue, Zhouxia Wang, Chen Change Loy

ICLR, 2025

BibTeX
@inproceedings{liao2024denoising,
title={Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration},
author={Liao, Kang and Yue, Zongsheng and Wang, Zhouxia and Loy, Chen Change},
booktitle={International Conference on Learning Representations (ICLR)},
year={2025}
}

Image Conductor: Precision Control for Interactive Video Synthesis

Yaowei Li, Xintao Wang, Zhaoyang Zhang, Zhouxia Wang, Ziyang Yuan, Liangbin Xie, Yuexian Zou, and Ying Shan

AAAI, 2025

BibTeX
@inproceedings{li2025image,
title={{Image Conductor}: Precision Control for Interactive Video Synthesis},
author={Li, Yaowei and Wang, Xintao and Zhang, Zhaoyang and Wang, Zhouxia and Yuan, Ziyang and Xie, Liangbin and Shan, Ying and Zou, Yuexian},
booktitle={AAAI},
year={2025}
}

2024

StyleAdapter: A Unified Stylized Image Generation Model

Zhouxia Wang, Xintao Wang, Liangbin Xie, Zhongang Qi, Ying Shan, Wenping Wang, and Ping Luo

IJCV, 2024

BibTeX
@article{wang2024styleadapter,
title={StyleAdapter: A Unified Stylized Image Generation Model},
author={Wang, Zhouxia and Wang, Xintao and Xie, Liangbin and Qi, Zhongang and Shan, Ying and Wang, Wenping and Luo, Ping},
journal={International Journal of Computer Vision},
pages={1--18},
year={2024},
publisher={Springer}
}

Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos

Zhouxia Wang, Jiawei Zhang, Xintao Wang, Tianshui Chen, Ying Shan, Wenping Wang, and Ping Luo

IEEE Transactions on Image Processing (TIP), 2024

BibTeX
@ARTICLE{wang2024analysis,
author={Wang, Zhouxia and Zhang, Jiawei and Wang, Xintao and Chen, Tianshui and Shan, Ying and Wang, Wenping and Luo, Ping},
journal={IEEE Transactions on Image Processing},
title={Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos},
year={2024},
volume={33},
pages={5676--5687},
doi={10.1109/TIP.2024.3463414}
}

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation

Zhouxia Wang, Ziyang Yuan, Xintao Wang, Yaowei Li, Tianshui Chen, Menghan Xia, Ping Luo, and Ying Shan

SIGGRAPH Conference, 2024

BibTeX
@inproceedings{wang2024motionctrl,
title={{MotionCtrl}: A Unified and Flexible Motion Controller for Video Generation},
author={Wang, Zhouxia and Yuan, Ziyang and Wang, Xintao and Li, Yaowei and Chen, Tianshui and Xia, Menghan and Luo, Ping and Shan, Ying},
booktitle={ACM SIGGRAPH 2024 Conference Papers},
pages={1--11},
year={2024}
}

Diffusion-based Blind Text Image Super-Resolution

Yuzhe Zhang, Jiawei Zhang, Hao Li, Zhouxia Wang, Luwei Hou, Dongqing Zou, and Liheng Bian

CVPR, 2024

BibTeX
@inproceedings{zhang2024diffusion,
title={Diffusion-based Blind Text Image Super-Resolution},
author={Zhang, Yuzhe and Zhang, Jiawei and Li, Hao and Wang, Zhouxia and Hou, Luwei and Zou, Dongqing and Bian, Liheng},
booktitle={CVPR},
year={2024}
}

2023–2017

RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs

Zhouxia Wang, Jiawei Zhang, Tianshui Chen, Wenping Wang, and Ping Luo

TPAMI, 2023

BibTeX
@article{wang2023restoreformer++,
title={RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs},
author={Wang, Zhouxia and Zhang, Jiawei and Chen, Tianshui and Wang, Wenping and Luo, Ping},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI)},
year={2023}
}

RestoreFormer: High-Quality Blind Face Restoration from Undegraded Key-Value Pairs

Zhouxia Wang, Jiawei Zhang, Runjian Chen, Wenping Wang, and Ping Luo

CVPR, 2022

BibTeX
@inproceedings{wang2022restoreformer,
title={RestoreFormer: High-Quality Blind Face Restoration from Undegraded Key-Value Pairs},
author={Wang, Zhouxia and Zhang, Jiawei and Chen, Runjian and Wang, Wenping and Luo, Ping},
booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}
}

Learning a Reinforced Agent for Flexible Exposure Bracketing Selection

Zhouxia Wang, Jiawei Zhang, Mude Lin, Jiong Wang, Ping Luo, and Jimmy Ren

CVPR, 2020

BibTeX
@inproceedings{Wang2020Learning,
title={Learning a Reinforced Agent for Flexible Exposure Bracketing Selection},
author={Wang, Zhouxia and Zhang, Jiawei and Lin, Mude and Wang, Jiong and Luo, Ping and Ren, Jimmy},
booktitle={CVPR},
year={2020}
}

LSTM Pose Machines

Yue Luo, Jimmy Ren, Zhouxia Wang, Wenxiu Sun, Jinshan Pan, Jianbo Liu, Jiahao Pang, and Liang Lin

CVPR, 2018

BibTeX
@inproceedings{Luo2018LSTMPose,
title={LSTM Pose Machines},
author={Luo, Yue and Ren, Jimmy and Wang, Zhouxia and Sun, Wenxiu and Pan, Jinshan and Liu, Jianbo and Pang, Jiahao and Lin, Liang},
booktitle={CVPR},
year={2018}
}

Deep Reasoning with Knowledge Graph for Social Relationship Understanding

Zhouxia Wang*, Tianshui Chen*, Jimmy Ren, Weihao Yu, Hui Cheng, and Liang Lin

IJCAI, 2018

BibTeX
@inproceedings{Wang2018Deep,
title={Deep Reasoning with Knowledge Graph for Social Relationship Understanding},
author={Wang, Zhouxia and Chen, Tianshui and Ren, Jimmy and Yu, Weihao and Cheng, Hui and Lin, Liang},
booktitle={International Joint Conference on Artificial Intelligence},
year={2018}
}

Recurrent Attentional Reinforcement Learning for Multi-label Image Recognition

Tianshui Chen, Zhouxia Wang, Guanbin Li, and Liang Lin

AAAI, 2018

BibTeX
@inproceedings{chen2018recurrent,
title={Recurrent attentional reinforcement learning for multi-label image recognition},
author={Chen, Tianshui and Wang, Zhouxia and Li, Guanbin and Lin, Liang},
booktitle={Proceedings of the AAAI conference on artificial intelligence},
volume={32},
number={1},
year={2018}
}

Multi-label Image Recognition by Recurrently Discovering Attentional Regions

Zhouxia Wang*, Tianshui Chen*, Guanbin Li, Ruijia Xu, and Liang Lin

ICCV, 2017

BibTeX
@inproceedings{wang2017multi,
title={Multi-label image recognition by recurrently discovering attentional regions},
author={Wang, Zhouxia and Chen, Tianshui and Li, Guanbin and Xu, Ruijia and Lin, Liang},
booktitle={Proceedings of the IEEE international conference on computer vision},
pages={464--472},
year={2017}
}

Conference Reviewer

CVPR, ICCV, NeurIPS, and related venues in computer vision and machine learning.

Journal Reviewer

TPAMI, TIP, IJCV, PR, and other journals in visual computing and pattern recognition.