Publications

    #: LONG Group Members

    2025

    Taking A Closer Look at Interacting Objects: Interaction-Aware Open Vocabulary Scene Graph Generation
    Lin Li#, Chuhan Zhang, Dong Zhang, Chong Sun, Chen Li, and Long Chen#.
    ArXiv Preprint
    IterIS: Iterative Inference-Solving Alignment for LoRA Merging
    Hongxu Chen#, Runshi Li, Bowei Zhu, Zhen Wang#, and Long Chen#.
    Computer Vision and Pattern Recognition (CVPR)
    Embracing Collaboration Over Competition: Condensing Multiple Prompts for Visual In-Context Learning
    Jinpeng Wang#, Tianci Luo, Yaohua Zha, Yan Feng, Ruisheng Luo, Bin Chen, Tao Dai, Long Chen#, Yaowei Wang, and Shu-Tao Xia.
    Computer Vision and Pattern Recognition (CVPR)
    Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification
    Yanghao Wang# and Long Chen#.
    Computer Vision and Pattern Recognition (CVPR)
    CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
    Wei Chen#, Lin Li#, Yongqi Yang, Bin Wen, Fan Yang, Tingting Gao, Yu Wu, and Long Chen#.
    Computer Vision and Pattern Recognition (CVPR)
    DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
    Hongxiang Li#, Yaowei Li, Yuhang Yang, Junjie Gao, Zhihong Zhu, Xuxin Cheng, and Long Chen#.
    International Conference on Learning Representations (ICLR)
    CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
    Ziqi Jiang#, Zhen Wang#, and Long Chen#.
    International Conference on Learning Representations (ICLR)

    2024

    Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
    Kaifeng Gao#, Jiaxin Shi, Hanwang Zhang, Chunping Wang, Jun Xiao, and Long Chen#.
    ArXiv Preprint
    Event-Customized Image Generation
    Zhen Wang#, Yilei Jiang, Dong Zheng, Jun Xiao, and Long Chen#.
    ArXiv Preprint
    A Survey on Multimodal Benchmarks: In the Era of Large AI Models
    Lin Li#, Guikun Chen, Hanrong Shi, Jun Xiao, and Long Chen#.
    ArXiv Preprint
    FreeTuner: Any Subject in Any Style with Training-free Diffusion
    Youcan Xu*, Zhen Wang*#, Jun Xiao, Wei Liu, and Long Chen#.
    ArXiv Preprint
    Compositional Zero-shot Learning via Progressive Language-based Observations
    Lin Li#, Guikun Chen, Jun Xiao, and Long Chen#.
    ArXiv Preprint
    DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
    Zhen Wang#, Xinyun Jiang, Jun Xiao, Tao Chen, and Long Chen#.
    European Conference on Computer Vision (ECCV)
    An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
    Wei Chen#, Long Chen#, and Yu Wu.
    European Conference on Computer Vision (ECCV)
    Learning Combinatorial Prompts for Universal Controllable Image Captioning
    Zhen Wang#, Jun Xiao, Yueting Zhuang, Fei Gao, Jian Shao, and Long Chen#.
    International Journal of Computer Vision (IJCV)
    From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation
    Hanrong Shi*, Lin Li*#, Jun Xiao, Yueting Zhuang, and Long Chen#.
    International Journal of Computer Vision (IJCV)
    A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
    Chaoyang Zhu# and Long Chen#.
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
    NICEST: Noisy Label Correction and Training for Robust Scene Graph Generation
    Lin Li#, Jun Xiao, Hanrong Shi, Hanwang Zhang, Yi Yang, Wei Liu, and Long Chen#.
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

    2023

    Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
    Lin Li#, Jun Xiao, Guikun Chen, Jian Shao, Yueting Zhuang, and Long Chen#.
    Neural Information Processing Systems (NeurIPS)
    Compositional Feature Augmentation for Unbiased Scene Graph Generation
    Lin Li#, Guikun Chen, Jun Xiao, Yi Yang, Chunping Wang, and Long Chen#.
    International Conference on Computer Vision (ICCV)
    Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering
    Long Chen*#, Yuhang Zheng*, Yulei Niu, Hanwang Zhang, and Jun Xiao.
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)