Publications

This page features a selection of my publications. For a more comprehensive list, please refer to my Google Scholar profile, Semantic Scholar profile, the DBLP Computer Science Bibliography, or the ACL Anthology.

CL-bench: A Benchmark for Context Learning

Published on arXiv, 2026

Current language models (LMs) excel at reasoning over prompts using pre-trained knowledge. However, real-world tasks are far more complex and context-dependent: models must learn from task-specific context and leverage new knowledge beyond what was learned during pre-training to reason about and resolve tasks. We term this capability context learning, a crucial ability that humans naturally possess but that has been largely overlooked.

Recommended citation: Shihan Dou, Ming Zhang, Zhangyue Yin, Chenhao Huang, Yujiong Shen, Junzhe Wang, Jiayi Chen, Yuchen Ni, Junjie Ye, Cheng Zhang, Huaibing Xie, Jianglu Hu, Shaolei Wang, Weichao Wang, Yanling Xiao, Yiting Liu, Zenan Xu, Zhen Guo, Pluto Zhou, Tao Gui, Zuxuan Wu, Xipeng Qiu, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang, Di Wang, Shunyu Yao. CL-bench: A Benchmark for Context Learning. arXiv:2602.03587, 2026. https://arxiv.org/abs/2602.03587

Revealing emergent human-like conceptual representations from language prediction

Published in Proceedings of the National Academy of Sciences (PNAS), 2025

People acquire concepts through rich physical and social experiences and use them to understand and navigate the world. In contrast, large language models (LLMs), trained solely through next-token prediction on text, exhibit strikingly human-like behaviors. Are these models developing concepts akin to those in humans?

Recommended citation: Ningyu Xu, Qi Zhang, Chenyang Du, Qinan Luo, Xipeng Qiu, Xuanjing Huang, Menghan Zhang. Revealing emergent human-like conceptual representations from language prediction. Proceedings of the National Academy of Sciences (PNAS), 2025. https://doi.org/10.1073/pnas.2512514122

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Published on arXiv, 2025

We propose AgentGym-RL, a reinforcement learning framework for training large language model (LLM)-based agents to tackle long-horizon decision-making tasks through multi-turn interactions.
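
To make the training setup concrete, below is a minimal sketch of the multi-turn rollout loop that such a framework optimizes over. The agent/environment interface is an illustrative assumption, not the actual AgentGym-RL API.

```python
# Minimal sketch of a multi-turn agent-environment rollout of the kind such a
# framework trains on. The agent/env interface is an illustrative assumption,
# not the actual AgentGym-RL API.

def collect_trajectory(agent, env, max_turns=10):
    """Roll out one long-horizon episode as a list of (observation, action, reward) turns."""
    trajectory = []
    obs = env.reset()
    for _ in range(max_turns):
        action = agent.act(obs)            # the LLM generates the next action/message
        obs, reward, done = env.step(action)
        trajectory.append((obs, action, reward))
        if done:
            break
    return trajectory                      # fed to a policy-gradient update afterwards
```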

Recommended citation: Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Xin Guo, Dingwen Yang, Chenyang Liao, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang. AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning. arXiv:2509.08755, 2025. https://arxiv.org/abs/2509.08755

AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments

Published in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Large language models (LLMs) have emerged as a promising foundation for building generally capable agents (LLM-based agents) that can handle multi-turn decision-making tasks across various environments. However, the community lacks a unified interactive framework that covers diverse environments for comprehensive evaluation of agents and enables exploration and learning for their self-improvement.

Recommended citation: Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Xin Guo, Dingwen Yang, Chenyang Liao, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang. AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 27914–27961, Vienna, Austria. Association for Computational Linguistics, 2025. https://aclanthology.org/2025.acl-long.1355/

Large Language Models: From Theory to Practice (2nd Edition)

Published in Electronic Industry Press, 2025

This book introduces the fundamental theories of large language models, including language modeling, distributed model training, and reinforcement learning, with practical examples that use the DeepSpeed-Chat framework to implement large language models and ChatGPT-like systems.

Recommended citation: Qi Zhang, Tao Gui, Rui Zheng, Xuanjing Huang: Large Language Models: From Theory to Practice (2nd Edition), Electronic Industry Press, 2025. https://intro-llm.github.io/

The Rise and Potential of Large Language Model Based Agents: A Survey

Published in Science China Information Sciences, 2025

In this paper, we perform a comprehensive survey of LLM-based agents.

Recommended citation: Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Qi Zhang, Tao Gui: The Rise and Potential of Large Language Model Based Agents: A Survey. Science China Information Sciences 68(2), 2025. https://link.springer.com/article/10.1007/s11432-024-4222-0

Searching for Best Practices in Retrieval-Augmented Generation

Published in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Retrieval-augmented generation (RAG) techniques have proven effective at integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrieval, these approaches still suffer from complex implementations and prolonged response times.
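
For illustration, here is a minimal sketch of the query-dependent retrieval step that RAG pipelines share; `embed` and `generate` stand in for any embedding model and any LLM, and are assumptions rather than the paper's specific module choices.

```python
import numpy as np

# Minimal sketch of a query-dependent RAG step: retrieve the chunks most
# similar to the query, then condition generation on them. `embed` and
# `generate` are placeholders for an embedding model and an LLM.

def rag_answer(query, corpus, embed, generate, k=3):
    q = embed(query)
    doc_vecs = np.stack([embed(chunk) for chunk in corpus])
    scores = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-8)
    context = "\n".join(corpus[i] for i in np.argsort(scores)[-k:][::-1])
    return generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```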

Recommended citation: Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuanjing Huang. Searching for Best Practices in Retrieval-Augmented Generation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 17716–17736. Association for Computational Linguistics, 2024. https://aclanthology.org/2024.emnlp-main.981/

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Published in CoRR abs/2401.06080, 2024

From a data perspective, we propose a method to measure the strength of preferences within the data, based on a voting mechanism of multiple reward models. From an algorithmic standpoint, we introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses, thereby improving model generalization.
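
As a rough illustration of the two ideas, here is a sketch of ensemble voting as a preference-strength measure together with a standard pairwise ranking loss for reward models; the tensor shapes are assumptions, and the paper's contrastive formulation may differ from this simplification.

```python
import torch

# Sketch under assumed shapes: rewards[m, i] is model m's score for pair i.

def preference_strength(chosen_rewards, rejected_rewards):
    """Fraction of ensemble members voting chosen > rejected, per pair."""
    votes = (chosen_rewards > rejected_rewards).float()  # [n_models, n_pairs]
    return votes.mean(dim=0)                             # 1.0 = unanimous preference

def reward_loss(r_chosen, r_rejected):
    """Standard pairwise ranking loss: -log sigmoid(r_chosen - r_rejected)."""
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
```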

Recommended citation: Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang: Secrets of RLHF in Large Language Models Part II: Reward Modeling. CoRR abs/2401.06080 (2024) http://xuanjing-huang.github.io/files/reward.pdf

MouSi: Poly-Visual-Expert Vision-Language Models

Published in CoRR abs/2401.17221, 2024

This paper proposes an ensemble-experts technique that synergizes the capabilities of individual visual encoders, including those skilled in image-text matching, OCR, and image segmentation, among others.
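
A minimal sketch of the poly-visual-expert idea: run several visual encoders on the same image and fuse their features into the LLM's embedding space. The encoder list and the linear fusion below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Sketch of fusing several visual experts' features into one representation
# for the LLM. Encoder choices and linear fusion are illustrative assumptions.

class PolyVisualFusion(nn.Module):
    def __init__(self, experts, expert_dims, llm_dim):
        super().__init__()
        self.experts = nn.ModuleList(experts)        # e.g., image-text, OCR, segmentation encoders
        self.proj = nn.Linear(sum(expert_dims), llm_dim)

    def forward(self, image):
        feats = [expert(image) for expert in self.experts]
        return self.proj(torch.cat(feats, dim=-1))   # fused features in the LLM's space
```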

Recommended citation: Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang: MouSi: Poly-Visual-Expert Vision-Language Models. CoRR abs/2401.17221 (2024) http://xuanjing-huang.github.io/files/mousi.pdf

Introduction to Natural Language Processing

Published in Electronic Industry Press, 2023

With the widespread application of natural language processing and the rapid advancement of machine learning algorithms represented by deep learning, natural language processing algorithms and research tasks have developed rapidly in recent years. Since 2003, the authors have taught natural language processing courses for undergraduate, master's, and doctoral students at the School of Computer Science and Technology, Fudan University. This book summarizes years of teaching and research, aiming to provide readers with a more systematic and comprehensive understanding of natural language processing.

Recommended citation: Qi Zhang, Tao Gui, Xuanjing Huang: Introduction to Natural Language Processing, Electronic Industry Press, 2023. https://intro-nlp.github.io/

Secrets of RLHF in Large Language Models Part I: PPO

Published in CoRR abs/2307.04964, 2023

We dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the components of the PPO algorithm impact the training of the policy agent.
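
For reference, here is the clipped surrogate objective at the core of PPO in minimal form; advantage estimation (e.g., GAE) is assumed to happen upstream.

```python
import torch

# The PPO clipped surrogate objective, per-token. Inputs are log-probabilities
# under the new and old policies plus precomputed advantages.

def ppo_policy_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    ratio = torch.exp(logp_new - logp_old)            # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()      # maximize the clipped surrogate
```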

Recommended citation: Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang: Secrets of RLHF in Large Language Models Part I: PPO. CoRR abs/2307.04964 (2023) http://xuanjing-huang.github.io/files/rlhf.pdf

A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck

Published in Proceedings of the 29th International Conference on Computational Linguistics, 2022

In this paper, we propose a multi-format transfer learning model with a variational information bottleneck for event argument extraction (EAE) on new datasets.
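
To make the bottleneck term concrete, here is a sketch of a variational information bottleneck objective with a Gaussian posterior; the beta weighting and parameterization are illustrative assumptions, not the paper's exact setup.

```python
import torch

# VIB sketch: keep the latent z ~ q(z|x) = N(mu, sigma^2) predictive of the
# label (task loss) while compressing it toward a standard-normal prior (KL).

def vib_loss(task_loss, mu, logvar, beta=1e-3):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over batch
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
    return task_loss + beta * kl
```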

Recommended citation: Jie Zhou, Qi Zhang, Qin Chen, Liang He, Xuanjing Huang: A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck. COLING 2022: 1990-2000 http://xuanjing-huang.github.io/files/mft.pdf

K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters

Published in Findings of the Association for Computational Linguistics: ACL-IJCNLP, 2021

The paper proposes a framework that keeps the original parameters of the pre-trained model fixed and supports the development of versatile knowledge-infused models.
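
A minimal sketch in the spirit of the framework: the pre-trained backbone stays frozen while a small residual bottleneck adapter receives the gradients. Layer sizes here are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Residual bottleneck adapter: only these weights are trained; the
# pre-trained backbone's parameters stay frozen.

class Adapter(nn.Module):
    def __init__(self, hidden=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))  # residual bottleneck

# freeze the backbone, train only the adapters:
# for p in backbone.parameters(): p.requires_grad = False
```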

Recommended citation: Ruize Wang, Duyu Tang, Nan Duan, Zhongyu Wei, Xuanjing Huang, Jianshu Ji, Guihong Cao, Daxin Jiang, Ming Zhou: K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters. ACL/IJCNLP (Findings) 2021: 1405-1418 http://xuanjing-huang.github.io/files/K-Adapter.pdf

Extractive Summarization as Text Matching

Published in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

This paper proposes a paradigm shift in the way we build neural extractive summarization systems, formulating extractive summarization as a semantic text-matching problem.
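
The matching formulation can be sketched in a few lines: embed the document and each candidate summary in a shared space and select the candidate closest to the document. `encode` stands in for a siamese sentence encoder and is an assumption.

```python
import torch

# Summarization-as-matching sketch: score each candidate summary by its
# cosine similarity to the source document in a shared embedding space.

def best_candidate(document, candidates, encode):
    doc_vec = encode(document)                                # [d]
    cand_vecs = torch.stack([encode(c) for c in candidates])  # [n, d]
    scores = torch.nn.functional.cosine_similarity(cand_vecs, doc_vec.unsqueeze(0))
    return candidates[int(scores.argmax())]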

Recommended citation: Ming Zhong, Pengfei Liu, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang: Extractive Summarization as Text Matching. ACL 2020: 6197-6208 http://xuanjing-huang.github.io/files/ext.pdf

Simplify the Usage of Lexicon in Chinese NER

Published in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

In this work, we propose a simple but effective method for incorporating the word lexicon into the character representations.
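
As a simplified sketch of the method's spirit: matched lexicon-word embeddings are pooled and concatenated onto each character embedding. The actual method groups matches by B/M/E/S position within words; that grouping is collapsed into a single pooled set here for brevity.

```python
import torch

# Fuse lexicon information into a character representation by pooling the
# embeddings of lexicon words that cover this character.

def fuse_lexicon(char_emb, matched_word_embs):
    """char_emb: [d_c]; matched_word_embs: [n_words, d_w] tensor of matched words."""
    if len(matched_word_embs) > 0:
        pooled = matched_word_embs.mean(dim=0)
    else:
        pooled = torch.zeros(matched_word_embs.shape[-1])
    return torch.cat([char_emb, pooled])  # enriched representation fed to the encoder
```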

Recommended citation: Ruotian Ma, Minlong Peng, Qi Zhang, Zhongyu Wei, Xuanjing Huang: Simplify the Usage of Lexicon in Chinese NER. ACL 2020: 5951-5960 http://xuanjing-huang.github.io/files/Simplify.pdf

FLAT: Chinese NER Using Flat-Lattice Transformer

Published in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

In this paper, we propose FLAT: Flat-LAttice Transformer for Chinese NER, which converts the lattice structure into a flat structure consisting of spans.
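
The conversion can be sketched directly: every character and every matched lexicon word becomes a span with head and tail character positions, so the lattice flattens into one sequence. The sentence and matches below are illustrative.

```python
# Flattening a lattice: each character and each matched lexicon word becomes a
# (token, head, tail) span over character positions.

sentence = "重庆人和药店"
lexicon_matches = [("重庆", 0, 1), ("药店", 4, 5)]        # (word, head, tail)

spans = [(ch, i, i) for i, ch in enumerate(sentence)]     # characters: head == tail
spans += [(word, h, t) for (word, h, t) in lexicon_matches]
# relative distances between span heads/tails drive the transformer's attention
```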

Recommended citation: Xiaonan Li, Hang Yan, Xipeng Qiu, Xuanjing Huang: FLAT: Chinese NER Using Flat-Lattice Transformer. ACL 2020: 6836-6842 http://xuanjing-huang.github.io/files/FLAT.pdf

A Lexicon-Based Graph Neural Network for Chinese NER

Published in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019

In this work, we introduce a lexicon-based graph neural network with global semantics for Chinese NER.

Recommended citation: Tao Gui, Yicheng Zou, Qi Zhang, Minlong Peng, Jinlan Fu, Zhongyu Wei, Xuanjing Huang: A Lexicon-Based Graph Neural Network for Chinese NER. EMNLP/IJCNLP (1) 2019: 1040-1050 http://xuanjing-huang.github.io/files/ALB.pdf

Adversarial Multi-Criteria Learning for Chinese Word Segmentation

Published in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

In this paper, we propose adversarial multi-criteria learning for CWS by integrating shared knowledge from multiple heterogeneous segmentation criteria.

Recommended citation: Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang: Adversarial Multi-Criteria Learning for Chinese Word Segmentation. ACL (1) 2017: 1193-1203 http://xuanjing-huang.github.io/files/cws.pdf

Adversarial Multi-task Learning for Text Classification

Published in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

The paper proposes an adversarial multi-task learning framework that prevents the shared and task-specific (private) latent feature spaces from interfering with each other.
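
One common way to implement such an adversarial objective is a gradient-reversal layer, sketched below; the paper's exact formulation (GAN-style task discrimination) may differ from this simplification.

```python
import torch

# Gradient-reversal sketch: a task discriminator trained on reversed gradients
# pushes the *shared* feature space to become task-invariant, leaving
# task-specific signal to the private spaces.

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output            # flip gradients flowing into the shared encoder

# shared_feat = shared_encoder(x)
# task_logits = discriminator(GradReverse.apply(shared_feat))  # adversarial loss
```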

Recommended citation: Pengfei Liu, Xipeng Qiu, Xuanjing Huang: Adversarial Multi-task Learning for Text Classification. ACL (1) 2017: 1-10 http://xuanjing-huang.github.io/files/AMT.pdf