AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress
Published in The ACM Web Conference 2026 (WWW 2026), 2026
We propose AgentPRM, a redefined process reward model for LLM agent tasks that captures both the interdependence between sequential decisions and each decision's contribution to the final goal, enabling better progress tracking and a better balance between exploration and exploitation.
Recommended citation: Zhiheng Xi, Chenyang Liao, Guanyu Li, Yajie Yang, Wenxiang Chen, Zhihao Zhang, Bing Wang, Senjie Jin, Yuhao Zhou, Jian Guan, Wei Wu, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang. AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress. In Proceedings of the ACM Web Conference 2026 (WWW 2026). https://arxiv.org/abs/2511.08325
