Research on Generalization of Multi-Agent Based on CMDP
First published: 2023-01-10
Abstract: To address the problem of task generalization in multi-agent systems, this paper proposes a multi-agent reinforcement learning algorithm based on the contextual Markov decision process (CMDP). The algorithm includes a context information module that uses an attention mechanism to learn a representation of the context. The context carries additional task-related information which, in a multi-task setting, can be used to infer the relationships between tasks. Experience gained on previous tasks can therefore be stored and reused on new tasks, which enables generalization across tasks. In addition, an action subspace is designed from the context information, which addresses the space explosion problem and speeds up agent training. In a comparison with existing algorithms in the multi-agent particle environment, the experimental results show that agents trained with the proposed method perform better on other tasks.
Keywords: Reinforcement learning; Multi-agent; CMDP; Generalization