SMIX(λ): Enhancing centralized value functions for cooperative multi-agent reinforcement learning

Chao Wen*, Xinghu Yao*, Yuhui Wang, Xiaoyang Tan

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    14 Scopus citations

    Abstract

    This work presents a sample-efficient and effective value-based method, named SMIX(λ), for multi-agent reinforcement learning (MARL) within the paradigm of centralized training with decentralized execution (CTDE), in which learning a stable and generalizable centralized value function (CVF) is crucial. To achieve this, our method combines three elements: 1) removing the unrealistic centralized greedy assumption during the learning phase, 2) using the λ-return to balance the trade-off between bias and variance and to deal with the environment’s non-Markovian property, and 3) adopting experience-replay-style off-policy training. Interestingly, we show that there is an inherent connection between SMIX(λ) and the earlier off-policy Q(λ) approach for single-agent learning. Experiments on the StarCraft Multi-Agent Challenge (SMAC) benchmark show that the proposed SMIX(λ) algorithm outperforms several state-of-the-art MARL methods by a large margin, and that it can be used as a general tool to improve the overall performance of a CTDE-type method by enhancing the evaluation quality of its CVF. We open-source our code at: https://github.com/chaovven/SMIX.
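
To make the bias-variance role of the λ-return concrete, here is a minimal, generic sketch of how λ-returns can be computed by backward recursion over one trajectory. It is not the authors' implementation (see their repository for that). The function name `lambda_returns`, its parameters, and the toy numbers are hypothetical, and `next_values` stands in for whatever value estimates a centralized value function would supply.

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.8):
    """Compute lambda-returns for one trajectory (illustrative sketch).

    rewards[t]     : reward received at step t
    next_values[t] : value estimate of the state reached after step t
                     (use 0.0 for a terminal state)
    Recursion: G_t = r_t + gamma * ((1 - lam) * V_{t+1} + lam * G_{t+1})
    """
    returns = [0.0] * len(rewards)
    # Past the end of the trajectory, bootstrap from the last value estimate.
    next_return = next_values[-1]
    for t in reversed(range(len(rewards))):
        returns[t] = rewards[t] + gamma * (
            (1 - lam) * next_values[t] + lam * next_return
        )
        next_return = returns[t]
    return returns


# Toy trajectory (assumed numbers) ending in a terminal state.
rewards = [1.0, 1.0, 1.0]
next_values = [0.5, 0.5, 0.0]

# lam = 0 gives one-step TD targets (lower variance, more bias).
td_targets = lambda_returns(rewards, next_values, gamma=0.5, lam=0.0)
# lam = 1 gives Monte Carlo returns (no bootstrap bias, more variance).
mc_returns = lambda_returns(rewards, next_values, gamma=0.5, lam=1.0)
```

Intermediate values of `lam` interpolate between these two extremes, which is the trade-off the abstract refers to.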

    Original language: English (US)
    Title of host publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
    Publisher: AAAI Press
    Pages: 7301-7308
    Number of pages: 8
    ISBN (Electronic): 9781577358350
    State: Published - 2020
    Event: 34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
    Duration: Feb 7 2020 to Feb 12 2020

    Publication series

    Name: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence

    Conference

    Conference: 34th AAAI Conference on Artificial Intelligence, AAAI 2020
    Country/Territory: United States
    City: New York
    Period: 02/7/20 to 02/12/20

    ASJC Scopus subject areas

    • Artificial Intelligence
