
2024 | Original Paper | Book Chapter

Towards Cost-Efficient Federated Multi-agent RL with Learnable Aggregation

Authors: Yi Zhang, Sen Wang, Zhi Chen, Xuwei Xu, Stano Funiak, Jiajun Liu

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Multi-agent reinforcement learning (MARL) often adopts the centralized training with decentralized execution (CTDE) framework to facilitate cooperation among agents. When deploying MARL algorithms in real-world scenarios, however, CTDE requires gradient transmission and parameter synchronization at every training step, which can incur prohibitive communication overhead. To improve communication efficiency, federated MARL has been proposed, which averages gradients only periodically during communication. However, such straightforward averaging leads to poor coordination and slow convergence arising from the non-i.i.d. problem, as our theoretical analysis shows. To address these two challenges, we propose a federated MARL framework, termed cost-efficient federated multi-agent reinforcement learning with learnable aggregation (FMRL-LA). Specifically, we use asynchronous critics to optimize communication efficiency by filtering out redundant local updates based on estimates of agent utilities. A centralized aggregator rectifies these estimates conditioned on global information, improving cooperation and reducing the non-i.i.d. impact by maximizing the composite system objectives. For a comprehensive evaluation, we extend a challenging multi-agent autonomous driving environment to the federated learning paradigm and compare our method against competitive MARL baselines. Our findings indicate that FMRL-LA adeptly balances performance and efficiency. Code and appendix are available at https://github.com/ArronDZhang/FMRL_LA.
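The abstract describes two mechanisms: periodic averaging of local updates (as in federated MARL) and filtering of redundant local updates based on estimated agent utilities. The following is a minimal sketch of that idea, not the paper's actual algorithm; the function name `federated_round`, the scalar utility scores, and the fixed threshold are illustrative assumptions (the paper's utility estimation and learnable aggregation are more involved).

```python
import numpy as np

def federated_round(local_params, utilities, threshold=0.5):
    """Aggregate local parameter vectors, keeping only agents whose
    estimated utility meets a threshold (a hypothetical stand-in for
    the paper's utility-based filtering of redundant updates)."""
    selected = [p for p, u in zip(local_params, utilities) if u >= threshold]
    if not selected:
        # Fall back to plain averaging (FedAvg-style) if no agent passes.
        selected = local_params
    return np.mean(selected, axis=0)

# Three agents with two-parameter models and estimated utilities;
# the low-utility outlier is excluded from the aggregate.
params = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([100.0, 100.0])]
utils = [0.9, 0.8, 0.1]
global_params = federated_round(params, utils)
print(global_params)  # -> [2. 3.]
```

In a federated setting, only the agents that pass the filter would transmit their updates, which is where the communication savings come from; the paper's centralized aggregator additionally learns to rectify the utility estimates from global information.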


Metadata
DOI
https://doi.org/10.1007/978-981-97-2253-2_14
