Reinforcement learning for energy efficiency improvement in UAV-BS access networks: A knowledge transfer scheme

Zhiqun Hu, Yujing Zhang, Hao Huang, Xiangming Wen, Obinna Agbodike, Jenhui Chen*

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

14 Scopus citations

Abstract

Recent work has validated the possibility of forming unmanned aerial vehicle base station (UAV-BS) network systems with energy harvesting capabilities to support persistent wireless access services for pedestrian users. Because the UAV-BSs must sustain these wireless access services, we investigate an optimal policy that maximizes their overall utilization efficiency of renewable energy during active in-flight network access operations. Since natural sources of renewable energy (e.g., solar or wind energy harvesting) arrive stochastically and the dynamics of the environment are unknown, we exploit an actor–critic reinforcement learning framework that handles continuous-valued state and action spaces and learns the best policy through interaction with the environment. To enhance and expedite the learning process, we propose a transfer asynchronous advantage actor–critic (TA3C) algorithm, which enables UAV-BSs to transfer (i.e., share) knowledge gained in historical periods while tasks execute asynchronously in parallel on multiple instances of the environment. Numerical results show that the proposed TA3C algorithm outperforms the classic A3C and A2C algorithms in terms of throughput and energy utilization efficiency.
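For readers unfamiliar with the A3C and A2C baselines named in the abstract, the sketch below shows the generic n-step advantage estimate that advantage actor–critic methods typically use to weight policy updates. It is not the paper's TA3C algorithm, and the knowledge transfer step is not shown. The function name `n_step_advantages`, its parameters, and the discount factor value are illustrative assumptions, not taken from the article.

```python
def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Return (returns, advantages) for one rollout segment.

    Hypothetical helper for illustration only.
    rewards[t]      : reward received after acting in state t
    values[t]       : critic's estimate V(s_t)
    bootstrap_value : critic's estimate for the state after the last step
                      (use 0.0 if the episode terminated)
    gamma           : discount factor (assumed value, not from the article)
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk the segment backwards, accumulating the discounted return.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running

    # Advantage = how much better the observed return was than the
    # critic's prediction for that state.
    advantages = [ret - v for ret, v in zip(returns, values)]
    return returns, advantages


if __name__ == "__main__":
    rets, advs = n_step_advantages([1.0, 1.0], [0.5, 0.5], 0.0, gamma=0.5)
    print(rets, advs)
```

In an asynchronous setup, each parallel worker would compute such advantages on its own copy of the environment and send the resulting gradients to shared parameters. How TA3C additionally reuses knowledge from historical periods is described in the article itself.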

Original language: English
Article number: 105930
Journal: Engineering Applications of Artificial Intelligence
Volume: 120
State: Published - April 2023

Bibliographical note

Publisher Copyright:
© 2023 Elsevier Ltd

Keywords

  • Algorithm
  • Energy harvesting
  • NP-hard
  • Reinforcement learning
  • Resource management
  • UAV
