TY - JOUR
T1 - Reinforcement Learning with Selective Exploration for Interference Management in MmWave Networks
AU - Dinh-Van, Son
AU - Nguyen, Van-Linh
AU - Bulut Cebecioglu, Berna
AU - Masaracchia, Antonino
AU - Higgins, Matthew D.
PY - 2025/2/3
Y1 - 2025/2/3
N2 - The next generation of wireless systems will leverage the millimeter-wave (mmWave) bands to meet the increasing traffic volume and high data rate requirements of emerging applications (e.g., ultra HD streaming, metaverse, and holographic telepresence). In this paper, we address the joint optimization of beamforming, power control, and interference management in multi-cell mmWave networks. We propose novel reinforcement learning algorithms, including a single-agent-based method (BPC-SA) for centralized settings and a multi-agent-based method (BPC-MA) for distributed settings. To tackle the high-variance rewards caused by narrow antenna beamwidths, we introduce a selective exploration method to guide the agent towards more intelligent exploration. Our proposed algorithms are well-suited for scenarios where beamforming vectors require control in either a discrete domain, such as a codebook, or in a continuous domain. Furthermore, they do not require channel state information, extensive feedback from user equipment, or any search methods, thus reducing overhead and enhancing scalability. Numerical results demonstrate that selective exploration improves per-user spectral efficiency by up to 22.5% compared to scenarios without it. Additionally, our algorithms significantly outperform existing methods by 50% in terms of per-user spectral efficiency and achieve 90% of the per-user spectral efficiency of the exhaustive search approach while requiring only 0.1% of its computational runtime.
AB - The next generation of wireless systems will leverage the millimeter-wave (mmWave) bands to meet the increasing traffic volume and high data rate requirements of emerging applications (e.g., ultra HD streaming, metaverse, and holographic telepresence). In this paper, we address the joint optimization of beamforming, power control, and interference management in multi-cell mmWave networks. We propose novel reinforcement learning algorithms, including a single-agent-based method (BPC-SA) for centralized settings and a multi-agent-based method (BPC-MA) for distributed settings. To tackle the high-variance rewards caused by narrow antenna beamwidths, we introduce a selective exploration method to guide the agent towards more intelligent exploration. Our proposed algorithms are well-suited for scenarios where beamforming vectors require control in either a discrete domain, such as a codebook, or in a continuous domain. Furthermore, they do not require channel state information, extensive feedback from user equipment, or any search methods, thus reducing overhead and enhancing scalability. Numerical results demonstrate that selective exploration improves per-user spectral efficiency by up to 22.5% compared to scenarios without it. Additionally, our algorithms significantly outperform existing methods by 50% in terms of per-user spectral efficiency and achieve 90% of the per-user spectral efficiency of the exhaustive search approach while requiring only 0.1% of its computational runtime.
UR - https://www.open-access.bcu.ac.uk/16493/
U2 - 10.1109/TMLCN.2025.3537967
DO - 10.1109/TMLCN.2025.3537967
M3 - Article
VL - 3
SP - 280
EP - 295
JO - IEEE Transactions on Machine Learning in Communications and Networking
JF - IEEE Transactions on Machine Learning in Communications and Networking
ER -