Dynamic Multi-Objective Recommendation Via Discrete Soft Actor-Critic
Abstract
Modern recommender systems often need to optimize multiple conflicting objectives, such as accuracy, diversity, and novelty. This paper proposes a novel approach, Dynamic Multi-Objective Recommendation via Discrete Soft Actor-Critic (DMOR-DSAC), to address this challenge. DMOR-DSAC employs a reinforcement learning framework with a discrete action space, utilizing the Soft Actor-Critic algorithm to learn a policy that dynamically balances these objectives. Experimental results on benchmark datasets demonstrate the effectiveness of DMOR-DSAC in achieving superior multi-objective performance compared to existing methods.
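The abstract does not spell out the update rules, so the following is only a minimal sketch of how a discrete-action Soft Actor-Critic could be combined with several objectives. It assumes a linear weighted-sum scalarization of the per-objective rewards (accuracy, diversity, novelty) and uses the standard discrete-action forms of the soft state value and the actor objective. The function names, the weight vector, and the numeric values are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Turn per-item logits into a categorical policy pi(a|s)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scalarize(reward_vec, weights):
    """Assumed linear scalarization: weighted sum of per-objective rewards."""
    return sum(w * r for w, r in zip(weights, reward_vec))


def soft_state_value(q_values, probs, alpha):
    """V(s) = sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s))."""
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, probs) if p > 0)


def actor_loss(q_values, probs, alpha):
    """J_pi(s) = sum_a pi(a|s) * (alpha * log pi(a|s) - Q(s,a))."""
    return sum(p * (alpha * math.log(p) - q)
               for q, p in zip(q_values, probs) if p > 0)


def soft_q_target(reward, next_q, next_probs, alpha, gamma, done):
    """Bootstrapped critic target: r + gamma * V(s') unless the episode ended."""
    if done:
        return reward
    return reward + gamma * soft_state_value(next_q, next_probs, alpha)


if __name__ == "__main__":
    # Hypothetical rewards for one recommended item: accuracy, diversity, novelty.
    reward_vec = [1.0, 0.5, 0.0]
    # Hypothetical objective weights; a dynamic scheme would vary these over time.
    weights = [0.5, 0.3, 0.2]
    r = scalarize(reward_vec, weights)

    probs = softmax([0.0, 0.0, 0.0])   # uniform policy over three candidate items
    q_next = [1.0, 2.0, 3.0]           # hypothetical critic outputs for the next state
    print(r, soft_q_target(r, q_next, probs, alpha=0.2, gamma=0.9, done=False))
    print(actor_loss(q_next, probs, alpha=0.2))
```

Because the action space is discrete, the expectations over actions are computed exactly as sums over the policy's probabilities rather than estimated by sampling, which is the main practical difference from the continuous-action formulation.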
Keywords
Multi-objective recommendation, reinforcement learning, Soft Actor-Critic
Copyright (c) 2025 Dr. Wei Jun Liu, Yiming Chen (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.