AI-Guided Policy Learning For Hyperdimensional Sampling: Exploiting Expert Human Demonstrations From Interactive Virtual Reality Molecular Dynamics

Dwi Jatmiko; Huu Nguyen

Open Access


Dwi Jatmiko ¹, Huu Nguyen ²

¹ Faculty of Computing and Data Science, University of Indonesia, Depok, Indonesia

² School of Computer Science, Vietnam National University, Hanoi, Vietnam

Abstract

Introduction: Molecular Dynamics (MD) simulations are fundamentally limited by the hyperdimensional sampling problem, which hinders the observation of rare but critical molecular events such as ligand unbinding. Interactive Molecular Dynamics in Virtual Reality (iMD-VR) has emerged as a human-in-the-loop solution, leveraging human spatial intuition to efficiently navigate complex conformational landscapes. This approach generates a unique, high-fidelity dataset of expert human demonstrations.

Methods: This study explores the feasibility of using these iMD-VR datasets to train autonomous Artificial Intelligence (AI) agents with Imitation Learning (IL) strategies. We implemented and evaluated both a basic Behavioral Cloning (BC) approach and a more robust Generative Adversarial Imitation Learning (GAIL) framework, augmented with strategies to mitigate covariate shift, for the task of guiding a ligand along an unbinding pathway. The state space was engineered to encode the hyperdimensional molecular configuration, and the action space was defined by the applied force vector.
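As a rough illustration of the state-to-force mapping described above, the sketch below fits a linear behavioral cloning policy by ridge regression and shows one common form of GAIL-style surrogate reward. It is not the authors' implementation: the feature dimensions, the linear policy class, the synthetic demonstrations, and all function names (`fit_bc_policy`, `bc_action`, `gail_surrogate_reward`) are assumptions made for the example.

```python
import numpy as np

def fit_bc_policy(states, actions, ridge=1e-3):
    """Behavioral cloning as ridge regression: actions ~ [states, 1] @ W."""
    X = np.hstack([states, np.ones((len(states), 1))])  # append bias column
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ actions)

def bc_action(W, state):
    """Predict a 3D force vector for a single state feature vector."""
    return np.append(state, 1.0) @ W

def gail_surrogate_reward(d_expert_prob, eps=1e-8):
    """Common GAIL-style reward, -log(1 - D), where D is the discriminator's
    estimated probability that a (state, action) pair came from the expert."""
    return -np.log(1.0 - np.asarray(d_expert_prob) + eps)

# Synthetic stand-in for iMD-VR demonstrations (hypothetical, not the paper's data).
rng = np.random.default_rng(0)
n_samples, state_dim, force_dim = 500, 6, 3
W_true = rng.normal(size=(state_dim, force_dim))
states = rng.normal(size=(n_samples, state_dim))
expert_forces = states @ W_true + 0.01 * rng.normal(size=(n_samples, force_dim))

# Fit the cloned policy and query it on one demonstrated state.
W = fit_bc_policy(states, expert_forces)
predicted_force = bc_action(W, states[0])
```

A cloned policy of this kind is only reliable on states close to those in the demonstrations, which is the covariate shift issue noted above. In an adversarial setup, a reward such as `gail_surrogate_reward` would instead score the agent's own rollouts, so the policy also receives a learning signal on states the expert never visited.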

Results: The GAIL-trained policy achieved a significantly higher task success rate than the BC model and reproduced the expert’s ability to apply forces that overcome high-energy barriers. Autonomous agent trajectories showed high fidelity to the expert’s path, explored pharmacologically relevant conformational space, and achieved up to a $65\%$ reduction in the effective energy barrier in the tested systems.

Discussion: The findings indicate that IL, particularly advanced methods such as GAIL, can translate expert human intuition from a VR environment into robust, autonomous policies for sampling hyperdimensional molecular systems. By enabling autonomous exploration of rare-event pathways, this AI-guided approach could broaden access to molecular discovery and accelerate it, with implications for computer-aided drug design and materials science.

Keywords

Artificial Intelligence, Molecular Dynamics, Virtual Reality, Imitation Learning


International Journal of Advanced Artificial Intelligence Research
