INVESTIGATING DATA GENERATION STRATEGIES FOR LEARNING HEURISTIC FUNCTIONS IN CLASSICAL PLANNING
DOI:
https://doi.org/10.55640/ijaair-v02i04-01Keywords:
Classical planning, heuristic functions, data generation strategies, machine learningAbstract
In classical planning, the efficiency and effectiveness of heuristic functions are crucial for guiding search algorithms toward optimal solutions. This study investigates various data generation strategies for training machine learning models to learn heuristic functions in classical planning domains. By comparing approaches such as random sampling, goal-directed sampling, and domain-specific guided data collection, the research evaluates their impact on the accuracy and generalizability of learned heuristics. Experimental results across benchmark planning problems reveal that the choice of data generation strategy significantly influences the performance of the resulting heuristics. The study provides insights into the trade-offs between data diversity, representativeness, and computational efficiency, contributing to the development of more robust learning-based planning systems.
References
Agostinelli, F., McAleer, S., Shmakov, A., & Baldi, P. (2019). Solving the Rubik’s cube with deep reinforcement learning and search. Nature Machine Intelligence, 1(8), 356–363.
Alcazar, V., Borrajo, D., Fernandez, S., & Fuentetaja, R. (2013). Revisiting regression in planning. International Joint Conference on Artificial Intelligence, 2254–2260.
Arfaee, S. J., Zilles, S., & Holte, R. C. (2011). Learning heuristic functions for large state spaces. Artificial Intelligence, 175(16-17), 2075–2098.
Bäckström, C., & Nebel, B. (1995). Complexity results for SAS+ planning. Computational Intelligence, 11(4), 625–655.
Bonet, B. (2013). An Admissible Heuristic for SAS+ Planning Obtained from the State Equation. International Joint Conference on Artificial Intelligence, 2268–2274.
Bonet, B., & Geffner, H. (2001). Planning as heuristic search. Artificial Intelligence, 129(1-2), 5–33.
Culberson, J. C., & Schaeffer, J. (1998). Pattern databases. Computational Intelligence, 14(3), 318–334.
Dong, H., Mao, J., Lin, T., Wang, C., Li, L., & Zhou, D. (2019). Neural logic machines [Poster]. International Conference on Learning Representations.
Doran, J. E., & Michie, D. (1966). Experiments with the Graph Traverser Program. Proceedings of the Royal Society A, 294, 235–259.
Edelkamp, S. (2001). Planning with pattern databases. European Conference on Planning, 13–24.
Ferber, P., Geißer, F., Trevizan, F., Helmert, M., & Hoffmann, J. (2022). Neural network heuristic functions for classical planning: Bootstrapping and comparison to other methods. International Conference on Automated Planning and Scheduling, 583–587.
Ferber, P., Helmert, M., & Hoffmann, J. (2020). Neural network heuristics for classical planning: A study of hyperparameter space. European Conference on Artificial Intelligence, 325, 2346–2353.
Frances, G., Ramírez Jávega, M., Lipovetzky, N., & Geffner, H. (2017). Purely declarative action descriptions are overrated: Classical planning with simulators. International Joint Conference on Artificial Intelligence, 4294–4301.
Francés, G., Ramirez, M., & Collaborators. (2018). Tarski: An AI planning modeling framework [GitHub: https://github.com/aig-upf/tarskio].
Garrett, C. R., Kaelbling, L. P., & Lozano-Pérez, T. (2016). Learning to rank for synthesizing planning heuristics. International Joint Conferences on Artificial Intelligence, 3089–3095.
Gehring, C., Asai, M., Chitnis, R., Silver, T., Kaelbling, L. P., Sohrabi, S., & Katz, M. (2022). Reinforcement learning for classical planning: Viewing heuristics as dense reward generators. International Conference on Automated Planning and Scheduling, 32, 588–596.
Gori, M., Monfardini, G., & Scarselli, F. (2005). A new model for learning in graph domains. IEEE International Joint Conference on Neural Networks, 2(2005), 729–734.
Hart, P. E., Nilsson, N. J., & Raphael, B. (1968). A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107.
Haslum, P., Botea, A., Helmert, M., Bonet, B., Koenig, S., et al. (2007). Domain-independent construction of pattern database heuristics for cost-optimal planning. AAAI Conference on Artificial Intelligence, 1007–1012.
Haslum, P., & Geffner, H. (2000). Admissible Heuristics for Optimal Planning. International Conference on Artificial Intelligence Planning Systems, 140–149.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. International Conference on Computer Vision, 1026–1034.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition, 770–778.
Helmert, M. (2006). The Fast Downward planning system. Journal of Artificial Intelligence Research, 26, 191–246.
Helmert, M. (2009). Concise finite-domain representations for PDDL planning tasks. Artificial Intelligence, 173(5-6), 503–535.
Helmert, M., & Domshlak, C. (2009). Landmarks, critical paths and abstractions: What’s the difference anyway? International Conference on Automated Planning and Scheduling, 162–169.
Helmert, M., Haslum, P., Hoffmann, J., et al. (2007). Flexible abstraction heuristics for optimal sequential planning. International Conference on Automated Planning and Scheduling, 176–183.
Hoffmann, J., & Nebel, B. (2001). The FF planning system: Fast plan generation through heuristic search. Journal of Artificial Intelligence Research, 14, 253–302.
Hoffmann, J., Porteous, J., & Sebastia, L. (2004). Ordered Landmarks in Planning. Journal of Artificial Intelligence Research, 22, 215–278.
Hollander, M., Wolfe, D. A., & Chicken, E. (2014). Nonparametric statistical methods (3rd ed.). Wiley.
Holte, R. C. (2010). Common misconceptions concerning heuristic search. Symposium on Combinatorial Search, 46–51.
Karpas, E., & Domshlak, C. (2009). Cost-optimal planning with landmarks. International Joint Conference on Artificial Intelligence, 1728–1733.
Kingma, D., & Ba, J. (2015). Adam: A method for stochastic optimization. International Conference on Learning Representations.
Lu, L., Shin, Y., Su, Y., & Karniadakis, G. E. (2020). Dying ReLU and initialization: Theory and numerical examples. Communications in Computational Physics, 28(5), 1671–1706.
O’Toole, S., Ramirez, M., Lipovetzky, N., & Pearce, A. R. (2022). Sampling from pre-images to learn heuristic functions for classical planning. Symposium on Combinatorial Search, 15(1), 308–310.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., . . . Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. Advances In Neural Information Processing Systems, 32, 8024–8035.
Rivlin, O., Hazan, T., & Karpas, E. (2020). Generalized planning with deep reinforcement learning. arXiv: 2005.02305 [cs.AI].
Samadi, M., Felner, A., & Schaeffer, J. (2008). Learning from Multiple Heuristics. AAAI Conference on Artificial Intelligence, 357–362.
Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., & Monfardini, G. (2008). The graph neural network model. IEEE Transactions on Neural Networks, 20(1), 61–80.
Shen, W., Trevizan, F., & Thiebaux, S. (2020). Learning domain-independent planning heuristics with hypergraph networks. International Conference on Automated Planning and Scheduling, 574–584.
Ståhlberg, S., Bonet, B., & Geffner, H. (2022). Learning general optimal policies with graph neural networks: Expressive power, transparency, and limits. International Conference on Automated Planning and Scheduling, 32, 629–637.
Sturtevant, N., & Helmert, M. (2019). Exponential-binary state-space search. arXiv: 1906.02912 [cs.AI].
Toyer, S., Thiebaux, S., Trevizan, F., & Xie, L. (2020). ASNets: Deep learning for generalised planning. Journal of Artificial Intelligence Research, 68, 1–68.
Toyer, S., Trevizan, F., Thiebaux, S., & Xie, L. (2018). Action schema networks: Generalised policies with deep learning. AAAI Conference on Artificial Intelligence, 6294–6301.
van Den Briel, M., Benton, J., Kambhampati, S., & Vossen, T. (2007). An LP-based heuristic for optimal planning. Principles and Practice of Constraint Programming, 651–665.
Yu, L., Kuroiwa, R., & Fukunaga, A. (2020). Learning search-space specific heuristics using neural network. ICAPS Workshop on Heuristics and Search for Domain-independent Planning.
Downloads
Published
How to Cite
Issue
Section
License
Copyright (c) 2025 Elena Volkova, Emily Smith (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain the copyright of their manuscripts, and all Open Access articles are disseminated under the terms of the Creative Commons Attribution License 4.0 (CC-BY), which licenses unrestricted use, distribution, and reproduction in any medium, provided that the original work is appropriately cited. The use of general descriptive names, trade names, trademarks, and so forth in this publication, even if not specifically identified, does not imply that these names are not protected by the relevant laws and regulations.