2023 | Original Paper | Book Chapter

Heuristic Search Optimisation Using Planning and Curriculum Learning Techniques

Authors: Leah Chrestien, Tomáš Pevný, Stefan Edelkamp, Antonín Komenda

Published in: Progress in Artificial Intelligence

Publisher: Springer Nature Switzerland


Abstract

Learning a well-informed heuristic function for hard planning domains is an elusive problem. Although there are known neural network architectures to represent such heuristic knowledge, it is not obvious what concrete information is learned and whether techniques aimed at understanding the structure help in improving the quality of the heuristics. This paper presents a network model that learns a heuristic function capable of relating distant parts of the state space via optimal plan imitation using the attention mechanism, which drastically improves the learning of a good heuristic function. The learning of this heuristic function is further improved by curriculum learning, where newly solved problem instances are added to the training set, which in turn helps to solve problems of higher complexity and to train on harder problem instances. The methods presented in this paper far exceed the performance of all existing baselines, including known deep learning approaches and classical planning heuristics. We demonstrate their effectiveness on grid-type PDDL domains, namely Sokoban, maze-with-teleports and sliding-tile puzzles.
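
The curriculum-learning loop described in the abstract can be sketched in a few lines. The Python sketch below is an illustration only, not the authors' implementation; the callables solve, extract, train and harder_instances are hypothetical placeholders for the paper's planner, plan-to-training-data extraction, heuristic trainer and problem generator.

from typing import Any, Callable, List, Tuple


def curriculum_learning(
    instances: List[Any],
    heuristic: Any,
    solve: Callable[[Any, Any], Any],                      # (problem, heuristic) -> optimal plan or None
    extract: Callable[[Any], List[Tuple[Any, float]]],     # plan -> (state, cost-to-go) training pairs
    train: Callable[[Any, List[Tuple[Any, float]]], Any],  # retrain the heuristic on the pairs
    harder_instances: Callable[[int], List[Any]],          # generator of harder problem instances
    n_rounds: int = 10,
) -> Any:
    """Curriculum loop: imitate plans of solved instances, then attempt harder ones."""
    training_set: List[Tuple[Any, float]] = []
    for round_idx in range(n_rounds):
        solved_plans, unsolved = [], []
        for problem in instances:
            plan = solve(problem, heuristic)   # e.g. A*/GBFS guided by the current heuristic
            if plan is not None:
                solved_plans.append(plan)
            else:
                unsolved.append(problem)
        for plan in solved_plans:              # newly solved instances are added ...
            training_set.extend(extract(plan)) # ... to the training set (optimal plan imitation)
        heuristic = train(heuristic, training_set)
        instances = unsolved + harder_instances(round_idx)  # move on to harder problems
    return heuristic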


Footnotes
1
The composability of harmonic functions is based on the following property: \(\cos (\theta _1 + \theta _2) = \cos (\theta _1)\cos (\theta _2) - \sin (\theta _1)\sin (\theta _2) = (\cos (\theta _1), \sin (\theta _1)) \cdot (\cos (\theta _2), -\sin (\theta _2)),\) where \(\cdot \) denotes the inner product of two vectors, which appears in Eq. (1) in the inner product of \({\textbf {q}}_{u,v}\) and \({\textbf {k}}_{r,s}\).
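
The identity above can be checked numerically. The short Python check below is only a sanity check of the trigonometric identity; Eq. (1) itself is not reproduced on this page, so the connection to \({\textbf {q}}_{u,v}\) and \({\textbf {k}}_{r,s}\) is stated in a comment rather than shown.

import math
import random

# Check: cos(t1 + t2) = cos(t1)cos(t2) - sin(t1)sin(t2)
#                     = (cos(t1), sin(t1)) . (cos(t2), -sin(t2)).
# This is the property that lets an inner product of query and key vectors
# (q_{u,v} and k_{r,s} in Eq. (1)) depend on the relative offset of two positions.
for _ in range(1000):
    t1 = random.uniform(-math.pi, math.pi)
    t2 = random.uniform(-math.pi, math.pi)
    lhs = math.cos(t1 + t2)
    rhs = math.cos(t1) * math.cos(t2) + math.sin(t1) * (-math.sin(t2))
    assert abs(lhs - rhs) < 1e-9
print("angle-addition identity verified on 1000 random pairs")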
 
2
Convolution layers are appropriately padded to preserve sizes.
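
As a minimal illustration of such "same" padding (assuming stride-1 convolutions and odd kernel sizes, which this page does not state), padding of (k - 1) // 2 on each side keeps the spatial dimensions unchanged:

import torch
import torch.nn as nn

# With stride 1 and an odd kernel size k, padding (k - 1) // 2 preserves H x W.
x = torch.randn(1, 4, 10, 10)  # e.g. a 10x10 grid encoded with 4 channels
conv = nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3, padding=1)
y = conv(x)
assert y.shape[-2:] == x.shape[-2:]  # height and width stay 10x10
print(tuple(x.shape), "->", tuple(y.shape))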
 
8
The planners and NNs were given 10 minutes to solve each maze instance.
 
9
The planners and NNs were given 10 minutes to solve each maze instance.
 
Metadata
Title
Heuristic Search Optimisation Using Planning and Curriculum Learning Techniques
Authors
Leah Chrestien
Tomáš Pevný
Stefan Edelkamp
Antonín Komenda
Copyright Year
2023
DOI
https://doi.org/10.1007/978-3-031-49008-8_39
