Parallel Strategies for Best-First Generalized Planning
We applied parallel search techniques to Best-First Generalized Planning (BFGP), a Generalized Planning algorithm based on heuristic search.

This was my Bachelor’s Thesis project, developed at the School of Engineering of Universitat Pompeu Fabra under the supervision of Javier Segovia-Aguas, from the Artificial Intelligence and Machine Learning Research Group. If you are interested in the project, you can also check the full thesis document. Below is a copy of the paper presented at the Highlights of Reasoning about Actions, Planning and Reactive Synthesis workshop.
Abstract
In recent years, there has been renewed interest in closing the performance gap between state-of-the-art planning solvers and generalized planning (GP), a research area of AI that studies the automated synthesis of algorithmic-like solutions capable of solving multiple classical planning instances. One of the current advancements has been the introduction of Best-First Generalized Planning (BFGP), a GP algorithm based on a novel solution space that can be explored with heuristic search, one of the foundations of modern planners. This paper evaluates the application of parallel search techniques to BFGP, another critical component in closing the performance gap. We first discuss why BFGP is well suited for parallelization and some of its differentiating characteristics from classical planners. Then, we propose two simple shared-memory parallel strategies with good scaling with the number of cores.
Introduction
Generalized planning (GP) has been a longstanding area of research in Artificial Intelligence (AI) (Jiménez et al., 2019; Srivastava et al., 2011). The foundation of GP is automated planning, which studies how to construct sequences of actions (commonly known as plans) to go from a specific initial state to a goal (Ghallab et al., 2016). Since planning is a hard problem (PSPACE-complete) (Baz et al., 2021), solving multiple problem instances from the same domain is computationally expensive. Given this realization, GP studies the synthesis of general plans that can solve multiple problem instances from the same domain, reducing the computational complexity to a one-time up-front cost (Hu & De Giacomo, 2011; Srivastava, 2011).
The state-of-the-art planning solvers are often heuristic-based planners (Taitler et al., 2024). These planners guide the combinatorial search of reaching a goal state from the initial state with heuristics, usually based on computing the cost of solving a relaxed plan as an estimate of the actual cost of reaching the solution (Bonet & Geffner, 2001). Given the success of heuristic search in planning, Segovia-Aguas et al. (Segovia-Aguas et al., 2021, 2024) propose a heuristic-based approach to generalized planning, which they call Best-First Generalized Planning (BFGP). BFGP leverages a set of novel GP native heuristics and a new solution space, independent of the number of input instances, to compute general algorithmic solutions.
Parallel programming is tightly coupled with AI’s recent success, as CPU manufacturers have transitioned to multi-core processors due to single-core performance stagnation (Kishimoto et al., 2009). As a consequence, there has been a notable effort to parallelize fundamental algorithms like Best-First Search (BFS), and these techniques have already been successfully applied to planning domains (Baz et al., 2021; Kishimoto et al., 2009; Kuroiwa & Fukunaga, 2021). One of the most impactful algorithms has been Hash Distributed A* (HDA*) (Kishimoto et al., 2009), which efficiently handles the two most challenging communication overheads of Parallel BFS: distributing search nodes and keeping a consistent closed list of reached nodes. To achieve this, they use a hash function to assign each state to a unique process (which allows local duplicate detection) and asynchronous communication.
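The core idea behind HDA*’s workload distribution can be sketched in a few lines. The sketch below is our own illustration, not HDA*’s actual implementation: a hash function maps every state to a single owning worker, so all duplicates of a state arrive at the same inbox and duplicate detection needs no shared data structure.

```python
# Illustrative sketch of hash-based work distribution (HDA*-style):
# every state is owned by exactly one worker (hash(state) mod n),
# so each worker checks duplicates against a purely local closed set.
from collections import deque

N_WORKERS = 4

def owner(state):
    """The worker responsible for `state`; all duplicates of a state
    hash to the same owner, so duplicate detection stays local."""
    return hash(state) % N_WORKERS

# One inbox and one closed set per worker; no shared closed list needed.
inboxes = [deque() for _ in range(N_WORKERS)]
closed = [set() for _ in range(N_WORKERS)]

def send(state):
    """Route a generated state to the worker that owns it."""
    inboxes[owner(state)].append(state)

def drain(worker_id):
    """Process a worker's inbox, keeping only states not seen locally."""
    fresh = []
    inbox, seen = inboxes[worker_id], closed[worker_id]
    while inbox:
        s = inbox.popleft()
        if s not in seen:
            seen.add(s)
            fresh.append(s)
    return fresh
```

In the real algorithm the sends are asynchronous messages between processes; the point of the sketch is only that ownership makes the closed list local.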
While using parallel techniques for classical planners is an active research topic, applications to generalized planning are much less explored. In this paper, we discuss how the BFGP algorithm can be easily parallelized by design. Furthermore, we present two shared-memory parallelization strategies that can scale linearly with the number of cores.
Suitability of Best-First Generalized Planning for parallelization
A classical planning problem (Haslum et al., 2019) is defined by a set of state variables, a set of deterministic actions, an initial state, and a goal condition. GP is formalized as the problem of finding an algorithm-like solution, called a planning program, that solves a finite set of classical planning problems from the same domain. A planning program is a sequence of n program lines, where each line i contains one of the following instructions:

- A planning action w ∈ A, where A is the set of deterministic functions of the planning domain.
- A goto instruction go(i', ¬y), where i' is a program line such that 0 ≤ i' ≤ n − 1 and i' ≠ i + 1, and y is a proposition. Proposition y can be the result of an arbitrary expression on state variables.
- A termination instruction end. The last instruction of a planning program is always a termination instruction.

A planning program solves a GP problem if and only if, for every input instance, executing the program from its initial state reaches a goal state.
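To make the execution model concrete, here is a minimal interpreter for such programs (the instruction encoding is our own illustration, not BFGP’s internal representation):

```python
# Minimal planning-program interpreter. A program is a list of lines:
#   ("act", f)          -- a planning action: a function from state to state
#   ("goto", i2, cond)  -- jump to line i2 when cond(state) is False,
#                          fall through to the next line otherwise
#   ("end",)            -- terminate and return the current state
# This encoding is illustrative, not BFGP's.
def run(program, state, max_steps=1000):
    line = 0
    for _ in range(max_steps):
        instr = program[line]
        if instr[0] == "end":
            return state
        if instr[0] == "goto":
            _, target, cond = instr
            line = line + 1 if cond(state) else target
        else:  # a planning action
            state = instr[1](state)
            line += 1
    raise RuntimeError("program did not terminate")
```

For example, a three-line program that decrements a counter and jumps back while it is nonzero computes a loop, which is what lets one program solve instances of any size.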
In this work, we use the generalized planner BFGP (Segovia-Aguas et al., 2024), which has shown good performance for computing planning programs via heuristic search. BFGP is a frontier-search algorithm (Korf et al., 2005), which means that it does not repeat states during the search (i.e., all expanded planning programs are different). This unique characteristic eliminates the need to maintain a closed list of states to prevent search loops. Notably, this simplifies the algorithm’s parallelization, as duplicate detection is a significant source of synchronization and communication overheads in parallel BFS.
Another relevant detail about BFGP is that it is a Greedy Best-First Search (GBFS). Unlike A*, GBFS is guided by an evaluation function f(n) = h(n) that considers only the heuristic estimate of the remaining cost, ignoring the accumulated cost g(n); this trades optimality guarantees for faster solution finding.
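In code, the difference from A* is just the priority key. A minimal GBFS over an abstract search problem (the problem interface names here are illustrative):

```python
# Greedy Best-First Search: the open list is ordered purely by the
# heuristic h(n); the accumulated path cost g(n) is ignored (unlike A*,
# which orders the open list by g(n) + h(n)).
import heapq
from itertools import count

def gbfs(start, h, successors, is_goal):
    tie = count()  # tie-breaker so equal-h entries never compare nodes
    open_list = [(h(start), next(tie), start)]
    seen = {start}
    while open_list:
        _, _, node = heapq.heappop(open_list)
        if is_goal(node):
            return node
        for child in successors(node):
            if child not in seen:
                seen.add(child)
                heapq.heappush(open_list, (h(child), next(tie), child))
    return None
```

Note the `seen` set here plays the role of a closed list; as discussed above, BFGP’s frontier search does not need one.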
Parallel Best-First Generalized Planning
This section presents our two strategies for parallelizing BFGP and evaluates their performance on nine generalized planning domains.
Parallel strategy #1. It sequentially expands nodes until there are at least as many nodes in the open list as threads; each thread then continues the search independently on its own disjoint subset of the frontier, without further communication. Table 1 evaluates the scaling of this strategy.
Table 1. Execution times in seconds for parallel strategy #1; columns to the right correspond to increasing numbers of threads.

Domain | Execution time (s) | | | | |
---|---|---|---|---|---|
Corridor | 1212.37 | 619.8 | 296.2 | 15.24 | 12.37 |
Fibonacci | 0.46 | 0.24 | 0.13 | 0.08 | 0.03 |
Find | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Gripper | 12.92 | 6.72 | 3.16 | 1.28 | 0.59 |
Reverse | 0.08 | 0.04 | 0.02 | 0.01 | 0.01 |
Select | 0.02 | 0.02 | 0.02 | 0.01 | 0.01 |
Sorting | 0.01 | 0.01 | 0.01 | 0.01 | 0.01 |
Triangular Sum | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Visitall | 338.14 | 227.85 | 157.91 | 114.91 | 79.48 |
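Our reading of strategy #1 can be sketched as follows. Python threads are used purely to show the structure (they give no real speedup under the GIL), and the frontier-splitting policy and `search_fn` interface are our simplifications:

```python
# Sketch of parallel strategy #1 (simplified): expand nodes sequentially
# until the open list holds at least one node per thread, then hand each
# thread a disjoint slice of the frontier to search independently.
import heapq
import threading

def expand_until(frontier, successors, n_threads):
    """Sequential phase: grow the frontier (a heap) to >= n_threads nodes."""
    while len(frontier) < n_threads:
        node = heapq.heappop(frontier)
        for child in successors(node):
            heapq.heappush(frontier, child)
    return frontier

def parallel_phase(frontier, n_threads, search_fn, results):
    """Parallel phase: each thread searches its own slice of the frontier."""
    slices = [frontier[i::n_threads] for i in range(n_threads)]
    threads = [
        threading.Thread(target=lambda i=i: results.append(search_fn(slices[i])))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the threads never exchange nodes after the split, the only synchronization cost is collecting the results, which matches the strategy’s consistent scaling as threads are added.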
Parallel strategy #2. In this strategy, threads distribute promising nodes during the parallel search phase, so there is a tradeoff between searching the most promising states and minimizing the communication overhead. In our solution, we compute the cost-to-go value of each generated node, and if it is equal to or better than that of the last expanded node, we send the new node to another thread. In contrast to HDA*, where a hash function determines which process receives a node, we simply cycle between all threads. This approach is viable because BFGP does not need to perform duplicate detection. Table 2 evaluates the scaling of this strategy.
Table 2. Execution times in seconds for parallel strategy #2; columns to the right correspond to increasing numbers of threads.

Domain | Execution time (s) | | |
---|---|---|---|
Corridor | 768.89 | 442.1 | 40.81 |
Fibonacci | 0.13 | 0.1 | 0.05 |
Find | 0.0 | 0.0 | 0.0 |
Gripper | 5.7 | 2.88 | 3.96 |
Reverse | 0.04 | 0.02 | 0.02 |
Select | 0.02 | 0.02 | 0.01 |
Sorting | 0.01 | 0.01 | 0.01 |
Triangular Sum | 0.0 | 0.0 | 0.0 |
Visitall | 118.82 | 113.16 | 20.84 |
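The round-robin handoff at the heart of strategy #2 can be illustrated with a single-threaded simulation (the function names and exact sharing test are our simplification):

```python
# Sketch of strategy #2's node distribution (single-threaded simulation):
# a generated node whose cost-to-go is equal to or better than that of
# the last expanded node is handed to the next thread in round-robin
# order; otherwise it stays on the local open list. Because BFGP never
# re-generates a program, no duplicate check is needed on arrival.
from itertools import cycle

def distribute(generated, last_expanded_h, h, n_threads):
    """Split generated nodes into (sent_per_thread, kept_locally)."""
    next_thread = cycle(range(n_threads))
    sent = [[] for _ in range(n_threads)]
    kept = []
    for node in generated:
        if h(node) <= last_expanded_h:      # promising: share it
            sent[next(next_thread)].append(node)
        else:                               # keep for the local open list
            kept.append(node)
    return sent, kept
```

Cycling through receivers keeps the distribution balanced without any hashing, which is only safe because duplicate detection is unnecessary in BFGP.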
Discussion
The first strategy performs very well, with speedups ranging from ~4x to ~98x in the most complex domains, and increasing the number of threads always results in better performance. On the other hand, no parallel strategy strictly dominates: in some domains (like Visitall), the second strategy achieves better scaling and performance than the first, while in others it yields slower execution times. We believe that better prioritization of promising nodes and the use of asynchronous communication would help the second strategy outperform the first. To conclude, our results show that BFGP is well-suited for parallelization, and further developments could make it capable of handling more complex problems from IPC planning domains (Taitler et al., 2024).