Learning High-Level Planning from Text

 S.R.K. Branavan, Nate Kushman, Tao Lei, Regina Barzilay



Comprehending action preconditions and effects is an essential step in modeling the dynamics of the world. In this paper, we express the semantics of precondition relations extracted from text in terms of planning operations. The challenge of modeling this connection is grounding language at the level of relations. This type of grounding enables us to create high-level plans based on language abstractions. Our model jointly learns to predict precondition relations from text and to perform high-level planning guided by those relations. We implement this idea in the reinforcement learning framework using feedback automatically obtained from plan execution attempts. When applied to a complex virtual world and text describing that world, our relation extraction technique performs on par with a supervised baseline, yielding an F-measure of 66% compared to the baseline's 65%. Additionally, we show that a high-level planner utilizing these extracted relations significantly outperforms a strong, text-unaware baseline as measured by completed plans, successfully completing 80% of plans compared to 69% for the baseline.

Experimental Framework

Our method performs high-level planning guided by information extracted from text documents. These high-level plans are then converted to executable low-level plans using a standard classical planning algorithm, specifically Metric-FF. Depending on the characteristics of the low-level planning task, the computation time required by Metric-FF can be long. To keep experiment run-times manageable, we employ a distributed farm of Metric-FF instances, and cache the low-level plans returned by Metric-FF to avoid redundant recomputation. The following diagram shows the experimental framework; its individual components are described below:

Language-Aware Planner

This is our language-aware high-level planning algorithm.

Feature Computation

This is a pre-processing step. It uses a simple text-overlap heuristic to identify sentences that might mention multiple objects in the world, and converts the text of each such sentence into the feature representation used by our method. Our model operates on this representation to identify valid precondition relations between objects as described in the text.
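The sentence-selection part of this step can be illustrated with a minimal sketch. It assumes that "text overlap" means every word of an object's name appears in the sentence; the actual heuristic in the released code may differ. The function names and the example object names are hypothetical, and the conversion to the model's feature representation is not shown.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mentioned_objects(sentence, object_names):
    """Return the objects whose name tokens all occur in the sentence.

    Assumption: word overlap is used as the matching rule; this is an
    illustrative stand-in, not the heuristic from the released code.
    """
    words = set(tokenize(sentence))
    found = []
    for name in object_names:
        name_tokens = set(tokenize(name))
        if name_tokens and name_tokens <= words:
            found.append(name)
    return found


def candidate_sentences(sentences, object_names, min_objects=2):
    """Keep sentences that mention at least `min_objects` distinct objects."""
    selected = []
    for sentence in sentences:
        objects = mentioned_objects(sentence, object_names)
        if len(objects) >= min_objects:
            selected.append((sentence, objects))
    return selected


if __name__ == "__main__":
    # Hypothetical object names and sentences, for illustration only.
    objects = ["wood", "pickaxe", "iron ore", "furnace"]
    text = [
        "Wood is needed to craft a pickaxe.",
        "The sky is blue.",
    ]
    for sentence, found in candidate_sentences(text, objects):
        print(found, "<-", sentence)
```

Only sentences that pass this filter would go on to feature extraction, which keeps the model from scoring sentences that cannot describe a relation between two objects.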

FF-Plan Cache

The FF-Plan Cache serves as the interface between the high-level planning algorithms and the low-level planner (Metric-FF). In this role, it performs three specific functions. First, it caches each planning problem sent to Metric-FF, along with the corresponding output produced by Metric-FF. Subsequent calls to the cache with the same planning problem thus do not require the execution of Metric-FF. Second, it allows the high-level planner to test multiple low-level plans in parallel by distributing the low-level planning tasks to many instances of Metric-FF running on several machines. Third, the cache allows multiple high-level planners to simultaneously operate off a single pool of low-level planners. These functions significantly reduce experiment run-times.
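The caching and parallel-dispatch behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the released implementation: the class `PlanCache`, its methods, and the `solver` callable are hypothetical names, a local thread pool stands in for Metric-FF instances running on several machines, and problems are assumed to be identified by a hash of their text.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


class PlanCache:
    """Memoizes planner outputs, keyed by a hash of the planning problem text."""

    def __init__(self, solver, max_workers=4, cache_only=False):
        # `solver` is a hypothetical callable: problem text -> planner output.
        self._solver = solver
        self._max_workers = max_workers
        self._cache_only = cache_only  # if True, never invoke the solver
        self._store = {}
        self._lock = threading.Lock()
        self.solver_calls = 0

    @staticmethod
    def _key(problem_text):
        return hashlib.sha256(problem_text.encode("utf-8")).hexdigest()

    def solve(self, problem_text):
        """Return the cached output, or run the solver once and store the result."""
        key = self._key(problem_text)
        with self._lock:
            if key in self._store:
                return self._store[key]
        if self._cache_only:
            return None  # no cached response and the solver is disabled
        output = self._solver(problem_text)
        with self._lock:
            self._store[key] = output
            self.solver_calls += 1
        return output

    def solve_many(self, problems):
        """Solve a batch in parallel, running each distinct problem at most once."""
        unique = list(dict.fromkeys(problems))
        with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
            outputs = dict(zip(unique, pool.map(self.solve, unique)))
        return [outputs[p] for p in problems]


if __name__ == "__main__":
    # A stand-in solver; a real setup would invoke the low-level planner here.
    cache = PlanCache(lambda problem: "plan-for:" + problem)
    print(cache.solve_many(["task-a", "task-b", "task-a"]))
    print("solver calls:", cache.solver_calls)
```

Because every planner shares one cache object, a problem already solved for one high-level planner is available to the others without another solver run.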

Cache Client

This script manages the execution of Metric-FF on given low-level planning tasks.


  1. Complete archive
     This archive contains all of the code and configurations, packaged for ease of compilation, with makefile include/library paths set as necessary. Data and annotation files are also included. Please refer to the readme file in the archive for further details.
     [ code & data ]
  2. Sample FF plan cache (this is a large, 1GB file)
     This archive contains a sample cache of FF plan responses that can be used with the FF-Plan Cache. Plans for which cached responses exist don't require Metric-FF to be executed. Thus this cache allows for quick (and approximate) experimentation without the resource overheads of running Metric-FF. Please refer to this readme file for an explanation of how to run the model in cache-only mode. Note that model performance in this mode is only indicative of real performance.
     [ plan cache ]


The planning task definitions used in the experiments are available in PDDL format from the link below:

       Planning task definitions