Learning to Win by Reading Manuals in a Monte-Carlo Framework

 S.R.K. Branavan, David Silver, Regina Barzilay



This paper presents a novel approach for leveraging automatically extracted textual knowledge to improve the performance of control applications such as games. Our ultimate goal is to enrich a stochastic player with high-level guidance expressed in text. Our model jointly learns to identify text that is relevant to a given game state in addition to learning game strategies guided by the selected text. Our method operates in the Monte-Carlo search framework, and learns both text analysis and game strategies based only on environment feedback. We apply our approach to the complex strategy game Civilization II using the official game manual as the text guide. Our results show that a linguistically-informed game-playing agent significantly outperforms its language-unaware counterpart, yielding a 27% absolute improvement and winning over 78% of games when playing against the built-in AI of Civilization II.

This work was supported by the NSF.

Experimental Framework

Because our method operates in the Monte-Carlo search framework, it must be able to play "simulated games" in addition to the game it directly controls. In our experiments, these simulations are run on multiple instances of the FreeCiv game: one instance plays the actual game, while the rest are used for simulated game play. Figure 1 shows the entire experimental framework; its individual components are described below.
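
As a rough illustration of why simulated games are needed, the sketch below picks an action by averaging the outcomes of several simulated games per candidate action. It is a plain flat Monte-Carlo search, not the algorithm from the paper; the names choose_action and toy_simulate, the action labels and the scores are all hypothetical.

```python
import random
from typing import Callable, Dict, Hashable, Sequence


def choose_action(
    state: Hashable,
    actions: Sequence[str],
    simulate: Callable[[Hashable, str, random.Random], float],
    rollouts_per_action: int,
    rng: random.Random,
) -> str:
    """Return the action with the highest mean simulated outcome."""
    mean_score: Dict[str, float] = {}
    for action in actions:
        total = 0.0
        for _ in range(rollouts_per_action):
            # Each call stands in for one simulated game that starts from
            # the current state of the primary game.
            total += simulate(state, action, rng)
        mean_score[action] = total / rollouts_per_action
    return max(actions, key=lambda a: mean_score[a])


def toy_simulate(state: Hashable, action: str, rng: random.Random) -> float:
    """Hypothetical stand-in for a simulated game: a noisy score per action."""
    base = {"build_city": 1.0, "explore": 0.5, "fortify": 0.2}[action]
    return base + rng.uniform(-0.1, 0.1)
```

In the real framework, the role of toy_simulate would be played by a FreeCiv game instance rather than a lookup table.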

Figure 1. The complete framework used in our experiments. The green box labelled "Monte-Carlo Player" represents the various algorithms and baseline methods presented in the paper. Due to the software architecture of the FreeCiv game code, each game instance actually consists of a pair of processes: a Game Server and a Game GUI Client. One game instance is the primary game, which the Monte-Carlo Player attempts to play and win. Eight other game instances are used for simulated game play. All of these processes, including the Monte-Carlo Player, are started, monitored and stopped by an Experiment Manager process.

Game Server and GUI Client

The Game Server is the standard game server from the version 2.2 FreeCiv code distribution (i.e., the freeciv-server binary). The code of the game server was modified to reduce CPU and memory usage, and to fix bugs that caused the game to crash. To the best of our knowledge, however, these changes affect neither the functionality of the game server nor the game rules.

The Game GUI Client is also the standard Gtk GUI client from the version 2.2 FreeCiv code distribution. The client code was significantly modified to remove all GUI elements (again for performance reasons only), and to allow the client to be remotely controlled via a socket connection. This remote-control functionality allows an external process (e.g., the Monte-Carlo Player) to access the information presented to human players on the game GUI, and to execute game actions in place of a human player.

The minimal requirement for testing the Monte-Carlo Player is one primary game and one simulation. However, the majority of experimental time is taken up by simulated game play, so running multiple game simulations in parallel significantly reduces the wall-clock time needed to run an experiment. There is a trade-off: more simulations reduce the wall-clock time spent on simulated play, but they also increase the multi-threading overhead in the Monte-Carlo Player. In our experiments, we found eight game simulations to be the most time-efficient configuration on a Core i7 CPU with four hyper-threaded cores.
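
The following sketch shows one way a player could spread rollouts over a fixed number of simulation instances. It is an assumption about how such dispatch might look, not the framework's actual code: run_rollouts and the rollout callback are hypothetical names, and a queue of idle simulator ids ensures that no simulator is handed two rollouts at once.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_rollouts(
    rollout: Callable[[int, int], float],
    n_rollouts: int,
    n_simulators: int,
) -> List[float]:
    """Run n_rollouts with at most n_simulators in flight; keep result order."""
    idle: "queue.Queue[int]" = queue.Queue()
    for sim_id in range(n_simulators):
        idle.put(sim_id)

    def task(rollout_index: int) -> float:
        sim_id = idle.get()  # claim an idle simulator
        try:
            return rollout(sim_id, rollout_index)
        finally:
            idle.put(sim_id)  # hand it back for the next rollout

    # One worker thread per simulator, so idle.get() never waits forever.
    with ThreadPoolExecutor(max_workers=n_simulators) as pool:
        return list(pool.map(task, range(n_rollouts)))
```

With n_simulators set to 8 this mirrors the configuration reported above; the best value on other hardware would have to be measured.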

Communications between the Monte-Carlo Player, the Game GUI Client and the Game Server are effected via socket connections. While the code of all three processes supports both TCP/IP sockets and UNIX sockets, the default configuration for the framework is to use UNIX sockets for performance reasons.
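
To make the two transport options concrete, here is a minimal sketch of a client-side helper that connects over either kind of socket. The helper name connect and the convention of passing a string for a UNIX socket path and a (host, port) pair for TCP/IP are assumptions made for this example; the actual addresses and wire protocol used by the framework are not shown here.

```python
import socket
from typing import Tuple, Union

# A str is treated as a UNIX socket path, a (host, port) pair as TCP/IP.
Address = Union[str, Tuple[str, int]]


def connect(address: Address) -> socket.socket:
    """Open a stream connection over a UNIX socket or a TCP/IP socket."""
    if isinstance(address, str):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    else:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect(address)
    except OSError:
        sock.close()  # do not leak the descriptor on a failed connect
        raise
    return sock
```

Because both branches return an ordinary stream socket, the rest of the calling code would not need to know which transport is in use.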

Monte-Carlo Player

The Monte-Carlo Player box stands for any one of the game-playing algorithms and baseline systems described in the paper. In any particular experiment, the algorithm that we wish to test takes the place of the Monte-Carlo Player.

In-memory File System

In the general Monte-Carlo search framework, game actions are selected by playing simulated games starting from the current actual game state. This requires the simulations to be initialized with the current state of the actual game, i.e., we need a way to transfer the current state of the primary game to the simulations. In our experimental framework, this is achieved via the game save/load functionality of FreeCiv: the current state of the primary game is written to a save-game file, which is then loaded into the simulations. For performance reasons, this save-game file is written to an in-memory file system; we use /dev/shm, the in-memory file system built into GNU/Linux, for this purpose.
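
The file-based hand-off described above can be sketched as follows. In the real framework FreeCiv itself writes and reads the save-game file; here plain bytes stand in for it, and the function names, the fallback to the regular temporary directory, and the write-then-rename step are all assumptions added for illustration.

```python
import os
import tempfile
from pathlib import Path


def scratch_dir() -> Path:
    """Prefer the in-memory /dev/shm; fall back to the normal temp dir."""
    shm = Path("/dev/shm")
    if shm.is_dir() and os.access(shm, os.W_OK):
        return shm
    return Path(tempfile.gettempdir())


def publish_snapshot(data: bytes, target: Path) -> None:
    """Write a snapshot so that readers never see a half-written file."""
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_bytes(data)
    os.replace(tmp, target)  # atomic rename within one file system


def load_snapshot(target: Path) -> bytes:
    """Read the snapshot back, as each simulation would before a rollout."""
    return target.read_bytes()
```

A typical use would call publish_snapshot once per decision in the primary game and load_snapshot once per simulation before it starts playing.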

Experiment Manager

A typical experiment using eight game simulations consists of 18 FreeCiv processes (nine game instances, each a server and client pair) plus the algorithm under test, making manual management of experiments cumbersome. The Experiment Manager is a simple process that eases the scripting and management of sequences of experiments. Except during debugging, the Experiment Manager is the recommended way to run experiments.
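
The process count follows directly from the architecture: every game instance is a server and client pair, and there is one primary instance plus one instance per simulation. The short sketch below enumerates them; the function name and the process labels are hypothetical and are not taken from the Experiment Manager's code.

```python
from typing import List


def freeciv_processes(n_simulations: int) -> List[str]:
    """List the FreeCiv processes for one primary game plus n simulations."""
    instances = ["primary"] + [f"sim{i}" for i in range(n_simulations)]
    # Each game instance is a pair of processes: a server and a client.
    return [
        f"{name}-{role}" for name in instances for role in ("server", "client")
    ]


# Eight simulations plus the primary game give 2 * (1 + 8) = 18 processes.
print(len(freeciv_processes(8)))
```

The minimal configuration of one primary game and one simulation gives four FreeCiv processes by the same count.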


  1. Complete code archive
This archive contains all of the code and configurations listed below, packaged for ease of compilation, with makefile include/library paths set as necessary. Data and annotation files are also included. Please refer to the readme file in the archive for further details.

[ code ]
[Updated 2013/05/10]
  2. Complete runtime archive
This archive contains a complete runtime environment. Please refer to the readme file in the archive for further details.

[ runtime ]
[Updated 2013/05/10]
  3. Virtual machine setup
This archive contains a pre-configured runtime environment in a VMware virtual machine. Please refer to the readme file in the archive for further details.

NOTE: This is a large (1.1GB) file.
[ virtual machine ]


The datasets used in this work are available in text format from the link below:

       Civilization II game manual text.