Global Models of Document Structure Using Latent Permutations

 Harr Chen, S.R.K. Branavan, Regina Barzilay, David R. Karger.

Code

The source code for this work can be downloaded from the link below. The code has been tested on both Windows and Linux 64-bit environments. Although the code must be run from MATLAB, most of the functionality is written in C for performance reasons. As a first step, therefore, sample_model.c needs to be compiled from within MATLAB using the command:

        mex -largeArrayDims sample_model.c

Model inference can then be performed with the run_inference() command, whose parameters are explained in run_inference.m.
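
The setup steps above can be wrapped in a short MATLAB script. The following is only a minimal sketch, not part of the distributed code: it assumes that the current MATLAB directory is the one containing sample_model.c and run_inference.m, and that run_inference.m documents its parameters in its leading comment block (which is what the help command prints). It deliberately does not call run_inference() itself, since the arguments are described only in run_inference.m.

```matlab
% Hypothetical setup script; run it from the directory that
% contains sample_model.c and run_inference.m (an assumption).
src = 'sample_model.c';

% Stop early with a clear message if the C source is not here.
if isempty(dir(src))
    error('Cannot find %s in %s.', src, pwd);
end

% Same compile command as given above, in function form.
mex('-largeArrayDims', src);

% mexext returns the platform-specific MEX extension
% (for example mexa64 on 64-bit Linux, mexw64 on 64-bit Windows).
mexFile = ['sample_model.' mexext];
if isempty(dir(mexFile))
    error('Compilation finished but %s was not created.', mexFile);
end
fprintf('Compiled %s\n', mexFile);

% Print the leading comments of run_inference.m, where the
% parameters of run_inference() are explained.
help run_inference
```

Checking for the compiled file right after the mex call makes a missing or misconfigured C compiler show up at setup time rather than on the first call to run_inference().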

[ Source code for model ]

Data

The four datasets used in this work can be downloaded from the links below. The gold-standard annotations used in the evaluations are included with the data files.

English Wikipedia articles about the major cities of the world
French Wikipedia articles about the major cities of the world
English Wikipedia articles about chemical elements
Cellphone reviews from phonearena.com