Abstract
While paraphrasing is critical for both the interpretation and the
generation of natural language, current systems use manual or semi-automatic
methods to collect paraphrases. We present an unsupervised learning
algorithm for identification of paraphrases from a corpus of multiple
English translations of the same source text. Our approach yields
phrasal and single-word lexical paraphrases as well as syntactic
paraphrases.
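
To make the setting concrete, the following is a minimal sketch of how candidate paraphrases might be read off two translations of the same sentence. It is not the algorithm of this work: it assumes the sentences are already aligned, and it simply treats any stretch where the two translations differ, between stretches of identical words, as a candidate pair. The function name `candidate_paraphrases` and the example sentences are hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def candidate_paraphrases(sent_a, sent_b):
    """Return (phrase_a, phrase_b) pairs where two aligned sentences differ.

    Illustrative assumption: words shared by both translations act as
    anchors, and the differing spans between anchors are paraphrase
    candidates. This is a simplification, not the method of the paper.
    """
    a = sent_a.lower().split()
    b = sent_b.lower().split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" marks a span of a that is substituted by a span of b.
        if tag == "replace":
            pairs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return pairs


if __name__ == "__main__":
    first = "Emma burst into tears and he tried to comfort her"
    second = "Emma cried and he tried to console her"
    print(candidate_paraphrases(first, second))
    # [('burst into tears', 'cried'), ('comfort', 'console')]
```

Even this naive diff surfaces both a phrasal pair and a single-word pair. A real system would still need sentence alignment, filtering of spurious pairs, and some way to generalize beyond exact word matches.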
Code
The source code for this work can be downloaded from the link below.
Source code