edu.mit.nlp.segmenter
Class Document

java.lang.Object
  extended by edu.mit.nlp.segmenter.Document
Direct Known Subclasses:
DPDocument

public class Document
extends Object

Keeps track of counts and segments. This class is somewhat redundant with CountsManager; the two should be reconciled.


Field Summary
 double[][] m_words
           
 
Constructor Summary
Document(double[][] sents, int N)
           
 
Method Summary
 double D()
           
 double D2()
           
 int[] getSPs()
           
 double[][] getThetas()
           
 double N()
           
protected  void printDurs()
          Prints the duration of each segment.
 void setPDur(double[] pdur)
           
 double T()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_words

public double[][] m_words
Constructor Detail

Document

public Document(double[][] sents,
                int N)
Parameters:
sents - a matrix representation of the document. There are sents.length sentences, and each row of sents is an array of size W (the size of the vocabulary).
N - the number of segments. It must be specified in advance.
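
The layout of sents described above can be illustrated with a minimal sketch that builds a sentence-by-vocabulary count matrix from tokenized text. The class CountMatrixSketch and its methods are hypothetical helpers, not part of edu.mit.nlp.segmenter, and the assumption that each entry is a raw word count is an illustration rather than something this page states.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of edu.mit.nlp.segmenter.
final class CountMatrixSketch {

    // Assigns each distinct token a column index in order of first appearance.
    static Map<String, Integer> buildVocabulary(List<List<String>> sentences) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (List<String> sentence : sentences) {
            for (String token : sentence) {
                vocab.putIfAbsent(token, vocab.size());
            }
        }
        return vocab;
    }

    // Builds one row per sentence and one column per vocabulary word (W columns).
    // Assumption: entries are raw counts of each word in each sentence.
    static double[][] toCounts(List<List<String>> sentences, Map<String, Integer> vocab) {
        double[][] sents = new double[sentences.size()][vocab.size()];
        for (int t = 0; t < sentences.size(); t++) {
            for (String token : sentences.get(t)) {
                sents[t][vocab.get(token)] += 1.0;
            }
        }
        return sents;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("the", "cat", "sat"),
            Arrays.asList("the", "dog", "sat", "sat"));
        Map<String, Integer> vocab = buildVocabulary(sentences);
        double[][] sents = toCounts(sentences, vocab);
        System.out.println(Arrays.deepToString(sents));
        // With the segmenter on the classpath, the matrix would be passed as:
        // Document doc = new Document(sents, 2);
    }
}
```

The number of segments (2 in the commented call) is an arbitrary value chosen for the example.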
Method Detail

printDurs

protected void printDurs()
Prints the duration of each segment. Its original purpose is not documented.


N

public double N()
Returns:
the number of segments

T

public double T()
Returns:
the number of sentences

D

public double D()
Returns:
the number of words in the vocabulary

D2

public double D2()
Returns:
the number of words with non-zero counts
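
The difference between D() and D2() can be sketched as follows. This page does not show how either value is computed, so the code assumes D is the number of columns in the matrix and D2 is the number of columns whose counts, summed over all sentences, are positive. VocabularyStatsSketch is a hypothetical class, not part of the package.

```java
// Hypothetical helper, not part of edu.mit.nlp.segmenter.
final class VocabularyStatsSketch {

    // Assumption: the vocabulary size equals the row length of the matrix.
    static int vocabularySize(double[][] sents) {
        return sents.length == 0 ? 0 : sents[0].length;
    }

    // Counts vocabulary words whose total count over all sentences is positive.
    static int nonZeroWords(double[][] sents) {
        int vocabSize = vocabularySize(sents);
        int nonZero = 0;
        for (int w = 0; w < vocabSize; w++) {
            double total = 0.0;
            for (double[] row : sents) {
                total += row[w];
            }
            if (total > 0.0) {
                nonZero++;
            }
        }
        return nonZero;
    }

    public static void main(String[] args) {
        double[][] sents = { {1, 0, 0}, {2, 0, 1} };
        System.out.println(vocabularySize(sents) + " words, "
            + nonZeroWords(sents) + " with non-zero counts");
    }
}
```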

getSPs

public int[] getSPs()
Returns:
a vector of the segmentation points
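
As an illustration of how segmentation points relate to segment durations, the sketch below converts a vector of points into per-segment lengths. The indexing convention is not stated on this page; the code assumes each point is the exclusive end index of a segment (so the last point equals the number of sentences). SegmentDurationsSketch is a hypothetical class, not part of the package.

```java
import java.util.Arrays;

// Hypothetical helper, not part of edu.mit.nlp.segmenter.
final class SegmentDurationsSketch {

    // Assumption: ends[i] is the exclusive end index (in sentences) of segment i,
    // and the values are strictly increasing.
    static int[] durations(int[] ends) {
        int[] durs = new int[ends.length];
        int start = 0;
        for (int i = 0; i < ends.length; i++) {
            durs[i] = ends[i] - start; // number of sentences in segment i
            start = ends[i];
        }
        return durs;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(durations(new int[] {3, 5, 9})));
    }
}
```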

setPDur

public void setPDur(double[] pdur)

getThetas

public double[][] getThetas()
Returns:
the MAP (maximum a posteriori) estimate of the language model for each segment
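
A MAP estimate of a segment's language model can be sketched under an assumed prior. This page does not say which prior the class uses; the code below assumes a symmetric Dirichlet prior with parameter alpha >= 1, for which the mode of the posterior is (n_w + alpha - 1) / (n + W * (alpha - 1)). MapEstimateSketch, sumRows, and the alpha parameter are hypothetical and not part of the package.

```java
import java.util.Arrays;

// Hypothetical helper, not part of edu.mit.nlp.segmenter.
final class MapEstimateSketch {

    // Sums the count rows of sentences in [from, to) into one vector per segment.
    static double[] sumRows(double[][] sents, int from, int to) {
        double[] counts = new double[sents[0].length];
        for (int t = from; t < to; t++) {
            for (int w = 0; w < counts.length; w++) {
                counts[w] += sents[t][w];
            }
        }
        return counts;
    }

    // Posterior mode of a multinomial under an assumed symmetric Dirichlet(alpha) prior.
    static double[] mapEstimate(double[] counts, double alpha) {
        if (alpha < 1.0) {
            throw new IllegalArgumentException("this closed form requires alpha >= 1");
        }
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        double denom = total + counts.length * (alpha - 1.0);
        if (denom <= 0.0) {
            throw new IllegalArgumentException("estimate undefined: no counts and alpha == 1");
        }
        double[] theta = new double[counts.length];
        for (int w = 0; w < counts.length; w++) {
            theta[w] = (counts[w] + alpha - 1.0) / denom;
        }
        return theta;
    }

    public static void main(String[] args) {
        double[][] sents = { {1, 1, 0}, {2, 0, 0} };
        double[] theta = mapEstimate(sumRows(sents, 0, 2), 2.0);
        System.out.println(Arrays.toString(theta));
    }
}
```

With alpha equal to 1 the prior is uniform and the result reduces to the maximum-likelihood estimate (relative frequencies).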


Copyright © 2008 MIT. All Rights Reserved.