edu.mit.nlp.segmenter
Class Document
java.lang.Object
edu.mit.nlp.segmenter.Document
- Direct Known Subclasses:
- DPDocument
public class Document
- extends Object
Keeps track of counts and segments. Somewhat redundant with CountsManager; the two should be reconciled.
Constructor Summary
Document(double[][] sents, int N)
Method Summary
double D()
double D2()
int[] getSPs()
double[][] getThetas()
double N()
protected void printDurs()
- print the durations of each segment.
void setPDur(double[] pdur)
double T()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
m_words
public double[][] m_words
Document
public Document(double[][] sents, int N)
- Parameters:
sents - a matrix representation of the document. There are sents.length sentences, and each row of sents is an array of size W (the size of the vocabulary).
N - the number of segments. It's a shame that you have to prespecify it.
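The sents parameter is a sentence-by-vocabulary count matrix. As a minimal sketch of how such a matrix might be built from raw text, the helper below tokenizes on whitespace and assigns vocabulary columns in order of first appearance. The class SentenceMatrixSketch, the method toCounts, and the tokenization rule are all hypothetical and are not part of edu.mit.nlp.segmenter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of edu.mit.nlp.segmenter.
final class SentenceMatrixSketch {

    // Builds a count matrix with one row per sentence and one column per
    // distinct lower-cased token, columns ordered by first appearance.
    static double[][] toCounts(List<String> sentences) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        List<String[]> tokenized = new ArrayList<>();
        for (String s : sentences) {
            String trimmed = s.trim().toLowerCase(Locale.ROOT);
            // Assumption: whitespace tokenization is good enough for a sketch.
            String[] tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
            tokenized.add(tokens);
            for (String t : tokens) {
                vocab.putIfAbsent(t, vocab.size());
            }
        }
        // Every row has width W = vocab.size(), as the constructor expects.
        double[][] sents = new double[tokenized.size()][vocab.size()];
        for (int i = 0; i < tokenized.size(); i++) {
            for (String t : tokenized.get(i)) {
                sents[i][vocab.get(t)] += 1.0;
            }
        }
        return sents;
    }
}
```

The resulting array has the shape described above and could presumably be passed as the first argument of new Document(sents, N), with N chosen by the caller.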
printDurs
protected void printDurs()
- Print the durations of each segment. I forget why this was important to do.
N
public double N()
- Returns:
- the number of segments
T
public double T()
- Returns:
- the number of sentences
D
public double D()
- Returns:
- the number of words in the vocabulary
D2
public double D2()
- Returns:
- the number of words with non-zero counts
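The page does not say how T(), D() and D2() are computed. One plausible reading, sketched below against a plain count matrix, is that T is the number of rows, D is the row width, and D2 is the number of columns whose total count over the document is non-zero. MatrixStatsSketch and its method names are hypothetical, and the code assumes non-negative counts.

```java
// Hypothetical helper for illustration only; not part of edu.mit.nlp.segmenter.
final class MatrixStatsSketch {

    // Number of sentences: one row per sentence.
    static int sentenceCount(double[][] sents) {
        return sents.length;
    }

    // Vocabulary size: the width of a row (0 for an empty document).
    static int vocabularySize(double[][] sents) {
        return sents.length == 0 ? 0 : sents[0].length;
    }

    // Number of vocabulary entries whose summed count over all sentences
    // is non-zero. Assumes counts are non-negative.
    static int nonZeroWords(double[][] sents) {
        int w = vocabularySize(sents);
        int nonZero = 0;
        for (int j = 0; j < w; j++) {
            double total = 0.0;
            for (double[] row : sents) {
                total += row[j];
            }
            if (total != 0.0) {
                nonZero++;
            }
        }
        return nonZero;
    }
}
```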
getSPs
public int[] getSPs()
- Returns:
- a vector of the segmentation points
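The page does not state what convention the segmentation points follow. Assuming, purely for illustration, that each point is the exclusive end index of a segment in sentence order, segment durations of the kind printDurs() reports could be derived as in this sketch. SegmentDurationsSketch is a hypothetical name, and the end-index convention is an assumption.

```java
// Hypothetical helper for illustration only; not part of edu.mit.nlp.segmenter.
final class SegmentDurationsSketch {

    // Converts ascending exclusive end indices into segment lengths.
    // Assumption: segment k covers sentences [endPoints[k-1], endPoints[k]),
    // with the first segment starting at sentence 0.
    static int[] durations(int[] endPoints) {
        int[] durs = new int[endPoints.length];
        int start = 0;
        for (int k = 0; k < endPoints.length; k++) {
            durs[k] = endPoints[k] - start;
            start = endPoints[k];
        }
        return durs;
    }
}
```

Under this convention, end points {3, 5, 9} for a nine-sentence document give durations {3, 2, 4}.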
setPDur
public void setPDur(double[] pdur)
getThetas
public double[][] getThetas()
- Returns:
- the MAP (maximum a posteriori) estimate of the language model for each segment
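The page does not specify the prior behind this estimate. As a hedged sketch, the code below computes the MAP estimate of a multinomial language model per segment under a symmetric Dirichlet prior with parameter alpha > 1, that is (count + alpha - 1) / (total + W * (alpha - 1)). The class MapEstimateSketch, the alpha parameter, and the convention that segments are given as exclusive end indices are all assumptions, not the documented behaviour of getThetas().

```java
// Hypothetical helper for illustration only; not part of edu.mit.nlp.segmenter.
final class MapEstimateSketch {

    // sents: sentence-by-vocabulary counts.
    // endPoints: exclusive end index of each segment (assumed convention).
    // alpha: symmetric Dirichlet parameter; must exceed 1 so the mode exists
    //        and every denominator is positive.
    static double[][] mapThetas(double[][] sents, int[] endPoints, double alpha) {
        if (alpha <= 1.0) {
            throw new IllegalArgumentException("alpha must be > 1");
        }
        int w = sents.length == 0 ? 0 : sents[0].length;
        double[][] thetas = new double[endPoints.length][w];
        int start = 0;
        for (int k = 0; k < endPoints.length; k++) {
            // Accumulate per-word counts for the sentences in segment k.
            double total = 0.0;
            for (int i = start; i < endPoints[k]; i++) {
                for (int j = 0; j < w; j++) {
                    thetas[k][j] += sents[i][j];
                    total += sents[i][j];
                }
            }
            // Posterior mode of a Dirichlet-multinomial model.
            double denom = total + w * (alpha - 1.0);
            for (int j = 0; j < w; j++) {
                thetas[k][j] = (thetas[k][j] + alpha - 1.0) / denom;
            }
            start = endPoints[k];
        }
        return thetas;
    }
}
```

Each returned row sums to 1 and has one entry per vocabulary word, matching the double[][] shape that getThetas() returns.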
Copyright © 2008 MIT. All Rights Reserved.