java.lang.Object
  edu.mit.nlp.segmenter.Document
    edu.mit.nlp.segmenter.dp.DPDocument
public class DPDocument
extends Document

Extends Document with some methods specific to the DP implementation of Bayesian segmentation.
Field Summary | |
---|---|
boolean | m_dcm |
boolean | m_int_counts |
Fields inherited from class edu.mit.nlp.segmenter.Document |
---|
m_words |
Constructor Summary | |
---|---|
DPDocument(double[][] sents, int N, boolean dcm) | |
Method Summary | |
---|---|
FastDigamma | getDigamma() |
FastGamma | getGamma() |
static void | main(String[] argv): Runs a basic unit test. |
protected void | makeCumulCounts(): Builds up the cumulative counts, a representation that facilitates fast computation later. |
double | segDCMGradient(int start, int end, double prior): Computes the gradient of the log-likelihood for a segment, under the DCM model. |
double | segLL(int start, int end, double prior): Computes the log-likelihood of a segment. |
protected double | segLLDCM(int start, int end, double prior): Computes the log-likelihood of a segment under the DCM model. |
double | segLLExp(int start, int end, double logprior): Computes the log-likelihood of a segment, given the log of the prior. |
double | segLLGradientExp(int start, int end, double logprior): Computes the gradient of the log-likelihood for a segment, under the DCM model. |
protected double | segLLMAP(int start, int end, double prior): Computes the log-likelihood of a segment under the MAP language model. |
double | segMAPGradient(int start, int end, double prior): Computes the gradient of the log-likelihood for a segment, under the MAP language model. |
void | setDigamma(FastDigamma fastDigamma): If you have multiple documents, you might want to share the cache for the digamma function across all documents. |
void | setGamma(FastGamma fastGamma): If you have multiple documents, you might want to share the cache for the gamma function across all documents. |
void | setPrior(double prior) |
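The Exp variants above take the log of the prior instead of the prior itself. A plausible reading, which this page does not confirm, is that they evaluate the same quantity at prior = exp(logprior), so that an optimizer can search over an unconstrained value while the prior stays positive. The sketch below shows that reparameterization and the chain rule for its gradient. LogSpaceSketch and both helper methods are hypothetical names, and the toy functions in main stand in for segLL and its gradient.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, not part of the segmenter package. */
final class LogSpaceSketch {
    /** Re-express f(prior) as a function of logprior = log(prior). */
    static DoubleUnaryOperator inLogSpace(DoubleUnaryOperator f) {
        return logPrior -> f.applyAsDouble(Math.exp(logPrior));
    }

    /** Chain rule: d/d(logprior) f(exp(logprior)) = exp(logprior) * f'(exp(logprior)). */
    static DoubleUnaryOperator gradientInLogSpace(DoubleUnaryOperator gradient) {
        return logPrior -> {
            double prior = Math.exp(logPrior);
            return prior * gradient.applyAsDouble(prior);
        };
    }

    public static void main(String[] args) {
        // Toy stand-ins (assumptions): f(a) = a^2 and its derivative f'(a) = 2a.
        DoubleUnaryOperator f = a -> a * a;
        DoubleUnaryOperator df = a -> 2 * a;
        double logPrior = Math.log(3.0);
        System.out.println(inLogSpace(f).applyAsDouble(logPrior));
        System.out.println(gradientInLogSpace(df).applyAsDouble(logPrior));
    }
}
```

Whether segLLGradientExp returns the gradient with respect to logprior or with respect to prior is not stated here, so check the source before relying on either convention.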
Methods inherited from class edu.mit.nlp.segmenter.Document |
---|
D, D2, getSPs, getThetas, N, printDurs, setPDur, T |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public boolean m_dcm
public boolean m_int_counts
Constructor Detail |
---|
public DPDocument(double[][] sents, int N, boolean dcm)
Parameters:
sents - the sentences in the document
N - the number of segments. I forget what you do if this is unknown.
dcm - whether you're using the DCM distribution (marginalizing the LMs). I haven't tested it with this set to false in a long time.
Method Detail |
---|
public void setGamma(FastGamma fastGamma)
Parameters:
fastGamma - the caching fastGamma object

public FastGamma getGamma()
public void setDigamma(FastDigamma fastDigamma)
Parameters:
fastDigamma - the caching fastDigamma object

public FastDigamma getDigamma()
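To illustrate why sharing one cache across documents can pay off, here is a minimal sketch of a memoizing digamma function. It is not the FastDigamma class, whose internals are not described on this page. CachedDigammaSketch is a hypothetical name, and the evaluation method (a recurrence followed by an asymptotic series, valid for x > 0) is an assumption chosen for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical memoizing digamma; the real FastDigamma may work differently. */
final class CachedDigammaSketch {
    private final Map<Double, Double> cache = new HashMap<>();

    /** Returns psi(x) for x > 0, computing it only the first time x is seen. */
    double digamma(double x) {
        return cache.computeIfAbsent(x, CachedDigammaSketch::compute);
    }

    int cacheSize() {
        return cache.size();
    }

    /** Recurrence psi(x) = psi(x + 1) - 1/x until x >= 6, then an asymptotic series. */
    private static double compute(double x) {
        double result = 0.0;
        while (x < 6.0) {
            result -= 1.0 / x;
            x += 1.0;
        }
        double inv = 1.0 / x;
        double inv2 = inv * inv;
        // ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
        return result + Math.log(x) - 0.5 * inv
                - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0));
    }

    public static void main(String[] args) {
        // One cache object serves repeated queries, as it would across several documents.
        CachedDigammaSketch shared = new CachedDigammaSketch();
        System.out.println(shared.digamma(1.0));
        System.out.println(shared.digamma(1.0)); // served from the cache
        System.out.println(shared.cacheSize());
    }
}
```

With count-based arguments the same values recur often, so a cache passed to every document through setDigamma avoids recomputing them per document.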
public void setPrior(double prior)
Parameters:
prior - the value of the symmetric Dirichlet prior

protected void makeCumulCounts()
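The idea behind cumulative counts can be sketched as follows. This is an illustration, not the library's code: it assumes that sents holds one row of word counts per sentence and that sentence indices are 0-based and inclusive, neither of which this page states. CumulCountsSketch and its methods are hypothetical names.

```java
/** Hypothetical illustration of cumulative counts; not the library's code. */
final class CumulCountsSketch {
    /** cumul[t][w] = total count of word w in sentences 0 .. t-1 (row 0 is all zeros). */
    static double[][] buildCumulCounts(double[][] sents) {
        int vocab = sents.length == 0 ? 0 : sents[0].length;
        double[][] cumul = new double[sents.length + 1][vocab];
        for (int t = 0; t < sents.length; t++) {
            for (int w = 0; w < vocab; w++) {
                cumul[t + 1][w] = cumul[t][w] + sents[t][w];
            }
        }
        return cumul;
    }

    /** Word counts for sentences start..end (both inclusive): one subtraction per word. */
    static double[] segmentCounts(double[][] cumul, int start, int end) {
        double[] counts = new double[cumul[0].length];
        for (int w = 0; w < counts.length; w++) {
            counts[w] = cumul[end + 1][w] - cumul[start][w];
        }
        return counts;
    }

    public static void main(String[] args) {
        double[][] sents = {{1, 0, 2}, {0, 1, 1}, {3, 0, 0}};
        double[][] cumul = buildCumulCounts(sents);
        System.out.println(java.util.Arrays.toString(segmentCounts(cumul, 1, 2)));
    }
}
```

After the one-time pass over the document, the counts for any candidate segment cost time proportional to the vocabulary size rather than to the segment length, which matters when many (start, end) pairs are scored.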
protected double segLLDCM(int start, int end, double prior)
Parameters:
start - the index of the first sentence in the segment
end - the index of the last sentence in the segment
prior - the symmetric Dirichlet prior to use

protected double segLLMAP(int start, int end, double prior)
Parameters:
start - the index of the first sentence in the segment
end - the index of the last sentence in the segment
prior - the symmetric Dirichlet prior to use
This could be sped up by caching the log partitions and the log counts.

public double segLL(int start, int end, double prior)
Parameters:
start - the index of the first sentence in the segment
end - the index of the last sentence in the segment
prior - the symmetric Dirichlet prior to use

public double segLLExp(int start, int end, double logprior)
Parameters:
start - the index of the first sentence in the segment
end - the index of the last sentence in the segment
logprior - the log of the symmetric Dirichlet prior to use

public double segDCMGradient(int start, int end, double prior)
Parameters:
start - the index of the first sentence in the segment
end - the index of the last sentence in the segment
prior - the log of the symmetric Dirichlet prior to use

public double segMAPGradient(int start, int end, double prior)
Parameters:
start - the index of the first sentence in the segment
end - the index of the last sentence in the segment
prior - the symmetric Dirichlet prior to use

public double segLLGradientExp(int start, int end, double logprior)
Parameters:
start - the index of the first sentence in the segment
end - the index of the last sentence in the segment
logprior - the log of the symmetric Dirichlet prior to use

public static void main(String[] argv)
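One way to unit test gradient methods like those above is a finite-difference check against the matching log-likelihood. The sketch below does this for a Dirichlet compound multinomial (DCM) with a symmetric prior, restricted to integer counts so that the gamma-function ratios reduce to sums of logs. It scores a particular word sequence, with no multinomial coefficient. This is an assumption-laden illustration: DcmSketch is a hypothetical name, and the formulas used by segLLDCM and segDCMGradient are not given on this page and may differ.

```java
/** Hypothetical sketch of a DCM segment log-likelihood and its gradient. */
final class DcmSketch {
    /** log p(word sequence | prior) for integer counts under a symmetric Dirichlet prior. */
    static double logLikelihood(int[] counts, double prior) {
        int vocab = counts.length;
        int total = 0;
        double ll = 0.0;
        for (int n : counts) {
            total += n;
            // lgamma(n + prior) - lgamma(prior) = sum over i < n of log(prior + i)
            for (int i = 0; i < n; i++) {
                ll += Math.log(prior + i);
            }
        }
        // lgamma(V * prior) - lgamma(total + V * prior) = -sum over i < total of log(V * prior + i)
        for (int i = 0; i < total; i++) {
            ll -= Math.log(vocab * prior + i);
        }
        return ll;
    }

    /** Derivative of logLikelihood with respect to prior, term by term. */
    static double gradient(int[] counts, double prior) {
        int vocab = counts.length;
        int total = 0;
        double g = 0.0;
        for (int n : counts) {
            total += n;
            for (int i = 0; i < n; i++) {
                g += 1.0 / (prior + i);
            }
        }
        for (int i = 0; i < total; i++) {
            g -= vocab / (vocab * prior + i);
        }
        return g;
    }

    public static void main(String[] args) {
        int[] counts = {3, 0, 2, 1};
        double prior = 0.5;
        double h = 1e-6;
        // Central difference approximation of the derivative.
        double numeric = (logLikelihood(counts, prior + h) - logLikelihood(counts, prior - h)) / (2 * h);
        System.out.println("analytic = " + gradient(counts, prior));
        System.out.println("numeric  = " + numeric);
    }
}
```

The two printed values should agree to several decimal places; a larger gap usually points to a sign or scaling error in the analytic gradient.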