HST.950/6.872 Problem Set 5

Due 10/14/2004


 1. Given the following data:

A B C
0 0 0
0 1 0
0 0 0
0 1 1
0 0 0
1 1 1
1 0 1
1 1 1
1 0 0
0 0 0

Please calculate P(C=1|A=1,B=1)

a. using relative frequency, i.e. without prior probability

b. in a Bayesian way, with a uniform prior and an equivalent sample size of 1

c. in a Bayesian way, with a uniform prior and an equivalent sample size of 4

d. The restriction on a directed graphical model, or Bayesian Network, is that a node cannot be a parent of any of its ancestors. In other words, the graph has to be acyclic, i.e. without closed loops. To find the most probable (directed) graphical model for these data, how many Bayesian Networks (directed graphical models) do we need to compare?

Bonus question: e. Extend your solution in d to N variables: how many Bayesian Networks do we have to compare to find the most probable model based on data of N variables? You don't have to give the exact closed-form formula. If you can find a tight upper bound on this number, that's fine. Bottom line: don't spend more than an hour on this problem.
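The following Python sketch is one way to check hand calculations for parts a through e; it is not an official solution. It assumes that the "equivalent sample size" is spread evenly over the 8 joint configurations of (A, B, C), so each cell receives a prior count of ess/8; other conventions for the uniform prior would give different numbers. The DAG count uses Robinson's recurrence for labeled directed acyclic graphs, which is a standard result but is not named in the problem. The names ROWS, estimate_c_given_ab, count_dags and upper_bound are illustrative only.

```python
from fractions import Fraction
from math import comb

# The ten (A, B, C) observations from problem 1.
ROWS = [(0, 0, 0), (0, 1, 0), (0, 0, 0), (0, 1, 1), (0, 0, 0),
        (1, 1, 1), (1, 0, 1), (1, 1, 1), (1, 0, 0), (0, 0, 0)]


def estimate_c_given_ab(rows, a, b, c=1, ess=0):
    """Estimate P(C=c | A=a, B=b) from counts plus a uniform prior.

    Assumption: the equivalent sample size `ess` is divided evenly over
    the 8 joint cells of (A, B, C), so each cell gets ess/8 pseudo-counts.
    With ess=0 this reduces to the plain relative frequency (and raises
    ZeroDivisionError if the conditioning event never occurs).
    """
    n_ab = sum(1 for r in rows if r[0] == a and r[1] == b)
    n_abc = sum(1 for r in rows if r == (a, b, c))
    prior_cell = Fraction(ess, 8)
    # Two values of C share the conditioning event, hence 2 * prior_cell.
    return (n_abc + prior_cell) / (n_ab + 2 * prior_cell)


def count_dags(n):
    """Number of labeled DAGs on n nodes, via Robinson's recurrence:
    a(m) = sum_{k=1..m} (-1)^(k+1) * C(m, k) * 2^(k(m-k)) * a(m-k), a(0) = 1.
    """
    a = [1]
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]


def upper_bound(n):
    """Simple upper bound: each unordered pair of nodes has no edge or an
    edge in one of two directions, giving 3^C(n, 2) directed graphs
    (some of which contain cycles)."""
    return 3 ** comb(n, 2)


if __name__ == "__main__":
    for ess in (0, 1, 4):
        print(ess, estimate_c_given_ab(ROWS, 1, 1, ess=ess))
    print(count_dags(3), upper_bound(3))
```

Under the stated prior convention, the script prints 1, 17/18 and 5/6 for equivalent sample sizes 0, 1 and 4, followed by 25 DAGs on three variables against the bound of 27. Exact fractions are used so the results can be compared directly with a hand derivation.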


 2.