
Quantifying Uncertainty in Biological Network Models

When constructing a computational model of a biological network, there is not just one model that is consistent with the available data, but many models with different parameters and different topologies. Quantifying the uncertainty in the topology and parameters is a necessary step toward quantifying the uncertainty in the model's predictions. If that uncertainty is undesirably large, additional data must be collected to reduce it. We develop methods to quickly approximate the uncertainty in the model and to predict which experiments will reduce it the most.

Figure 1. Approximating Parameter Space


The space of two parameters of a model can be visualized as in the figure. The maximum likelihood parameter set is in the middle of the figure, but there is a region of parameter space (the green blob) in which every parameter set fits the data well. The volume of this region is one way to quantify the uncertainty in the parameters. For biological models, however, this region is nonlinear and high-dimensional, which makes computing the uncertainty exactly very expensive. To overcome this, we use a linear approximation of the region (the orange ellipse). While imperfect, the approximation is, as the figure suggests, quite good.
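The ellipse idea can be illustrated with a minimal sketch. It assumes a hypothetical two-parameter toy model (exponential decay with amplitude `A` and rate `k`), independent Gaussian measurement noise of known standard deviation `sigma`, and the common Fisher-information form of the linearization. None of these specifics come from the text above, and the actual method may differ in detail.

```python
import numpy as np

def model(theta, t):
    """Hypothetical toy model: y = A * exp(-k * t)."""
    A, k = theta
    return A * np.exp(-k * t)

def jacobian(theta, t, rel_step=1e-6):
    """Central finite-difference sensitivities dy/dtheta, shape (len(t), len(theta))."""
    theta = np.asarray(theta, dtype=float)
    J = np.empty((t.size, theta.size))
    for j in range(theta.size):
        h = rel_step * max(abs(theta[j]), 1.0)
        up, down = theta.copy(), theta.copy()
        up[j] += h
        down[j] -= h
        J[:, j] = (model(up, t) - model(down, t)) / (2.0 * h)
    return J

def linearized_covariance(theta_hat, t, sigma):
    """Approximate parameter covariance as the inverse Fisher information.

    Linearizing the model about the best-fit parameters gives
    F = J^T J / sigma^2 and covariance ~= F^{-1}.
    """
    J = jacobian(theta_hat, t)
    fisher = J.T @ J / sigma**2
    return np.linalg.inv(fisher)

# Assumed best-fit parameters and measurement times (illustrative values only).
theta_hat = np.array([2.0, 0.5])
t = np.linspace(0.0, 5.0, 11)
cov = linearized_covariance(theta_hat, t, sigma=0.1)

# The uncertainty ellipse: eigenvectors give its axes,
# square roots of the eigenvalues give the 1-sigma semi-axis lengths.
eigvals, eigvecs = np.linalg.eigh(cov)
semi_axes = np.sqrt(eigvals)
print("covariance:\n", cov)
print("1-sigma semi-axes:", semi_axes)
```

The ellipse replaces the irregular region with a Gaussian approximation centered on the best fit, so its size and orientation follow from a single matrix inversion rather than from sampling the parameter space.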

We have shown that this approximation works well for designing experiments to efficiently reduce the parameter uncertainty to a desired level. The experiments that this algorithm chooses tend to provide information about the parameters that is complementary to the current information: measurements are taken that are sensitive to the most uncertain parameters.
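One way to picture such an experiment ranking is sketched below. It reuses a hypothetical exponential-decay toy model and scores each candidate experiment by the log-volume of the linearized uncertainty ellipsoid that would remain after adding its measurements (a D-optimality style criterion). The criterion, the candidate names, and all numerical values are assumptions made for illustration, not the algorithm described above.

```python
import numpy as np

def sensitivities(theta, t):
    """Analytic sensitivities of the toy model y = A * exp(-k * t)."""
    A, k = theta
    decay = np.exp(-k * t)
    return np.column_stack([decay, -A * t * decay])  # columns: dy/dA, dy/dk

def fisher_information(theta, t, sigma):
    """Fisher information of measurements at times t with noise std sigma."""
    J = sensitivities(theta, np.asarray(t, dtype=float))
    return J.T @ J / sigma**2

def rank_experiments(theta_hat, fisher_current, candidates, sigma):
    """Rank candidates by predicted ellipsoid log-volume (smaller is better).

    Information from independent measurements adds, so the predicted
    Fisher matrix is the current one plus the candidate's contribution.
    The ellipsoid volume scales as det(F)^(-1/2).
    """
    scores = {}
    for name, times in candidates.items():
        fisher_new = fisher_current + fisher_information(theta_hat, times, sigma)
        _, logdet = np.linalg.slogdet(fisher_new)
        scores[name] = -0.5 * logdet
    return sorted(scores.items(), key=lambda item: item[1])

theta_hat = np.array([2.0, 0.5])   # assumed current best fit (A, k)
sigma = 0.1                        # assumed measurement noise

# Existing data are all at early times, so the decay rate k is poorly constrained.
fisher_current = fisher_information(theta_hat, [0.0, 0.1, 0.2], sigma)

candidates = {
    "early": [0.05, 0.15],  # more of the same kind of measurement
    "late": [2.0, 3.0],     # measurements sensitive to the decay rate
}
ranking = rank_experiments(theta_hat, fisher_current, candidates, sigma)
for name, score in ranking:
    print(f"{name}: predicted log-volume score {score:.3f}")
```

In this toy setting the "late" experiment ranks first because its measurements are sensitive to the decay rate, which the existing early-time data barely constrain.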

We have found that a linear approximation is also appropriate when computing the uncertainty in the topology of the model. Computing the probability distribution over a set of possible topologies given a set of data is extremely computationally expensive with a Monte Carlo method, and the usual heuristics for approximating this distribution are notoriously unreliable. However, making the same approximation that we made for the parameter uncertainty yields an analytical solution that is very fast to compute and that, in tests, is adequate both for estimating the uncertainty and for designing experiments to reduce it.
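To show why linearization can make topology probabilities analytical, here is a minimal sketch under strong assumptions: each candidate topology is linearized about its prior mean, the parameter prior and the measurement noise are Gaussian, and all topologies are equally probable a priori. Under those assumptions the evidence for each topology is a closed-form Gaussian density. The two candidate "topologies", the function names, and the data are hypothetical, and this formula is not claimed to be the one used in the work described above.

```python
import numpy as np

def log_evidence_linearized(y, y_pred0, J, sigma, prior_cov):
    """Log marginal likelihood of a model linearized about its prior mean.

    Assumes y = y_pred0 + J (theta - theta0) + noise, with
    theta ~ N(theta0, prior_cov) and noise ~ N(0, sigma^2 I), so that
    y ~ N(y_pred0, sigma^2 I + J prior_cov J^T) in closed form.
    """
    n = y.size
    cov_y = sigma**2 * np.eye(n) + J @ prior_cov @ J.T
    resid = y - y_pred0
    _, logdet = np.linalg.slogdet(cov_y)
    quad = resid @ np.linalg.solve(cov_y, resid)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

def topology_probabilities(log_evidences):
    """Posterior over topologies with a uniform prior (stable softmax)."""
    log_ev = np.asarray(log_evidences, dtype=float)
    weights = np.exp(log_ev - log_ev.max())
    return weights / weights.sum()

# Synthetic, noise-free data for illustration: generated by y = 1.5 * x.
x = np.linspace(0.0, 1.0, 8)
y = 1.5 * x
sigma = 0.1
prior_cov = np.array([[4.0]])  # one parameter per topology, prior N(0, 4)

# Two hypothetical candidate topologies, each with a single rate parameter a:
#   topology 0: y = a * x      (sensitivity dy/da = x)
#   topology 1: y = a * x**2   (sensitivity dy/da = x**2)
J_candidates = [x[:, None], (x**2)[:, None]]
y_pred0 = np.zeros_like(y)  # both models predict 0 at the prior mean a = 0

log_ev = [log_evidence_linearized(y, y_pred0, J, sigma, prior_cov)
          for J in J_candidates]
probs = topology_probabilities(log_ev)
print("topology probabilities:", probs)
```

Each evidence costs one small linear solve, which is what makes this kind of approximation fast compared with sampling over both parameters and topologies.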