
Probability Model

Given the form of the causal disease model, the next step is to represent the relation between each cause and effect. To simplify reasoning with the causal relations, we have imposed a uniform view of probabilistic causality on the model. That is, each causal relation is summarized as a probability of the cause producing the effect. Furthermore, causes precede effects and nodes are either true or false. By imposing this view, we are able to handle the circularities in the model in a consistent manner. The causation can be viewed as growing an acyclic subgraph from primary causes within the model graph. Whenever the causation would close a circle, the effect node is already true and further causation is blocked. This causal view eliminates anomalous conditions such as circular causation with no initiating primary cause.
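The blocking rule can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the program's implementation: the function `grow_causation`, the node names, the graph, and the link probabilities are all hypothetical. It grows the set of true nodes outward from the primary causes and never causes a node that is already true, so the recorded cause-of-each-node pointers form an acyclic subgraph.

```python
import random
from collections import deque

def grow_causation(links, primary_causes, rng):
    """Sample one causal scenario.

    links: dict mapping cause -> list of (effect, probability) pairs.
    primary_causes: nodes assumed true at the start.
    Returns a dict mapping each true node to the node that caused it
    (None for primary causes); these pointers form an acyclic subgraph.
    """
    caused_by = {node: None for node in primary_causes}
    frontier = deque(primary_causes)
    while frontier:
        cause = frontier.popleft()
        for effect, prob in links.get(cause, []):
            if effect in caused_by:
                continue  # effect already true: further causation is blocked
            if rng.random() < prob:
                caused_by[effect] = cause
                frontier.append(effect)
    return caused_by

# Hypothetical graph containing a cycle between two nodes.
# Probabilities of 1.0 keep the example deterministic.
links = {
    "infarct": [("low_output", 1.0)],
    "low_output": [("high_heart_rate", 1.0)],
    "high_heart_rate": [("low_output", 1.0)],
}
scenario = grow_causation(links, ["infarct"], random.Random(0))
```

Here the link from `high_heart_rate` back to `low_output` is never used, because `low_output` is already true when it is reached. Calling the function with an empty list of primary causes leaves every node false, which is the sense in which circular causation with no initiating cause cannot arise.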

The probabilities on the links summarize several kinds of causality in the model. Some of the relations are randomly triggered causal events. For example, there is a probability that a patient with obstructed coronary arteries will have a myocardial infarction. The probability is partially dependent on other factors, such as the heart rate, blood pressure, and so forth, but the event is essentially random and binary.

Many other relations can be expressed as equations. For example, blood pressure is a function of cardiac output, systemic vascular resistance, and right atrial pressure. Usually, however, not enough of the parameter values are known to constrain the equation. The easiest way to represent the potential for low cardiac output or low systemic vascular resistance to produce low blood pressure is then to approximate the relationships as causal probabilities.

Other causal relations depend on the duration or severity of the cause to produce the effect. For example, cardiac dilatation usually takes weeks to occur, but a severe myocardial infarct can cause it in days or even hours. In such cases, the probability on the link approximates the more complex relation of duration and severity. The probability could be defined as a function of duration and severity. Even without information about either, however, diagnosis is done at some average time after an acute infarct and at some average severity, and the probability reflects those averages.

Representing the states of cardiac output, blood pressure, and other measurable parameters with qualitative descriptors such as low, normal, and high is a compromise to constrain the diagnostic problem. The ranges of the parameters could be divided into much finer intervals, approximating their probability distribution functions. Such a scheme carries a high computational cost without much benefit, because the distinctions between states are harder to recognize from the findings and the differences between hypotheses become insignificant. The method used is to define the qualitative values by causal distinctions and let the relationship between the qualitative values and the quantitative measures be probabilistic. The criterion for distinct states is a difference in the causal structure. Thus, low cardiac output is different from normal cardiac output because low cardiac output can produce fatigue, inadequate renal perfusion, sympathetic response, or other effects. If there are different causes or effects, it is a different state. For the most part, simple designations such as low, normal, and high have been appropriate for the parameters. Since the parameter state is determined by the existence of causes and effects, the relationship to measured values is not always easily defined. The level of cardiac output at which the effects of low cardiac output start happening varies from patient to patient. The program handles this by specifying, for each qualitative state, the probability that it produces each measured range of the parameter.

Thus, 70% of low cardiac outputs have a cardiac index (cardiac output normalized for body size) below 2.3, 20% between 2.3 and 2.5, and so forth. Similarly, no normal cardiac outputs are below 2.3, but 5% are between 2.3 and 2.5, and so forth. These distributions allow the program to handle the cardiac index as evidence for the qualitative values of cardiac output in the same way as it would a categorical measurement value.
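One way to see how such distributions act as evidence is a simple Bayes update over the qualitative states. The sketch below is illustrative only. The first two bins for each state use the figures quoted above; the single `>=2.5` bin is an assumption that lumps all remaining ranges into one remainder, the treatment of the exact boundary values is assumed, the high state is omitted because no figures are given for it, and the prior and the names `LIKELIHOOD`, `index_range`, and `posterior` are hypothetical.

```python
# Probability of each cardiac-index range given the qualitative state.
# "<2.3" and "2.3-2.5" come from the text; ">=2.5" is an ASSUMPTION that
# collapses every remaining range into one remainder bin.
LIKELIHOOD = {
    "low":    {"<2.3": 0.70, "2.3-2.5": 0.20, ">=2.5": 0.10},
    "normal": {"<2.3": 0.00, "2.3-2.5": 0.05, ">=2.5": 0.95},
}

def index_range(cardiac_index):
    """Map a measured cardiac index to one of the bins above."""
    if cardiac_index < 2.3:
        return "<2.3"
    if cardiac_index < 2.5:
        return "2.3-2.5"
    return ">=2.5"

def posterior(prior, cardiac_index):
    """Bayes update of the state probabilities given one measurement."""
    bin_name = index_range(cardiac_index)
    weights = {s: prior[s] * LIKELIHOOD[s][bin_name] for s in prior}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

# Hypothetical prior; not taken from the text.
result = posterior({"low": 0.2, "normal": 0.8}, 2.4)
```

With this prior, a cardiac index of 2.4 is four times as likely under low cardiac output as under normal cardiac output (0.20 against 0.05), which offsets the four-to-one prior and leaves the two states equally probable. An index below 2.3 rules out the normal state entirely.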

The probabilities on the links may be fixed, dependent on patient parameters, or dependent on the diagnostic hypothesis. In most cases there is not enough data or experience to recognize situations in which the probability of a causal link will change. Therefore the probability on the link is just a number. In other situations the probability varies with measurable patient parameters, usually age and sex. For example, pneumonia is less likely to produce a fever in the elderly than in a younger person and the probability on the link is:

pneumonia: p(fever) = (0.9 (range age 0.95 70 0.9 80 0.8 90 0.7))

If nothing is known about the age, 0.9 is used. If the patient is younger than 70, the probability of fever is 0.95. If the age is between 70 and 80, 0.9 is used, and so forth.
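The range lookup can be sketched in a few lines. This is an illustration of the rule just described, not the program's code: the function names are hypothetical, and placing an age of exactly 70, 80, or 90 in the upper interval is an assumption, since the text does not say which side the boundaries fall on.

```python
from bisect import bisect_right

def link_probability(default, thresholds, values, age=None):
    """Age-dependent link probability.

    thresholds: ascending age cut points, e.g. [70, 80, 90].
    values: one probability per interval, len(thresholds) + 1 entries.
    """
    if age is None:
        return default  # nothing is known about the age
    # ASSUMPTION: an age equal to a cut point belongs to the upper interval.
    return values[bisect_right(thresholds, age)]

def fever_given_pneumonia(age=None):
    """The pneumonia -> fever link from the text."""
    return link_probability(0.9, [70, 80, 90], [0.95, 0.9, 0.8, 0.7], age)
```

For instance, `fever_given_pneumonia(65)` gives 0.95, `fever_given_pneumonia(85)` gives 0.8, and `fever_given_pneumonia()` falls back to the default of 0.9.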

Since the qualitative parameter state can represent a wide range of values, the probability of the state having an effect can be dependent on the actual parameter value. For example, the probability that high heart rate will cause low cardiac output is strongly dependent on the actual heart rate (as well as age). Many of the primary cause probabilities also vary with age and sex. The dependencies on patient parameters are handled by adjusting the probabilities on links before attempting to compute a differential diagnosis.

The probability of an effect being produced also varies with the number of causes. The approach we have taken is to assume independence of causes and use the ``noisy-or'' combining function unless there is specific knowledge about how a combination of causes changes the probability. That is, the probability of the effect is $1 - \prod_i (1 - p_i)$, where $p_i$ is the probability of cause $i$ producing the effect by itself. There are, in addition, two kinds of combinations that are treated differently. First, there are factors that increase the probability of a cause producing an effect but cannot produce that effect themselves. For example, a high heart rate will make a myocardial infarct more likely but cannot produce it without another cause being present. The probabilistic contributions of these worsening or precipitating factors are combined as if they were causes when another cause is present; otherwise they are treated as zero. Second, there are factors that decrease the probability of a cause producing an effect. Usually these are therapies, but there are also pathophysiological states that prevent other states or make them less likely. For example, mitral stenosis usually produces high pressure in the left atrium (the chamber before the constricted valve). However, if there is tricuspid regurgitation, the right side of the heart is less likely to be able to maintain an elevated pressure. Such factors are combined with the existing causes and multiplicatively decrease the causal probability. For example, if the causes imply a 0.5 probability of producing the effect and there is a correcting factor that prevents the effect 80% of the time, the probability of the effect is $0.5 \times (1 - 0.8) = 0.1$.
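The three combining rules can be put together in a short sketch. It assumes the standard noisy-or form and reads the text literally for the other two rules; the function names and argument layout are hypothetical, and the sketch does not cover the cases where specific knowledge overrides the independence assumption.

```python
from math import prod

def noisy_or(probabilities):
    """Probability that at least one independent cause produces the effect."""
    return 1.0 - prod(1.0 - p for p in probabilities)

def effect_probability(causes, worsening=(), correcting=()):
    """Combine causes, worsening factors, and correcting factors.

    causes: probability of each present cause producing the effect alone.
    worsening: contributions of precipitating factors; they count only
        when at least one real cause is present.
    correcting: probability that each correcting factor prevents the effect.
    """
    causes = list(causes)
    if not causes:
        return 0.0  # worsening factors cannot produce the effect alone
    p = noisy_or(causes + list(worsening))
    for prevent in correcting:
        p *= 1.0 - prevent  # multiplicative decrease
    return p
```

Calling `effect_probability([0.5], correcting=[0.8])` reproduces the example in the text, a probability of about 0.1, while `effect_probability([], worsening=[0.3])` is zero because no real cause is present.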

With these mechanisms we are able to model the physiologic relationships in the medical domain as a probabilistic network. The network is the knowledge base from which diagnostic reasoning is done.


wjl@MEDG.lcs.mit.edu
Fri Nov 3 17:21:37 EST 1995