Diagnostic Causal Model


The diagnostic causal model is a clinical-level physiologic model of the cardiovascular system. Its intent is to represent the causal relations needed to generate explanations of the findings that correspond to the physician's understanding of the physiologic mechanisms. The model can be divided in several ways: parameters versus states, states versus measures, and knowledge base versus patient-specific model (PSM).

The model consists of parameters representing diseases such as myocardial infarction (heart attack), therapies such as hydralazine (which reduces blood pressure), and physiologic parameters such as heart rate, cardiac output, pressures in the left and right heart, renal function, and so forth. The disease parameters are either primary causes or the important diagnostic entities in the model. The primary causes do not require further causal explanation, although some do have causal explanations within the model (e.g., anemia, below). Therapies are included in the model because they often have side effects that contribute to the patient's condition, because they may be needed to explain the absence of expected findings, and because they provide the knowledge base for the therapy selection process. Most of the parameters in the model are the intermediate physiologic parameters that detail the causal mechanisms from the diseases to the findings. Causal chains in this model are often lengthy, sometimes exceeding ten parameters. The primary reason for the long causal chains is the need to reason about the effects of therapy as well as to reason diagnostically when multiple diseases are involved. Thus, the parameters are the interface between the diagnostic model and the model for therapy prediction.

The basic structures in the diagnostic model represent qualitative states of the parameters. The qualitative states are typically present and absent for diseases and other conditions, and low, normal, and high for physiologic parameters. These parameter states are linked by probability relations. The probabilistic representation was chosen because of the nature of the information available to make a diagnosis. The quantitative and logical relationships among the parameters do not provide enough constraint to support a diagnosis, or even many useful conclusions, but the experiential knowledge of cardiologists about the frequency of diseases and their effects provides the knowledge necessary to draw useful conclusions in a probabilistic framework. Besides the parameter state nodes, which represent the physiology, there are measures representing the observables: the history items, the symptoms, the physical exam findings, and the laboratory results. These measure structures provide the interface for the interpretation of patient input data.

One of the parameter states is the presence of anemia. Part of the definition of anemia in the file that produces the knowledge base is:

This states that the probability of anemia without other causes is 0.04 if age is less than 70 and 0.08 if age is greater. If the age is unknown, 0.05 is used. If chronic renal insufficiency (kidney disease) is present, the probability that it causes anemia is 0.3, and the definition marks it as a possible cause of the anemia. For some states there are also worsening factors, which can increase the likelihood of a state but cannot cause it themselves, and correcting factors (usually therapies), which decrease the probability of a state.

The measures are the observable facts about the patient. Values of the measures, such as dyspnea on exertion (shortness of breath), are linked to the parameter states by probabilities. The definition specifies that anemia causes dyspnea on exertion 50% of the time and would always be apparent if a CBC (complete blood count) were done. The measures also have definitions, which contain such things as the probability that abnormal values might exist without there being a cause within the model; for example, dyspnea on exertion can exist for reasons outside the domain of the model. The measures themselves are used in the input menu to gather the patient data.
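The numbers quoted above for anemia can be pictured as plain data with a small lookup for the age-dependent primary probability. The following is only a sketch: the dictionary layout, the key names, the function `primary_probability`, and the choice to place an age of exactly 70 in the older group are all assumptions made for illustration, not the syntax of the original knowledge-base file.

```python
# Hypothetical rendering of the anemia entry. Only the numeric values come
# from the text; every name and the overall layout are illustrative.
ANEMIA = {
    "primary": {"age_cutoff": 70, "below": 0.04, "at_or_above": 0.08, "unknown": 0.05},
    "possible_causes": {"chronic-renal-insufficiency": 0.3},
    # Probability that anemia produces each finding (measure names are hypothetical).
    "measures": {"dyspnea-on-exertion": 0.5, "cbc-low-hematocrit": 1.0},
}


def primary_probability(definition, age=None):
    """Return the probability of the state arising with no other cause."""
    primary = definition["primary"]
    if age is None:
        # Age not entered: fall back to the single default value.
        return primary["unknown"]
    # Assumption: an age of exactly 70 is treated as the older group.
    if age < primary["age_cutoff"]:
        return primary["below"]
    return primary["at_or_above"]
```

For a 55-year-old patient this lookup returns 0.04, and with no age entered it returns 0.05, matching the description above.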

The measure for complete blood count has the following definition:

This is a test that can provide information about both the white cell count (WBC) and the hematocrit (HCT), so it can have more than one value. The legal combinations of values are determined by the constraints clause. Both high WBC and high HCT can be caused by conditions that are outside the domain of the model, so the specificity clause specifies the fraction of cases that need to be explained by the model. The program uses this number to estimate the prevalence of the condition as a primary entity. The format clause provides the information needed to print the values of the CBC measure as part of any textual representation.
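To make the idea of a constraints clause concrete, the sketch below checks whether a set of CBC values is a legal combination. The value names and the specific rule used here (each component of the test, WBC or HCT, may be reported at most once) are assumptions chosen for illustration; the actual constraints in the knowledge base may differ.

```python
# Hypothetical CBC values, each tied to the component of the test it describes.
CBC_COMPONENT = {
    "low-hct": "HCT",
    "high-hct": "HCT",
    "high-wbc": "WBC",
}


def is_legal_combination(values):
    """Accept a collection of CBC values if no component appears twice.

    Assumed rule: values for different components may be combined freely,
    but contradictory values for one component (such as low and high
    hematocrit) may not.
    """
    seen = set()
    for value in values:
        component = CBC_COMPONENT[value]  # raises KeyError for unknown values
        if component in seen:
            return False
        seen.add(component)
    return True
```

Under this assumed rule, low hematocrit together with a high white cell count is accepted, while low and high hematocrit together are rejected.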

In addition to the parameter, state, and measure structures, which constitute the fixed diagnostic knowledge base, there are structures that are computed from these when the model is loaded to generate the enhanced model. The enhanced model adds structures to represent the links between the states and structures to represent the causal paths through the states. The links are used to record the conditions of causality and to simplify reasoning about causes and effects. The paths are generated to speed up the process of diagnosis and reasoning about the probabilities of states. Because of the high degree of connectivity in the model, there are many more path structures than any other type of structure, but these allow for operations, such as rapid intersections of causal paths and hypotheses, that make diagnosis feasible. Loading the model also fills in many additional slots in the structures that are computed from the rest of the knowledge base, such as the prevalence of findings (above); these make the job of the reasoning operators feasible.
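The benefit of precomputed paths can be illustrated with a small sketch that enumerates every causal path leading to a finding and then intersects the states that lie upstream of several findings. The graph below is a toy example, and the functions `causal_paths` and `candidate_causes` are hypothetical; the links are illustrative and are not taken from the actual knowledge base, which also attaches probabilities and conditions to its links.

```python
# Illustrative causal links (cause -> list of effects). Assumed to be acyclic.
CAUSES = {
    "chronic-renal-insufficiency": ["anemia"],
    "anemia": ["dyspnea-on-exertion", "low-hematocrit"],
    "myocardial-infarction": ["low-cardiac-output"],
    "low-cardiac-output": ["dyspnea-on-exertion"],
}


def causal_paths(graph, target):
    """Return every causal path ending at target, as tuples of state names."""
    # Invert the graph so each state can look up its direct causes.
    parents = {}
    for cause, effects in graph.items():
        for effect in effects:
            parents.setdefault(effect, []).append(cause)

    paths = []

    def extend(path):
        # Prepend each direct cause of the path's first state, then recurse.
        for cause in parents.get(path[0], []):
            longer = (cause,) + path
            paths.append(longer)
            extend(longer)

    extend((target,))
    return paths


def candidate_causes(graph, findings):
    """Return the states that lie on some causal path to every finding."""
    upstream = [{path[0] for path in causal_paths(graph, f)} for f in findings]
    if not upstream:
        return set()
    return set.intersection(*upstream)
```

In this toy graph, dyspnea on exertion has four causal paths, and only anemia and chronic renal insufficiency are upstream of both dyspnea on exertion and a low hematocrit. Storing such paths once at load time trades memory for speed, which fits the observation that path structures outnumber all others.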

The PSM is generated from the enhanced model to represent the patient when data is entered. The parts of the PSM used in diagnosis consist of structures representing the values of the measures entered on input, copies of the parameter states (called nodes), and links connecting the nodes to each other and to the measure values. The links represent the specific relations, including the probabilities, between the nodes and the measure values as constrained by the input. Thus, the PSM consists of a personalized set of node and value structures connected by link structures, ready to accept the results of the reasoning operators.

It would be possible to include tables of probabilities for each combination of the possible values of the incoming links to each node, but in practice a default rule for combining the probabilities on individual links is adequate. In fact, actual data on the probabilities of combinations of causes are very sparse. The combining rule used is the "noisy-or" [19], except for worsening factors, which require another cause, and correcting factors, which decrease the probability. Thus, the probability of a node is determined by the probabilities of its causes, worsening factors, and correcting factors, together with its primary probability. Similarly, each measure value has a probability of being produced by a subset of the nodes in the model, which is computed in the same way.
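A minimal sketch of such a combining rule follows. The noisy-or part is the standard form: with primary probability p and active cause probabilities c1, c2, ..., the node probability is 1 - (1 - p)(1 - c1)(1 - c2).... How worsening and correcting factors enter is an assumption made here for illustration, since the text only says that worsening factors require another cause and that correcting factors decrease the probability. The function name `node_probability` and its parameters are hypothetical.

```python
def node_probability(primary, causes, worsening=(), correcting=()):
    """Combine link probabilities for one node.

    primary    -- probability of the state with no cause in the model
    causes     -- probabilities on the links from causes that are present
    worsening  -- probabilities for worsening factors that are present
    correcting -- probabilities for correcting factors that are present
    """
    # Standard noisy-or: the state is absent only if every cause,
    # including the primary one, independently fails to produce it.
    none_active = 1.0 - primary
    for c in causes:
        none_active *= 1.0 - c
    p = 1.0 - none_active

    # ASSUMPTION: worsening factors act as extra noisy-or terms, but only
    # when something else already gives the state a nonzero probability.
    if p > 0.0:
        not_worsened = 1.0 - p
        for w in worsening:
            not_worsened *= 1.0 - w
        p = 1.0 - not_worsened

    # ASSUMPTION: each correcting factor independently removes the state
    # with its own probability.
    for r in correcting:
        p *= 1.0 - r
    return p
```

Using the anemia figures given earlier, a patient under 70 with chronic renal insufficiency has a probability of anemia of 1 - (0.96)(0.7) = 0.328 under the noisy-or, which is what `node_probability(0.04, [0.3])` computes up to floating-point rounding.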

The diagnostic part of the model covers all of the common causes of heart failure. It also includes noncardiac diseases that cause the same symptoms or complicate the hemodynamic situation, such as pulmonary, renal, liver, or thyroid diseases, anemia, and infection. The model has been designed for clinical relevance. We have included the parameters and states that make sense to the clinician and provide the significant distinctions for diagnosis and therapy. As a result, diagnostic hypotheses explain findings in terms that make sense to the physician, and the therapy predictions are easily related to clinical concerns.


wjl@MEDG.lcs.mit.edu
Sat Nov 4 10:36:18 EST 1995