This section is part of
Patil, Ramesh S. Causal Representation of Patient Illness for Electrolyte and Acid-Base Diagnosis. MIT Lab for Comp. Sci. TR-267 (1981).

5. Diagnostic Problem Formulation and Information Gathering

The patient specific model (PSM) developed in chapter 4 was designed to provide the program with the capability of expressing its understanding about the patient's illness. However, due to the lack of complete knowledge about the patient and due to uncertainties in the medical knowledge, this understanding may be imprecise and incomplete. Our task is to identify these weaknesses and gather information that will help reduce or eliminate them. Viewed differently, these weaknesses identify a set of problems, all of which need to be solved in the process of diagnosis. The availability of a set of problems to work on simultaneously provides the problem solver with an opportunity to be efficient by abstracting common aspects of problems and by selecting an efficient order in which the problems are to be solved. This chapter examines several issues: (1) the process of identifying these weaknesses and formulating a diagnostic problem based on them, (2) the representation of this diagnostic problem and its decomposition into simpler problems, and (3) the evaluation of newly acquired information for apparent and real discrepancies.

The general medical knowledge in the program contains disease prototypes. However, given the facts about a patient along with a possible explanation, this prototypical information can be substantially constrained. For example, knowing that the patient has moderately severe metabolic acidosis, we can constrain the diseases hypothesized to account for the metabolic acidosis to be consistent with it, e.g., if salmonellosis is a hypothesized cause of this metabolic acidosis, it must be moderately severe and must have a duration of greater than two days. Secondly, only a small portion of the medical knowledge is relevant to any given diagnostic situation. For example, knowing that the patient's anion gap is normal, we can rule out all the causes of metabolic acidosis that are not consistent with a normal anion gap as irrelevant to the diagnosis.19 We therefore introduce the notion of a diagnostic closure (called DC) which contains the medical knowledge local to the diagnostic situation, extracted from the medical data-base and made specific to the PSM. The DC is constructed by hypothetically projecting forward the states of a PSM to identify the consequences predicted by the states of the PSM and by projecting backwards the unaccounted for states of the PSM to identify diseases that can account for these states. Note that within each PSM all the findings and diseases complement each other in forming a single coherent explanation, while different PSMs provide alternate explanations which are mutually exclusive. Further, each DC contains alternatives within the context of the PSM associated with it. Thus, the diagnostic alternatives themselves are divided into groups, each group being consistent with a partially complete explanation of the patient's illness, and the different groups represent alternatives consistent with markedly different possible explanations.

We have argued that the ability to identify discrepancies in incoming information plays a crucial role in the diagnostic process. For example, in studying the problem solving behavior of clinicians, Kassirer and Gorry note:

“The physician appeared to use ... his concept of a disease (hypothesis), a state, or a complication as a model with which to evaluate new data from the patient. Such a model provides a basis for expectation; it identifies the relevant clinical features that should prove fruitful for further investigation.”

—Clinical Problem Solving [Kassirer78, page 250]

The ability to evaluate the implications of the incoming information is an important part of clinical practice, where the accuracy or the completeness of information cannot be taken for granted. We may be presented with a questionable finding which, if accepted, may require reformulation of the currently held diagnosis with far-reaching implications. However, it may be unwise to act on any such information unless it can be substantially corroborated, and its validity as a diagnostic sign checked out. For example, upon unexpectedly finding “substantial weight increase” in a patient it is wise to check if the two weights were taken on the same scale before jumping to the conclusion that the patient is “retaining water”. Inability to do so poses a serious problem for programs such as PIP. The problem arises because accepting such a finding may strongly favor hypotheses which erroneously predict the finding and penalize those hypotheses which correctly do not predict it, possibly causing the correct hypotheses to be dropped from further consideration. Thus, the program may not be able to come back and ask a simple question that could save it from taking a “garden path”.

The diagnostic closure discussed above provides the program with an ability to evaluate the consistency of a finding before it decides to accept it. For example, as new information is gathered, if the profile of the new information is consistent with that present in a DC, we know that this information is consistent with the PSM and lends positive support to the diagnosis under consideration. By the same token, if some information is not consistent with a DC under consideration, we know that this information can not be assimilated into the PSM without some modification. Finally, if the incoming information is not consistent with any of the DCs then we know that our entire line of reasoning is under question, and if the information is true, a major re-analysis of the program's understanding will have to be undertaken. Because such a situation can be identified, the program has an opportunity to suspend the global diagnostic processing and revert to local processing to validate the finding or to justify ignoring it.
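The three outcomes described above can be illustrated with a minimal sketch. It is not ABEL's implementation: the representation of a DC as a mapping from finding names to the set of values that closure predicts, and the names `Verdict` and `evaluate_finding`, are assumptions made only for illustration.

```python
from enum import Enum


class Verdict(Enum):
    SUPPORTS = "consistent with the DC under consideration"
    NEEDS_MODIFICATION = "consistent only with some other DC"
    MAJOR_REANALYSIS = "consistent with none of the DCs"


def evaluate_finding(finding, current_dc, all_dcs):
    """Classify a (name, value) finding against the program's expectations.

    Assumption: each DC is a dict mapping a finding name to the set of
    values it predicts; a finding the DC says nothing about is treated
    as not consistent with it (a deliberate simplification).
    """
    name, value = finding

    def consistent(dc):
        return value in dc.get(name, set())

    if consistent(current_dc):
        return Verdict.SUPPORTS
    if any(consistent(dc) for dc in all_dcs):
        return Verdict.NEEDS_MODIFICATION
    return Verdict.MAJOR_REANALYSIS


# Toy closures using the hydration states from the example in section 5.5.
dc_salmonellosis = {"hydration": {"dehydration"}}
dc_renal_failure = {"hydration": {"edema"}}
dcs = [dc_salmonellosis, dc_renal_failure]
```

In this sketch only the last verdict would prompt the program to suspend global processing and try to validate the finding locally.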

The problem solvers in PIP and INTERNIST-I alternate between gathering a fact (based on their hypothesis lists) and re-evaluating the hypothesis lists (based on the new fact). Each fact is treated as an independent inquiry; the program does not group facts in a clinically meaningful and focused pursuit of diagnosis. This causes the information acquisition to become erratic and vulnerable to incomplete specification of information.20 Furthermore, the lack of commitment in pursuing any given information gathering strategy (e.g., discriminate, confirm) to completion diminishes their effectiveness. This problem can be solved by allowing the diagnostic problem solver to plan a group of questions focused around a single diagnostic task. The diagnostic closure already provides the dependencies necessary for such diagnostic planning. Diagnostic planning generally begins with the global task of discriminating between the alternate explanations provided by the set of PSMs. This task is successively decomposed into smaller tasks using diagnostic strategies of confirm, differentiate, rule-out, group-and-differentiate and explore. This results in a set of questions which, if answered, would help the program in solving the problem at hand.

It is common among physicians to “think out loud” while discussing a medical case with their colleagues. For example, in analyzing protocols of medical diagnosis, Sussman notes:

“Thus, we have heard doctors react to new facts with such phrases as: ‘I expected that.’, ‘(It is) consistent with my assumptions.’, ‘I did not expect that ...’, ‘This new fact is making me very unhappy with my diagnosis.’. Among the most important reactions are ones of the form: ‘this does not really fit in. Perhaps he has ...’.”

— Some Aspects of Medical Diagnosis [Sussman73]

This thinking out loud plays an important role in communication between physicians. We require the program to have not only a similar ability to evaluate the incoming information in comparison with its expectation, but also the ability to think out loud, which is essential in allowing the user physician to get a feel for the program's reasoning and understanding. The diagnostic closure allows the program to explain the spectrum of diagnostic alternatives consistent with a PSM, and the planned goal-oriented diagnostic questioning allows the program to justify the motivation of the diagnostic reasoner in asking the questions, its expectations about the information being sought, and how this information relates to the hypotheses under consideration.

5.1 Global Diagnostic Cycle

The diagnostic algorithm for the ABEL system is:

  1. Presenting Complaints: The serum analysis and the initial complaints are analyzed. A small set of initial PSMs is created and added to the list of causal hypotheses (the CH-list).
  2. Rank Ordering Hypotheses: All PSMs in the CH-list are scored for the quality of explanation they provide for the patient's illness. The leading one or two of these PSMs are selected as possible explanations.
  3. Computing Diagnostic Closure: Diagnostic closures for the selected PSMs are computed and disease hypotheses in each DC are scored.
  4. Termination: If the diagnostic closures for all PSMs are null or if some PSM provides a complete and coherent account for the patient's illness then the current phase of diagnosis is complete.
  5. Diagnostic Information Gathering: Based on the number of DCs (i.e., the PSMs selected in step 2), a top level confirm or differentiate goal is formulated. Using diagnostic strategies, this goal is successively decomposed into simpler subproblems until individual questions are formulated.
  6. Re-structuring the PSM: If step 5 results in any new finding being known, then that finding is incorporated into each of the PSMs by extending the structure of the PSMs to take the observed finding into account. Finally, this process is repeated starting at step 2.

In the remaining sections of this chapter we will study the individual steps of this algorithm.
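The control flow of steps 2 through 6 can be sketched as a loop. This is only a skeleton under stated assumptions: every step is passed in as a caller-supplied function, the names (`diagnostic_cycle`, `score_psm`, `compute_closure`, and so on) are hypothetical, a lower PSM score is taken to mean a better explanation (consistent with section 5.3), and `max_cycles` is an added safeguard that the text does not mention.

```python
def diagnostic_cycle(ch_list, score_psm, compute_closure, is_complete,
                     gather_information, restructure, max_cycles=10):
    """Skeleton of steps 2-6; step 1 is assumed to have produced ch_list."""
    for _ in range(max_cycles):
        # Step 2: rank PSMs and keep the leading one or two.
        leading = sorted(ch_list, key=score_psm)[:2]
        # Step 3: compute a diagnostic closure for each selected PSM.
        closures = [compute_closure(psm) for psm in leading]
        # Step 4: stop if every closure is empty or some PSM is complete.
        if all(not dc for dc in closures) or any(is_complete(p) for p in leading):
            return leading
        # Step 5: the number of DCs determines the top-level goal.
        strategy = "confirm" if len(closures) == 1 else "differentiate"
        findings = gather_information(strategy, closures)
        # Step 6: incorporate any new findings into every PSM.
        if findings:
            ch_list = [restructure(psm, findings) for psm in ch_list]
    return sorted(ch_list, key=score_psm)[:2]


# Toy stand-ins: a "PSM" is just the set of findings it leaves unaccounted.
psms = [frozenset({"metabolic-acidosis", "hypokalemia"}),
        frozenset({"metabolic-acidosis"})]
result = diagnostic_cycle(
    psms,
    score_psm=len,
    compute_closure=lambda psm: sorted(psm),
    is_complete=lambda psm: not psm,
    gather_information=lambda strategy, dcs: next(({dc[0]} for dc in dcs if dc), set()),
    restructure=lambda psm, found: psm - found,
)
```

The toy stand-ins exist only to make the loop executable; they do not model how ABEL scores, projects, or restructures a PSM.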

5.2 Diagnostic Closure of a Hypothesis

A diagnostic closure (DC) describes that part of the medical knowledge that is directly relevant to the diagnostic exploration of a PSM. It contains, in addition to the PSM, causal pathways from the unaccounted findings in the PSM to some of the possible diseases (ultimate etiologies) that can account for them, and causal pathways from some of the states in the PSM and the hypothesized diseases to (predicted) observable findings. Stated differently, a DC contains alternative extensions needed to adequately complete the explanation provided by the PSM. The DC associated with a PSM is initially created by hypothetically projecting the states of the PSM. During the process of diagnostic planning, new DCs may be created by copying parts of an existing DC,21 and by further projecting the diseases or findings under consideration. Furthermore, when some new information is received during the execution of a diagnostic plan, the alternatives which are not consistent with the finding may be pruned from a DC. Figure 29 shows an example of a DC for a PSM with unaccounted metabolic acidosis and partially accounted hypokalemia. Note that metabolic acidosis and hypokalemia both can be accounted for by a single disease hypothesis: salmonellosis. However, if we assume that the unaccounted component of hypokalemia is caused by vomiting, we must find some other cause for the metabolic acidosis, e.g., acute renal failure or diabetes insipidus.
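As a rough illustration of the two projections, the sketch below treats the medical knowledge as a flat list of may-cause links and computes the hypothesized causes and predicted findings for a PSM. The link list uses only relations mentioned in this chapter. The flat representation and the function names are assumptions; ABEL's actual causal network is multi-level and carries attributes such as severity and duration.

```python
# (cause, effect) pairs read as "cause may-cause effect".
MAY_CAUSE = [
    ("salmonellosis", "metabolic-acidosis"),
    ("salmonellosis", "hypokalemia"),
    ("salmonellosis", "dehydration"),
    ("vomiting", "hypokalemia"),
    ("acute-renal-failure", "metabolic-acidosis"),
    ("acute-renal-failure", "edema"),
]


def project_backward(states, links):
    """Collect every cause reachable by following links against their direction."""
    start, found, frontier = set(states), set(), set(states)
    while frontier:
        causes = {c for c, e in links if e in frontier} - found - start
        found |= causes
        frontier = causes
    return found


def project_forward(states, links):
    """Collect every consequence reachable by following links forward."""
    start, found, frontier = set(states), set(), set(states)
    while frontier:
        effects = {e for c, e in links if c in frontier} - found - start
        found |= effects
        frontier = effects
    return found


def diagnostic_closure(psm_states, unaccounted, links):
    """Backward-project unaccounted states, then forward-project everything."""
    causes = project_backward(unaccounted, links)
    predicted = project_forward(set(psm_states) | causes, links)
    return {"hypothesized_causes": causes, "predicted_findings": predicted}


dc = diagnostic_closure(
    psm_states={"metabolic-acidosis", "hypokalemia"},
    unaccounted={"metabolic-acidosis", "hypokalemia"},
    links=MAY_CAUSE,
)
```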

The diagnostic closure of a PSM provides us with the attributes of the hypothesized diseases and findings that are consistent with the PSM. It describes the program's diagnostic expectations against which the incoming information can be evaluated. Furthermore, by tracing the causal pathway from the hypothesized finding to the states in the PSM, we can determine how this finding relates to the PSM, and what intermediate assumptions are needed to assimilate this finding into the PSM. On the other hand, if the new finding is not consistent with any of the DCs under consideration then we know that this information is inconsistent with the program's current understanding. To accommodate a contradiction with the currently held hypothesis requires some major revision in the structure of the PSM. This process is computationally expensive and, if possible, should be avoided. As described above, ABEL has the ability to identify situations requiring a major revision, and to ask further questions to validate or invalidate the contradictory finding. However, when a contradictory finding is validated, ABEL abandons its current line of diagnostic inquiry and revises its PSMs. Clinical studies have shown that a physician, when faced with a similar situation, also attempts to avoid revising his diagnostic hypotheses. He attempts to disprove the offending piece of information or reconcile it by finding a sufficient excuse for ignoring it. On occasions, even after the validity of the contradictory finding is established, a physician may choose to ignore the finding until the current line of diagnostic questioning is completed. ABEL, by contrast, does not have the ability to postpone consideration of a validated contradictory finding.

 

Fig. 29. An example of diagnostic closure

5.3 Scoring the PSM

The score of a PSM measures the degree of incompleteness of the PSM as an explanation of the patient's illness. It is computed by summing the severities of partially and fully unaccounted states in the PSM. The scoring algorithm could be further improved by taking into consideration the need of a finding to be accounted for by an acceptable diagnosis. Furthermore, the program currently does not take into account the degree of explainability of a PSM. For example, a PSM may have a large number of unaccounted findings that can be accounted for by a single etiology, while another PSM may have only a few unaccounted findings but may require the invocation of multiple etiologies to account for them. Clearly, diagnoses with multiple etiologies are less desirable and much less frequent than diagnoses with a single etiology. The degree of explainability of a PSM is an important measure and should eventually be taken into account while scoring a PSM.22 Although the current method for computation of the score is primitive and should be extended using the additional factors discussed above, it appears to provide an acceptable level of discrimination between PSMs.
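The summation described above is simple enough to sketch directly. The numeric severity scale, the three-valued `accounted` field, and the names `State` and `psm_score` are assumptions for illustration; the text does not specify how ABEL encodes severities.

```python
from dataclasses import dataclass


@dataclass
class State:
    name: str
    severity: float   # assumed numeric scale, e.g. 1 = mild, 2 = moderate, 3 = severe
    accounted: str    # "full", "partial", or "none"


def psm_score(states):
    """Sum severities of partially and fully unaccounted states (lower is better)."""
    return sum(s.severity for s in states if s.accounted != "full")


psm = [
    State("metabolic-acidosis", 2, "none"),
    State("hypokalemia", 1, "partial"),
    State("low-serum-bicarbonate", 2, "full"),
]
```

Note that this sketch counts a partially accounted state at its full severity, which is one possible reading of the description.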

5.4 Scoring a Disease Hypothesis

Diseases are hypothesized to explain findings left unaccounted for by the PSM: a new disease is hypothesized only when it is capable of explaining some of the unaccounted findings. In this section we will consider a mechanism for scoring these hypotheses.

When a disease is hypothesized it may predict some consequences which may not fit well with the PSM, giving rise to new unexplained states. These additional unexplained consequences reduce the desirability of the hypothesis being considered. Furthermore, the hypothesized disease may predict some consequences which are as yet unobserved. These unobserved findings identify the additional information that can be used to confirm the disease hypothesis. For example, figure 30(a) shows a PSM with metabolic alkalosis and normokalemia, and vomiting hypothesized to account for the metabolic alkalosis. Figure 30(b) shows the findings predicted by the hypothesized vomiting. Figure 30(c) shows the consequences of adding the hypothesized vomiting to the PSM. The vomiting hypothesized in figure 30(a) explains an unaccounted for node, metabolic alkalosis, gives rise to a new unexplained node, hyperkalemia, and predicts an as yet unobserved finding, dehydration.

 

Fig. 30. An example of explained, unexplained and unaccounted findings

The usefulness of a disease hypothesis depends (ultimately) on its potential of being confirmed. This usefulness can be estimated using the explained, unexplained and unobserved findings associated with the hypothesis. Note, however, that the disease scores are computed for the purpose of ordering the diagnostic search, i.e., they provide a heuristic for performing a best-first search. The score of a disease hypothesis does not reflect the belief in the likelihood of the given disease being the correct diagnosis, but an estimate of its heuristic search utility. That is, it estimates how efficiently, given the available information, pursuing that disease hypothesis will lead to the final diagnosis. Although the two measures are similar and have often been confused with one another, they can be substantially different as more and more sophisticated search and error recovery techniques are used. In most of the previous programs this distinction was not made; thus even if a particular disease was a useful hypothesis, it could not be considered if most of its findings were as yet unknown. Further, it prevented these programs from accepting a working hypothesis which, even while having a low probability of being right, could lead efficiently to the right “ball-park”, which when reached would allow them to resort to more specific criteria to explore the restricted space.

In ABEL the disease hypotheses are ordered in two steps. First, they are grouped according to the number of unaccounted findings that can be accounted for by each hypothesis. Second, among those hypotheses that can account for the same number of findings, the diseases are rank-ordered by a score computed from the three factors discussed above. They are: (1) match, the number of causes and findings in the PSM that are consistent with the disease hypotheses;23 (2) mismatch, the number of causes and findings in the PSM that are inconsistent with the disease hypotheses; and finally (3) unknown, the number of unobserved findings predicted by the hypothesis which are not inconsistent with the PSM. A disease hypothesis is eliminated from immediate consideration (for one cycle of diagnostic inquiry) if the difference of match and mismatch is below an arbitrary threshold. The match combined with the unknown corresponds to the maximum possible score attainable by a given disease hypothesis. If this score goes below a threshold, the hypothesis can not be confirmed even if all the remaining unknown findings are resolved in favor of the hypothesis.
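The two-step ordering and the two threshold tests can be sketched as follows. The text does not give the formula that combines match, mismatch and unknown into a single score, nor the threshold values, so the within-group sort key and the constants `MIN_NET_MATCH` and `MIN_CEILING` below are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    accounts_for: int  # number of unaccounted findings it can explain
    match: int         # PSM causes/findings consistent with the hypothesis
    mismatch: int      # PSM causes/findings inconsistent with the hypothesis
    unknown: int       # predicted, unobserved findings not inconsistent with the PSM


MIN_NET_MATCH = 0  # hypothetical threshold on match - mismatch
MIN_CEILING = 2    # hypothetical threshold on match + unknown


def rank_hypotheses(hypotheses):
    """Drop hypotheses failing either threshold, then order the rest."""
    viable = [
        h for h in hypotheses
        if h.match - h.mismatch >= MIN_NET_MATCH   # defer for this cycle otherwise
        and h.match + h.unknown >= MIN_CEILING     # cannot be confirmed otherwise
    ]
    # Step 1: group by findings accounted for; step 2: order within a group.
    return sorted(
        viable,
        key=lambda h: (-h.accounts_for, -(h.match - h.mismatch), -h.unknown),
    )


candidates = [
    Hypothesis("acute-renal-failure", accounts_for=1, match=2, mismatch=1, unknown=2),
    Hypothesis("salmonellosis", accounts_for=2, match=2, mismatch=0, unknown=1),
    Hypothesis("vomiting", accounts_for=1, match=1, mismatch=2, unknown=1),
]
ranked = [h.name for h in rank_hypotheses(candidates)]
```

The counts in `candidates` are invented for the example and do not come from any patient case in the thesis.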

The above criterion for scoring the disease hypotheses is purely structural. It does not take probabilities of occurrences of different diseases into account. Incorporation of probabilities as a secondary scoring criterion should substantially improve the quality of the scoring mechanism. However, we believe that the criterion for evaluation of the heuristic value of the disease hypothesis as well as belief in a diagnosis should be primarily structural. Probabilistic scoring can be used effectively in differentiating between structurally similar hypotheses. However, primary reliance on probabilistic scoring without structural considerations (such as adequacy, coherence, match and mismatch), as has been the case with the first generation programs, is inadequate. Some of these inadequacies have been discussed in chapter 1.

5.5 Information Gathering Strategy

The process of diagnosis can be viewed as the process of discriminating between diagnostic alternatives. A strategy commonly used to achieve this is called the differentiation strategy. Using the results of protocol analysis, researchers [Pople75a, Miller75, Elstein78, Kassirer79] have identified a larger class of diagnostic strategies which in addition to differentiate include confirm, rule-out, explore, refine etc. Although these additional strategies can be considered to be special cases of the differentiation strategy, in special situations they can provide substantial improvement in processing over differentiation.

The selection of an appropriate strategy is based primarily upon the syntactic structure of the diagnostic problem.24 One measure commonly used is the number of alternate hypotheses under consideration and their relative strength. The confirmation strategy is used when only one hypothesis is under consideration, or when one hypothesis is much more likely than all others. The rule-out strategy is the inverse of the confirm strategy; it is used to eliminate some hypothesis which is substantially less likely than all the others. Its major utility is in allowing final confirmation of some hypothesis, such as essential hypertension, by eliminating all other less likely alternatives or cutting a large group down to a size where the differentiate strategy can be used. The differentiation strategy is used to discriminate between two (or three) hypotheses with similar belief factors. The above strategies are all used in the INTERNIST-I program.

The remaining strategies, such as group-and-differentiate and refine, reformulate the diagnostic problem. The group-and-differentiate strategy is used when we have a large number of alternate hypotheses with similar belief factors. Here we need to discard a large number of hypotheses rapidly in order to focus our attention on a small number of alternatives. This can be achieved by partitioning the alternatives into a small number of groups according to some common characterization (e.g., common organ system involvement, etiology, temporal characteristic or pathophysiology) and then applying a differentiation strategy to rule in or rule out one of the groups, thus narrowing the hypothesis set substantially. The refinement strategy is used to refine a hypothesis about a general class of diseases into more specific hypotheses. Refinement results in a disjunctive set of hypotheses. Hence, refinement and, as we have seen, group-and-differentiate are commonly followed by differentiation. Finally, the explore strategy is used when the patient description does not provide any well-defined diagnostic problem to solve. In such a situation we explore the findings systematically (e.g., review of systems) to uncover sufficient evidence to formulate a specific diagnostic problem.25
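The selection rules described in this section can be approximated by a small decision function over the number of hypotheses and their relative strength. This is a sketch, not the selection logic of ABEL or INTERNIST-I: the numeric belief factors, the `ratio` that stands in for "much more likely", and the `small` cutoff of three are all assumptions, and the refine strategy is omitted because it depends on whether a hypothesis names a general disease class.

```python
def choose_strategy(beliefs, ratio=2.0, small=3):
    """Pick a strategy from a dict mapping hypothesis name to belief factor."""
    if not beliefs:
        return "explore"                  # no well-defined problem yet
    ranked = sorted(beliefs.values(), reverse=True)
    if len(ranked) == 1 or ranked[0] >= ratio * ranked[1]:
        return "confirm"                  # one hypothesis, or one clearly leading
    if ranked[-2] >= ratio * ranked[-1]:
        return "rule-out"                 # weakest is far less likely than the rest
    if len(ranked) <= small:
        return "differentiate"            # two or three similar hypotheses
    return "group-and-differentiate"      # many similar hypotheses


examples = {
    "none": choose_strategy({}),
    "single": choose_strategy({"salmonellosis": 0.9}),
    "pair": choose_strategy({"salmonellosis": 0.5, "acute-renal-failure": 0.4}),
    "outlier": choose_strategy({"a": 0.5, "b": 0.45, "c": 0.1}),
    "many": choose_strategy({"a": 0.5, "b": 0.5, "c": 0.45, "d": 0.45, "e": 0.4}),
}
```

The order of the tests matters in this sketch: a clearly dominant hypothesis is confirmed before any attempt is made to rule out a weak one.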

 

(Diagram for Fig. 31: salmonellosis may-cause metabolic-acidosis-1; acute-renal-failure may-cause metabolic-acidosis-1.)

The use of these strategies in the first generation programs has been limited to a single application to identify the most useful finding. In this document we advocate viewing these strategies as decomposition operators that reformulate the diagnostic problem into a group of simpler problems. With this formulation we can repeatedly apply the diagnostic strategies to the top level diagnostic problem, successively decomposing it, until we reach subproblems that can be solved directly by asking single questions.

 

Fig. 31. Initial diagnostic closure for salmonellosis and acute renal failure

Consider the following simple example. Assume that we have a patient with moderately severe metabolic acidosis and are considering two possible causes of this metabolic acidosis, namely salmonellosis and acute renal failure.26 The diagnostic closure consistent with this situation is shown in figure 31. We pursue this diagnostic closure by setting up a diagnostic problem as shown below.

Goal 1: differentiate salmonellosis acute-renal-failure
              salmonellosis 
                  belief: likely
                  severity: moderate
                  duration: greater-than 2 days
              acute-renal-failure
                  belief: possible
                  severity: moderate
                  duration: greater-than 1 week 

To differentiate between salmonellosis and acute renal failure the program sets up a diagnostic closure for each of the possibilities (shown in figure 32). The first DC is constructed with the assumption that salmonellosis is the true cause of the observed metabolic acidosis, and the second with the assumption that acute renal failure is the true cause. The program then explores the consequences of its assumption in each case by projecting the disease hypotheses forward (shown in figure 33) and compares the two projections. From the projections it observes that salmonellosis and the acute renal failure predict different states of hydration for the patient. Based on this observation it formulates the next diagnostic problem shown below.

 

Fig. 32. Diagnostic closure separated for each possibility

 

Fig. 33. Diagnostic closures for each possibility projected forward

Goal 2: differentiate dehydration edema
              dehydration
                  caused-by: salmonellosis
                  belief: likely
                  severity: moderate
              edema
                  caused-by: acute-renal-failure
                  belief: possible
                  severity: moderate 

Let us assume that the state of hydration cannot be directly ascertained by inquiry and the program decides to decompose this goal into two subgoals, one each for confirming dehydration and edema.

Goal 3: confirm dehydration
              dehydration
                  caused-by: salmonellosis
                  belief: likely
                  severity: moderate 
Goal 4: confirm edema
              edema 
                  caused-by: acute-renal-failure
                  belief: possible
                  severity: moderate 

As dehydration is the more likely of the two (resulting from our initial assumption that salmonellosis is more likely than acute renal failure), the program chooses to pursue dehydration first. Since we have assumed that the state of hydration is unknown, the program must attempt to confirm it by gathering information like increased serum creatinine, hypotension, and poor tissue turgor. However, while formulating the goal for confirming serum creatinine, the program notices (using the second DC, figure 33) that the increased serum creatinine is also predicted by acute renal failure. The program incorporates this information in its goal structure. The subgoals formulated by the program in this situation are shown next.

 

Fig. 34. The goal tree

Goal 5: confirm serum-creatinine
              serum-creatinine
                  caused-by: dehydration
                  belief: likely
                  value: between 2 and 4
              serum-creatinine
                  caused-by: acute-renal-failure
                  belief: possible
                  value: between 3 and 7 
Goal 6: confirm mean-arterial-blood-pressure
              mean-arterial-blood-pressure
                  caused-by: dehydration
                  belief: likely
                  value: low 
              mean-arterial-blood-pressure
                  caused-by: acute-renal-failure
                  belief: possible
                  value: high 

The goal structure of the program when inquiring about the serum creatinine is shown in figure 34 (the bold lines indicate the flow of control). The goal structure encodes the program's rationale for asking the question: it explicitly encodes the program's reason for asking the question and the context in which the question is being asked. Therefore, if the user chooses to ask for an explanation at this point it is possible for the program to provide the following types of explanations. (The explanation provided here is a paraphrasing, in better English, of the program's actual explanation, which is produced by a very simple English generator [Swartout80].)

Explain: I am expecting the patient to have mild elevation in serum creatinine. Increase in serum creatinine may be caused by dehydration, which may be caused by salmonellosis. The salmonellosis may account for the observed metabolic acidosis. It is also the leading cause of metabolic acidosis under consideration. Increase in serum creatinine may also be caused by acute renal failure, which may cause metabolic acidosis.

Justify: I am exploring the cause of metabolic acidosis. I am differentiating between the two leading causes of metabolic acidosis, namely salmonellosis and acute renal failure. I am differentiating between dehydration and edema. The dehydration may be caused by salmonellosis and the edema by acute renal failure. I am pursuing dehydration. I am pursuing serum creatinine. Increase in serum creatinine may be caused by dehydration. Increase in serum creatinine may be caused by acute renal failure.

Viewing the individual diagnostic strategies as problem decomposition operators allows the program to set up the diagnostic goal structure described above. This goal structure not only allows the program to explain and justify its diagnostic behavior, but also provides a framework for evaluating the user response locally in the context of the expectations. It allows the program to react locally when a discrepancy is detected or when further exploration of the finding is needed, gracefully integrating the program's global disease-centered processing with the local symptom-centered processing.27
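To make the idea of strategies as decomposition operators concrete, the sketch below builds a goal tree for the example above and derives a justification by walking from the current goal back to the root. The `Goal` class, its fields, and the wording of the generated sentences are assumptions for illustration, as is the placement of Goals 5 and 6 under Goal 3; ABEL's actual goal structure also carries belief, severity and expected values, which are omitted here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Goal:
    strategy: str                       # e.g. "differentiate" or "confirm"
    targets: tuple                      # hypotheses or findings being pursued
    subgoals: list = field(default_factory=list)
    parent: Optional["Goal"] = field(default=None, repr=False, compare=False)

    def decompose(self, strategy, *targets):
        """Apply a strategy as a decomposition operator, creating a subgoal."""
        child = Goal(strategy, targets, parent=self)
        self.subgoals.append(child)
        return child

    def justify(self):
        """List the chain of goals from the top-level problem down to this one."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        return [f"I am pursuing: {g.strategy} {' vs. '.join(g.targets)}."
                for g in reversed(chain)]


goal1 = Goal("differentiate", ("salmonellosis", "acute-renal-failure"))
goal2 = goal1.decompose("differentiate", "dehydration", "edema")
goal3 = goal2.decompose("confirm", "dehydration")
goal4 = goal2.decompose("confirm", "edema")
goal5 = goal3.decompose("confirm", "serum-creatinine")
goal6 = goal3.decompose("confirm", "mean-arterial-blood-pressure")

justification = goal5.justify()
```

Because each goal keeps a link to its parent, the context of any question can be recovered without storing a separate explanation alongside it.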

Each top level diagnostic inquiry, described above, is followed by incorporation of all the information gathered into the existing PSMs (using the structure building operators described in chapter 4), and the formulation of a new diagnostic problem. This process is repeated until an adequate diagnosis of the patient's illness is achieved or until all the information relevant to the diagnosis is exhausted.

Summarizing, in this chapter we have introduced the notion of a diagnostic closure, which contains the hypothesized diseases, findings and causal relations relevant to the diagnostic task at hand. A diagnostic closure is created by projecting appropriate states in the PSM or hypothesized diseases forward to identify their predicted consequences and backwards to identify their possible causes. Once we have this knowledge for each diagnostic possibility, we have the dependencies necessary to do diagnostic planning.

Diagnostic problems are generated by identifying the places where two or more hypotheses differ from one another in the interpretation of the findings. The set of problems identified is used in formulating a top level diagnostic goal for one cycle of diagnostic problem solving. The problem solver then generates a tree-structured plan by successively decomposing this goal using strategies such as differentiate, confirm, group-and-differentiate, and rule-out. The diagnostic plan, in conjunction with the diagnostic closure, provides the context in which a question is asked, the program's reason for asking the question and its expectations about the possible responses to the question. This knowledge is used to guide the diagnostic inquiry as well as to provide explanation for the program's behavior.

Each cycle of diagnostic problem solving is viewed as an integral operation. During this cycle, the problem solver focuses on one top level diagnostic problem and attempts to solve it. This provides a focus for the interaction between the user physician and the program.

Finally, the information gathering process of each diagnostic cycle is followed by the revision of the structure of each PSM, making it consistent with the newly available information. Thus, at the end of each cycle of diagnostic inquiry, the PSMs are internally consistent, allowing the program to relinquish control to the superior management program (not implemented, see chapter 1) which could review the progress of diagnosis and possible therapies to decide between further diagnosis and immediate therapeutic intervention.



This section is part of

Patil, Ramesh S. Causal Representation of Patient Illness for Electrolyte and Acid-Base Diagnosis. MIT Lab for Computer Science Technical Report TR-267. October 1981. Also: Ph.D. Thesis, MIT Dept. of Electrical Engineering and Computer Science.

The document was reconstructed for the Web in April 2002 by Peter Szolovits.