Evolution of my Medical Artificial Intelligence Research

My research interests broadly include much of biomedical informatics.  Throughout my career, I have avoided a tightly focused concentration on a single topic. Instead, I have tried to define my research interests by the demands of health care and how they could be satisfied by computing approaches.  My graduate work in the early 1970’s focused on development of specialized application languages to support the computing needs of various disciplines. When I joined the MIT faculty in 1974, I met a group of doctors from Tufts/New England Medical Center who were trying to elucidate the thought processes of expert doctors as they performed diagnostic and therapeutic reasoning, and to build advisory programs that could help all doctors work as well as the best.  I found that the challenges of this field provided an excellent focus for my interests in artificial intelligence, knowledge representation and reasoning. I therefore made the commitment to learn some medicine (I know enough jargon to be able to play a doctor on TV!) and to understand how AI techniques can help computers “think through” complex medical decision problems. My early doctoral students made excellent contributions to such AI challenges: automated generation of explanations, qualitative modeling of physiology, meta-level control of computation, expert systems based on pathophysiologic models at different levels of detail, reasoning about individual patient preferences, and temporal reasoning.  We contributed to the work that became identified as the “expert systems” boom of the 1980’s, enriching the applicable techniques.

In medical AI, my first effort in 1975 was to re-engineer a previously developed but very inefficient diagnosis program for renal (kidney) diseases in order to make experimentation with it feasible. With my colleague Steve Pauker, I also wrote what turned out to be a very influential survey paper on AI methods of reasoning in medical diagnosis. It explains the different reasoning methods adopted by some of the early medical AI programs and points out challenges in more clearly defining the semantics of the knowledge they purport to contain. At this time, I became convinced that medical practice innately relies on feedback from what has been done before, so programs that advise human practitioners must work similarly.  They must repeatedly and incrementally re-assess diagnostic and therapeutic plans as time passes, underlying pathologies evolve, therapies take effect, and new observations are acquired. My other insight from these early analyses was that very rarely does an interesting patient (one for whom help from a computer might be useful) suffer from a single, untreated disease.  Techniques that match symptoms to generic disease descriptions are rarely adequate, no matter whether they are described by rules, templates, or frames, because they fail to account for interactions.  A full pathophysiological theory could in principle represent all the interactions, say in the form of hundreds of differential equations and thousands of parameters. Alas, it cannot yield a useful clinical model because collecting the data needed to estimate all these parameters is infeasible.  We need instead a set of models that lie between simple symptom-disease associations and such differential equation models. Patil’s thesis on diagnosis of acid-base and electrolyte disorders pioneered such an approach in the early 1980’s, and Long’s clinical-level heart disease models extended it.

The growing availability of real data, which was foreseen in the 1980’s, began to materialize in the 1990’s, and threatens to inundate us in the 21st century, has made a huge change in my own work and that of my students. We (and our computers) can now learn even complex associations between observables and patient states from large corpora of labeled data. Therefore, much of our research today focuses on finding novel ways to characterize huge collections of data and to develop predictive algorithms that do a good job of anticipating changes in a patient’s state from his or her previous condition and the treatments being applied.  Thus far, we have been able to pick lots of low-hanging fruit by applying existing statistical and machine learning methods to the analysis of such data. However, I anticipate that new methods will also need to be developed in order to deal effectively with the great complexity of what happens to seriously ill patients.  For example, at present we tend to build predictive models using a set of features of the case. These normally include aspects of the patient’s medical history, their current problems, current and recent laboratory measurements, drugs and procedures. Such features are designed to summarize the timeline over which the actual data have evolved, by computing trend lines, averages and spreads over various time periods, etc.  These provide a useful, but hardly comprehensive, account of how the patient’s illness has changed over time or how it has responded to previous attempts at treatment. In the 1990’s we tried to develop Markov process and dynamic systems models of such phenomena, but the complexity of the models coupled with the computational difficulty of solving them limited their effectiveness to very small example problems.  I believe that it is time to try such sophisticated models again, so that they can exploit innately time-dependent phenomena such as pharmacokinetics. Also, because virtually every patient’s condition shows the effects of combining perhaps multiple disease processes and therapeutic interventions, partially observable Markov decision processes can provide a more nuanced interpretation of data than simpler feature-based predictive models.
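
To make concrete the kind of timeline summary features mentioned above (trend lines, averages and spreads over a time period), here is a minimal sketch in Python. It is only an illustration, not the code used in our projects; the function name window_features, the choice of a least-squares slope as the "trend line", and the sample values are all assumptions made for this example.

```python
from statistics import mean, pstdev

def window_features(times, values, start, end):
    """Summarize observations with start <= t <= end.

    Returns the average, the spread (population standard deviation),
    and the least-squares slope of value against time, or None if the
    window contains no observations. (Hypothetical helper.)
    """
    pts = [(t, v) for t, v in zip(times, values) if start <= t <= end]
    if not pts:
        return None
    ts = [t for t, _ in pts]
    vs = [v for _, v in pts]
    t_bar, v_bar = mean(ts), mean(vs)
    # Least-squares slope; defined as 0.0 when all times coincide.
    denom = sum((t - t_bar) ** 2 for t in ts)
    slope = sum((t - t_bar) * (v - v_bar) for t, v in pts) / denom if denom else 0.0
    return {"mean": v_bar, "spread": pstdev(vs), "slope": slope}

# Made-up example: a laboratory value sampled every 6 hours.
hours = [0, 6, 12, 18, 24]
lab = [1.0, 1.2, 1.4, 1.6, 1.8]
features_last_day = window_features(hours, lab, 0, 24)
features_last_half_day = window_features(hours, lab, 12, 24)
```

Computing the same three numbers over several windows yields a fixed-length feature vector, which is convenient for standard learning methods but, as noted above, discards much of the temporal structure of the case.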

In 1974, I had estimated that by the early 1980’s most large hospitals would have switched their practice to electronic collection, storage and retrieval of their medical records.  Because this has still not happened (as of 2010), I was clearly wrong.  I had based my guess on a simple extrapolation of the costs of keeping paper records vs. the costs of electronic storage.  I think those two cost curves did indeed cross in the early 1980’s, but I had not thought through the enormous costs (financial, institutional and human) of making such a switch. By the early 1990’s, I decided that I needed to develop a research focus on medical record systems, to better understand why it was so difficult to put them in place.  This led to three different lines of work in my group, each of which still continues.

First, we realized as soon as the World Wide Web was created that it formed the most appropriate technical basis for presenting and using medical information.  With our colleagues at Children’s Hospital Boston (CHB) and others in the Boston area, we formed a collaborative effort that demonstrated by 1994 the ability to view medical records from CHB anywhere on the internet (with suitable security) and by 1995 the ability to aggregate data from multiple institutions to present a longitudinal view of all data about a patient, even if collected at different hospitals and clinics.  We also implemented and published consensus methods for assuring the protection of the confidentiality of patients’ clinical records.

Second, also in 1994, we outlined a vision of life-long personal health care supported by a computer system (called Guardian Angel) that kept comprehensive records, educated patients about their health, provided them decision support, and served to connect them to providers and peer groups.  Although the vision is still far from being implemented, it did lead us to develop the earliest approaches to personally controlled health records, which have been influential on subsequent commercial developments such as Dossia, Google Health and Microsoft HealthVault. With the many anticipated changes in health care delivery and financing being debated in 2010, this vision is still relevant and cries out for continued research.

Third, through debates around 1993 about the propriety of adopting the social security number as a national health identifier, I felt challenged to study broader issues of patient privacy.  We proposed cryptographic identification schemes that permitted aggregation of clinical data about a patient only via the participation of the patient. We demonstrated the risks of naive de-identification methods, which leave in place enough unique data about individual patients to make them re-identifiable. We also demonstrated that properly pseudonymized data could still be used effectively to support secondary uses of those data without casually revealing the identities of the patients.  I also served on a National Research Council committee that reported on the poor state of protection of electronic health records in 1997 and inspired some provisions of the HIPAA privacy protections.  Later, I also served on an Institute of Medicine committee that helped to define the role of institutional review boards in protecting patient confidentiality in data studies. Much remains to be done, both technically and in policy, to protect patients and encourage data sharing.
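
As a rough illustration of what pseudonymization of a direct identifier can look like, here is a minimal sketch in Python. It is not the scheme we proposed or published; it simply replaces a record number with a keyed hash (HMAC-SHA256 from the standard library). The field name "mrn", the function names, and the sample records are assumptions made for this example.

```python
import hashlib
import hmac

def pseudonym(secret_key: bytes, patient_id: str) -> str:
    """Return a stable pseudonym for patient_id.

    The same key and identifier always give the same pseudonym, so
    records can still be linked; without the key, the pseudonym cannot
    be recomputed from the identifier. (Hypothetical helper.)
    """
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()

def pseudonymize(records, secret_key, id_field="mrn"):
    """Copy each record, replacing its identifier field with a pseudonym."""
    out = []
    for rec in records:
        clean = dict(rec)  # leave the caller's record untouched
        clean[id_field] = pseudonym(secret_key, str(rec[id_field]))
        out.append(clean)
    return out

# Made-up records: two visits by one patient and one by another.
key = b"example-key-held-by-a-trusted-party"
visits = [
    {"mrn": "12345", "dx": "asthma"},
    {"mrn": "12345", "dx": "otitis media"},
    {"mrn": "67890", "dx": "asthma"},
]
shared = pseudonymize(visits, key)
```

This handles only the direct identifier. As noted above, the remaining fields may still contain enough unique data to make a patient re-identifiable, so a keyed hash is at best one component of a de-identification process.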

In the past decade, I have focused on developing techniques to extract codified clinical data from narrative text and speech conversations.  This has proven important because of the Willie Sutton principle—that is where a lot of the data are.  Practitioners are able and willing to describe clinical phenomena with great sophistication in natural language, but not in formal representations.  It remains a challenge, however, to translate that narrative text to rich formal representations.  Even just identifying the various ways of expressing the same facts and translating these into a terminology such as SNOMED or ICD remains difficult.  Accounting for more subtle statements, such as accounts of how findings support or dispute diagnostic hypotheses, or contingent plans, seems beyond the state of the art.  My colleagues and I have worked on a highly flexible language processing framework to support research in this area. We have also extended the dictionary of a popular parser with medical terminology, built pattern matching methods to find descriptions of medications and dosages in text, identified signs, symptoms, diseases, treatments, tests and their results using both pattern-based and statistical methods, identified temporal and likelihood indicators about such facts, determined some relations among facts mentioned together, etc.  We have also developed two different approaches to de-identification of clinical data for research purposes, one based on dictionaries and patterns, the other on statistical machine learning techniques. These have been used to enable the re-use of large clinical datasets for research purposes in many projects.

The advent of the genomic revolution in the 1990’s promised to usher in an era of personalized medicine, where measurements of single nucleotide polymorphisms (or, eventually, sequencing of individual genomes), genomic measurements of transcription, proteomic measurements of gene products, etc., could combine with clinical facts to provide highly customized diagnostic tools and methods to choose and optimize therapy for individual patients. We have contributed to this effort in the Partners Healthcare-based i2b2 project both through our natural language work and through efforts to bring together modeling techniques from statistics with those from artificial intelligence.

Current Projects

As of late 2010, I am focused on four ongoing research projects, although my interests continue in essentially all of the topics mentioned in my historical overview.  These projects are described in more detail on my group’s home page, but I give brief summaries here.

Capturing Patient-Provider Encounter through Text, Speech and Dialogue Processing

Create a system that captures primary medical data mentioned during an encounter between a health care provider and a patient.  We use speech-to-text technology to create an approximate transcript of both sides of such a conversation, use natural language processing and machine learning methods to extract relevant clinical content from the transcripts, organize these according to medical conventions, and display the data to both provider and patient to allow them to correct mistakes made by this process.  We are applying this in the Pediatric Environmental Health Clinic at Children’s Hospital Boston.

Integrating Data, Models and Reasoning in Critical Care

Develop techniques to collect, interpret, analyze and disseminate multi-channel data from intensive care collated with clinical notes and other patient data.  The foci of our group’s efforts are to extract meaningful data from textual records and to build algorithms that make sense of the clinical condition of the patient.

i2b2: Informatics for Integrating Biology and the Bedside

Develop a scalable informatics framework that will bridge clinical research data and the vast data banks arising from basic science research in order to better understand the genetic bases of complex diseases. This knowledge will facilitate the design of targeted therapies for individual patients with diseases having genetic origins.

SHARP 4: Secondary Use of Clinical Data

As part of a national collaborative group headed by Mayo Clinic, we are building tools to make it possible to re-use clinical data for purposes other than the patient care for which they were collected.  Our efforts include natural language processing to identify salient facts and relationships in narrative textual data, defining classification models that can identify specific phenotypes from patient records, and defining ontologies to organize the relevant medical knowledge needed for these tasks.