%% Text written by Zak Kohane, modified by Jon Doyle, for AMRMC BAA

The implementation and deployment of Electronic Medical Record Systems (EMRS) have lagged significantly behind available technological solutions. One important reason has been the lack of standardization of both the structure and the content vocabulary of EMRS. We propose to address this lack of standardization by employing public-domain technologies to implement viewer/editors of multimedia compound documents and their transmission over the Internet without compromising confidentiality. To avoid unnecessary duplication of effort we will use open-architecture commercial tools, particularly to provide interfaces to heterogeneous databases. The overall objectives entail the following specific aims:

1. Develop a multimedia interface to the medical record that supports customization for specific clinical environments and needs. The HyperText Markup Language (HTML) will provide the foundation for this work.

2. Establish a canonical electronic medical record structure with supporting data abstraction processes to provide consistent views of medical information independent of underlying database structures. Emerging standards of medical record structures will be applied, and commercial products for database processing will be utilized to build the abstraction procedures.

3. Employ standardized medical vocabularies in the EMRS, and provide processes for translating or mapping local or non-standardized medical vocabularies. Using UMLS as a foundation, various methods of mapping local vocabularies to standardized medical vocabularies will be explored.

4. Provide communications support to allow access to the EMRS from multiple sites. Since support of several de facto Internet communications standards is integral to HTML, we plan to investigate the use of the Internet as the backbone communications structure for the EMRS. The application of encryption/security measures will be investigated.

Emerging Standards

Improvements in the management of health information anticipated with the growing use of computer information systems rely on the pervasive adoption of standard representations of patient-specific clinical data and general medical knowledge. The standardization of computerized medical record systems has been the primary focus of the Message Standard Developers Subcommittee (MSDS) of the Healthcare Informatics Standards Planning Panel (HISPP), whose efforts are coordinated at the national level by the American National Standards Institute (ANSI). International approaches to standards have been directed by the International Organization for Standardization (ISO), of which ANSI is a member. Broad requirements for standardization in medical information technology have been summarized in a white paper presented by the American Medical Informatics Association [4]. In addition, the final report of the ANSI HISPP MSDS: Common Data Types Subcommittee is awaited.

Common standards, which one leader in the field describes as "the lubricants of technical and economic progress," have arisen slowly from the competing challenges of promoting uniformity and universality of the whole while maintaining the diversity and granularity of the components. Accomplishments and work in progress in this area can be divided into standardization of data interchange or syntax, and standardization of terminology or concepts. Because the early efforts were aimed at business goals (provider-payor communication), progress in and implementation of data interchange standards have surpassed the development and use of standardized medical vocabularies in the documentation of medical information (provider-to-provider communication).

State of the Art EMRS

Comprehensive electronic medical record systems have been discussed for decades [5], but have been comprehensively implemented at relatively few sites (e.g. Regenstrief [6], HELP [7], TMR [8], Veterans Affairs [9], Columbia-Presbyterian [10]).
Where they have been broadly implemented, there is some evidence to suggest that these systems contribute to improved clinical care and reduced costs [11-13]. Subsequently there have been multiple efforts to develop workstation-based, graphics-intensive on-line patient charts (e.g. PWS at Hewlett Packard/Stanford [14], DeSyGNER [15], CWS at Children's Hospital [1,3]). Although some of these systems have been implemented at several sites, many of them work most productively and smoothly at the site at which they were developed. Typically, the task of "gluing" an externally developed EMRS architecture to even a rudimentary "legacy" clinical database is substantial. One important reason for this difficulty is the lack of standardization noted by the National Library of Medicine's RFA (LM-94-002). Several standardization efforts are currently under way to define specifications for the structure and content of EMRS (e.g. ASTM E1384-1991, ASTM E1238-1991, HL-7 [16], Medix from IEEE, and ASTM Draft E31.12-4 (Revised 7/93)). Nonetheless, widespread agreement on these standards or their implementation is lagging.

Structured Medical Vocabularies

The simplest method of storing patient information would be to allow users to enter the descriptions that they feel are most accurate. While this use of uncontrolled text simplifies the entry of information, it complicates retrieval. Algorithms have been developed that retrieve uncontrolled text with accuracy rivaling that of material indexed with a controlled vocabulary [17,18]. However, in the patient care domain, the highest accuracy possible is needed. For example, a gynecologist may feel he or she is being accurate by describing a patient as having a "cervical mass". A neurologist or orthopedist viewing the same record may draw a different and erroneous conclusion about the patient's problem. This kind of problem can only be avoided by controlling what terms are used in the content of the record.
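
The "cervical mass" ambiguity can be illustrated with a minimal sketch. The two records, the field names, and the codes (`LOCAL:UTERINE_CERVIX_MASS`, `LOCAL:NECK_MASS`) are hypothetical placeholders invented for illustration; they are not drawn from any real vocabulary or from the systems cited above.

```python
# Hypothetical records: the same free-text phrase entered by two specialists.
# The codes are invented placeholders, not terms from any real vocabulary.
records = [
    {"patient": "A", "text": "cervical mass", "code": "LOCAL:UTERINE_CERVIX_MASS"},
    {"patient": "B", "text": "cervical mass", "code": "LOCAL:NECK_MASS"},
]


def find_by_text(records, phrase):
    """Return patients whose free text contains the phrase (case-insensitive)."""
    phrase = phrase.lower()
    return [r["patient"] for r in records if phrase in r["text"].lower()]


def find_by_code(records, code):
    """Return patients whose entry carries exactly the given controlled code."""
    return [r["patient"] for r in records if r["code"] == code]


# Free-text retrieval cannot separate the two meanings of "cervical".
text_hits = find_by_text(records, "cervical mass")
# Retrieval by controlled term returns only the intended finding.
code_hits = find_by_code(records, "LOCAL:NECK_MASS")

print(text_hits)
print(code_hits)
```

In this toy setting the free-text search returns both patients, while the coded search returns only the patient with the neck finding. A real system would also need the entry interface to help the user pick the correct controlled term.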

Numerous controlled vocabularies have been developed in the health care domain (MeSH, ICD9-CM, SNOMED III, Read, Gabrielli, NANDA). Each of these vocabularies has been developed for a specific purpose, and none has been developed to encompass all the purposes of an EMRS. For example, MeSH is one of the broader and better structured of the controlled vocabularies mentioned. It was developed specifically for indexing medical literature. Because few papers focus on patient history and physical exam findings, MeSH is relatively weak in this area compared to the Read vocabulary. At the same time, the Read vocabulary, which was developed to describe physical exam findings, is relatively weak in the area of procedure terms. Combining the strengths of the existing vocabularies might seem like a solution, except that there are areas where the terminology in different vocabularies conflicts. For example, where 'Abortion' in MeSH is meant to mean spontaneous abortion, the same term in CPT is assumed to mean induced or surgical abortion. Similar examples can be found among terms for various substances that are found in patient blood samples. In one vocabulary they refer to the substance, while in another vocabulary they refer to the act of measuring the substance.

The NLM's Unified Medical Language System (UMLS) project is developing tools to address problems with linking multiple medical resources [19]. In the area of multiple medical vocabularies, the project has developed a "Metathesaurus" (Meta) [20]. Meta is not a controlled vocabulary. It is a database of information about existing controlled vocabularies. The current version includes many of the vocabularies mentioned above. For medical resources that use the included vocabularies, Meta can give an independent application enough information to navigate through the vocabularies of those medical resources.
In some instances, Meta can also allow a "translation" between the vocabularies of two applications. For medical resources that are not included in Meta, Meta includes a broad enough sample of terms in many domains so that a "good guess" can be made as to a term that will be included in the medical resource [21].

Another important tool developed by the UMLS project is the UMLS Semantic Network [20,22,23]. A generic semantic network is a knowledge structure that defines concepts primarily on the basis of their relations to each other. The UMLS Semantic Network is a small controlled vocabulary that provides information on how to categorize vocabulary terms and how those terms might be related to each other. As mentioned above, the context or purpose for which a particular term is being used can change our interpretation of the meaning of that term. By categorizing terms with "semantic types" we can reduce the ambiguity of a term. For example, Meta contains two terms "Abortion", one of which is typed as a "Disease" while the other is typed as a "Procedure". Using the Semantic Network and Meta together, a controlled vocabulary that is not included in Meta can be typed and linked to the existing Meta. These local extensions to Meta improve its ability to make "good guesses" about the term a user wishes to use to interact with local medical resources.
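
The role of semantic types in disambiguation, and the idea of linking a typed local term to the existing Meta, can be sketched as follows. This is only an illustration under stated assumptions: the `Concept` structure, the identifiers `X001` and `X002`, the local term "elective termination", and the functions `lookup` and `link_local_term` are all hypothetical and do not reflect the actual Metathesaurus format or any UMLS interface.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a Meta entry; not the real UMLS schema.
Concept = namedtuple("Concept", ["concept_id", "name", "semantic_type"])

# Toy table holding the two "Abortion" entries described in the text.
# The identifiers are invented for illustration.
META = [
    Concept("X001", "Abortion", "Disease"),
    Concept("X002", "Abortion", "Procedure"),
]


def lookup(term, semantic_type=None, concepts=META):
    """Find entries by name, optionally narrowed by semantic type."""
    matches = [c for c in concepts if c.name.lower() == term.lower()]
    if semantic_type is not None:
        matches = [c for c in matches if c.semantic_type == semantic_type]
    return matches


def link_local_term(local_term, semantic_type, synonym_of, concepts=META):
    """Return a copy of the table extended with a typed local term.

    The local term is attached to the single existing entry that has the
    given name and the same semantic type; otherwise the link is refused.
    """
    targets = lookup(synonym_of, semantic_type, concepts)
    if len(targets) != 1:
        raise ValueError("target concept is missing or still ambiguous")
    return concepts + [Concept(targets[0].concept_id, local_term, semantic_type)]


# Without a semantic type the term is ambiguous.
ambiguous = lookup("Abortion")
# The semantic type selects a single entry.
procedure_only = lookup("Abortion", "Procedure")
# A hypothetical local term is typed and linked to the existing entry.
extended = link_local_term("elective termination", "Procedure", "Abortion")
local_hit = lookup("elective termination", concepts=extended)

print(len(ambiguous), procedure_only[0].concept_id, local_hit[0].concept_id)
```

The sketch refuses to link a local term whenever the type does not narrow the target to exactly one entry, which is one simple way to keep a local extension from inheriting the ambiguity it is meant to remove.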