Guide to Health Informatics 2nd Edition

 

Enrico Coiera

 


 

Chapter 25  - Clinical Decision Support Systems

 

From the very earliest moments in the modern history of the computer, scientists have dreamed of creating an “electronic brain”. Of all the modern technological quests, this search to create artificially intelligent (AI) computer systems has been one of the most ambitious and, not surprisingly, controversial.

It also seems that very early on, scientists and doctors alike were captivated by the potential such a technology might have in medicine (e.g. Ledley and Lusted, 1959). With intelligent computers able to store and process vast stores of knowledge, the hope was that they would become perfect ‘doctors in a box’, assisting or surpassing clinicians with tasks like diagnosis.

With such motivations, a small but talented community of computer scientists and healthcare professionals set about shaping a research program for a new discipline called Artificial Intelligence in Medicine (AIM). These researchers had a bold vision of the way AIM would revolutionise medicine, and push forward the frontiers of technology.

AI in medicine at that time was a largely US-based research community. Work originated out of a number of campuses, including MIT-Tufts, Pittsburgh, Stanford and Rutgers (e.g. Szolovits, 1982; Clancey and Shortliffe, 1984; Miller, 1988). The field attracted many of the best computer scientists and by any measure their output in the first decade of the field remains a remarkable achievement.

In reviewing this new field in 1984, Clancey and Shortliffe provided the following definition:

‘Medical artificial intelligence is primarily concerned with the construction of AI programs that perform diagnosis and make therapy recommendations. Unlike medical applications based on other programming methods, such as purely statistical and probabilistic methods, medical AI programs are based on symbolic models of disease entities and their relationship to patient factors and clinical manifestations.’

Much has changed since then, and today the importance of diagnosis as a task requiring computer support in routine clinical situations receives much less emphasis (Durinck et al., 1994). The strict focus on the medical setting has now broadened across the healthcare spectrum, and instead of AIM systems, it is more typical to describe them as clinical decision support systems (CDSS). Intelligent systems today are thus found supporting medication prescribing, in clinical laboratories and educational settings, for clinical surveillance, or in data-rich areas like the intensive care setting.

While there certainly have been ongoing challenges in developing such systems, they have proven their reliability and accuracy on repeated occasions (Shortliffe, 1987). Much of the difficulty experienced in introducing them has been associated with the poor way in which they have fitted into clinical practice, either solving problems that were not perceived to be an issue, or imposing changes in the way clinicians worked. What is now being realised is that when they fill an appropriate role, intelligent programmes do indeed offer significant benefits. One of the most important tasks now facing developers of AI-based systems is to characterise accurately those aspects of clinical practice that are best suited to the introduction of artificial intelligence systems.

In the remainder of this chapter, the initial focus will thus remain on the different roles CDSS can play in clinical practice, looking particularly to see where clear successes can be identified, as well as looking to the future. Much of the material presumes familiarity with Chapters two and eight. The next chapter will take a more technological focus, and look at the way CDSS are built. A variety of technologies including expert systems and neural networks will be discussed. The final chapters in this section look at several specialised topics where intelligent decision support is an essential component. We will look at the way CDSS can support the interpretation of patient signals that come off clinical monitoring devices, how it can assist in the surveillance for infectious diseases and public health challenges like bioterrorism, and how genome science is supported through bioinformatics.

 

25.1    AI can support both the creation and the use of clinical knowledge

 

Proponents of so-called ‘strong’ AI are interested in creating computer systems whose behaviour is at some level indistinguishable from humans (see Box 25.1). Success in strong AI would result in computer minds that might reside in autonomous physical beings like robots, or perhaps live in ‘virtual’ worlds like the information space created by something like the Internet.

An alternative approach to strong AI is to look at human cognition and decide how it can be supported in complex or difficult situations. For example, a fighter pilot may need the help of intelligent systems to assist in flying an aircraft that is too complex for a human to operate on their own. These ‘weak’ AI systems are not intended to have an independent existence, but are a form of ‘cognitive prosthesis’ that supports a human in a variety of tasks.

CDSS are by and large intended to support healthcare workers in the normal course of their duties, assisting with tasks that rely on the manipulation of data and knowledge. An AI system could be running within an electronic patient record system, for example, and alert a clinician when it detects a contraindication to a planned treatment. It could also alert the clinician when it detected patterns in clinical data that suggested significant changes in a patient’s condition.

Along with tasks that require reasoning with clinical knowledge, AI systems also have a very different role to play in the process of scientific research. In particular, AI systems have the capacity to learn, leading to the discovery of new phenomena and the creation of clinical knowledge. For example, a computer system can be used to analyse large amounts of data, looking for complex patterns within it that suggest previously unexpected associations. Equally, with enough of a model of existing knowledge, an AI system can be used to show how a new set of experimental observations conflict with the existing theories. We shall now examine such capabilities in more detail.

 

Box 25.1 - The Turing test

 

How will we know when a computer program has achieved an equivalent intelligence to a human? Is there some set of objective measures that can be assembled against which a computer program can be tested? Alan Turing was one of the founders of modern computer science and AI, whose intellectual achievements to this day remain astonishing in their breadth and importance. When he came to ponder this question, he brilliantly side-stepped the problem almost entirely.

In his opinion, there were no ultimately useful measures of intelligence. It was sufficient that an objective observer could not tell the difference in conversation between a human and a computer for us to conclude that the computer was intelligent. To cancel out any potential observer biases, Turing’s test put the observer in a room, equipped with a computer keyboard and screen, and made the observer talk to the test subjects only using these. The observer would engage in a discussion with the test subjects using the printed word, much as one would today by exchanging e-mail with a remote colleague. If a set of observers could not distinguish the computer from another human in over 50% of cases, then Turing felt that one had to accept that the computer was intelligent.

Another consequence of the Turing test is that it says nothing about how one builds an intelligent artefact, thus neatly avoiding discussions about whether the artefact needed in any way to mimic the structure of the human brain or our cognitive processes. In Turing’s mind, it really didn’t matter how the system was built. Its intelligence should only be assessed based upon its overt behaviour.

There have been attempts to build systems that can pass Turing’s test in recent years. Some have managed to convince at least some humans in a panel of judges that they too are human, but none have yet passed the mark set by Turing.

 

25.2    Reasoning with clinical knowledge

 

Knowledge-based systems are the commonest type of CDSS technology in routine clinical use. Also known as expert systems, they contain clinical knowledge, usually about a very specifically defined task, and are able to reason with data from individual patients to come up with reasoned conclusions. Although there are many variations, the knowledge within an expert system is typically represented in the form of a set of rules.
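To make the idea of rule-based reasoning concrete, the following is a minimal sketch of a forward-chaining rule interpreter. It is not taken from any of the systems discussed in this chapter. The rule names, the patient data fields and the numeric thresholds are all hypothetical and chosen only for illustration; they are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Rule:
    """An IF-THEN rule: if condition(data, conclusions) holds, assert conclusion."""
    name: str
    condition: Callable[[Dict[str, float], Set[str]], bool]
    conclusion: str


# Hypothetical rules and thresholds, for illustration only.
RULES: List[Rule] = [
    Rule("low_spo2", lambda d, c: d.get("spo2", 100) < 90, "hypoxaemia"),
    Rule("fast_hr", lambda d, c: d.get("heart_rate", 0) > 100, "tachycardia"),
    # This rule fires on earlier conclusions rather than on raw data.
    Rule("escalate", lambda d, c: {"hypoxaemia", "tachycardia"} <= c, "urgent_review"),
]


def infer(data: Dict[str, float], rules: List[Rule] = RULES) -> Set[str]:
    """Apply the rules repeatedly until no new conclusion can be added."""
    conclusions: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion not in conclusions and rule.condition(data, conclusions):
                conclusions.add(rule.conclusion)
                changed = True
    return conclusions


if __name__ == "__main__":
    print(sorted(infer({"spo2": 86, "heart_rate": 120})))
```

The point of the sketch is the separation between the knowledge (the list of rules) and the reasoning procedure (the loop in infer), which is what allows the knowledge base of such a system to be maintained without rewriting the program.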

There are many different types of clinical task to which expert systems can be applied.

Alerts and reminders. In real-time situations, an expert system attached to a patient monitoring device like an ECG or pulse oximeter can warn of changes in a patient’s condition. In less acute circumstances, it might scan laboratory test results, drug or test orders, or the EMR and then send reminders or warnings, either via immediate on-screen feedback or through a messaging system like e-mail. Reminder systems are used to notify clinicians of important tasks that need to be done before an event occurs. For example, an outpatient clinic reminder system may generate a list of immunizations that each patient on the daily schedule requires (Randolph et al., 1999).

Diagnostic assistance. When a patient’s case is complex or rare, or the person making the diagnosis is simply inexperienced, an expert system can help in the formulation of likely diagnoses based on the patient data presented to it and the system’s understanding of illness, stored in its knowledge base. Diagnostic assistance is often needed with complex data, such as the ECG, where most clinicians can make straightforward diagnoses, but may miss rare presentations of common illnesses like myocardial infarction, or may struggle with formulating diagnoses, which typically require specialised expertise.

Therapy critiquing and planning. Critiquing systems can look for inconsistencies, errors and omissions in an existing treatment plan, but do not assist in the generation of the plan. Critiquing systems can be applied to physician order entry. For example, on entering an order for a blood transfusion a clinician may receive a message stating that the patient's haemoglobin level is above the transfusion threshold, and the clinician must justify the order by stating an indication, such as active bleeding (Randolph et al., 1999). Planning systems, on the other hand, have more knowledge about the structure of treatment protocols and can be used to formulate a treatment based upon data on a patient’s specific condition from the EMR and accepted treatment guidelines.

Prescribing decision support systems. One of the commonest clinical tasks is the prescription of medications, and PDSS can assist by checking for drug-drug interactions, dosage errors, and if connected to an EMR, for other prescribing contraindications such as allergy. PDSS are usually well received because they support a pre-existing routine task, and as well as improving the quality of the clinical decision, usually offer other benefits like automated script generation and sometimes electronic transmission of the script to a pharmacy.

Information retrieval. Finding evidence in support of clinical cases is still difficult on the Web. Intelligent information retrieval systems can assist in formulating appropriately specific and accurate clinical questions; they can act as information filters, reducing the number of documents found in response to a query to a Web search engine; and they can assist in identifying the sources of evidence most appropriate to a clinical question. More complex software ‘agents’ can be sent to search for and retrieve information to answer clinical questions, for example on the Internet. The agent may contain knowledge about its user’s preferences and needs, and may also have some clinical knowledge to assist it in assessing the importance and utility of what it finds.

Image recognition and interpretation. Many clinical images can now be automatically interpreted, from plain X-rays through to more complex images like angiograms, CT and MRI scans. This is of value in mass-screenings, for example, when the system can flag potentially abnormal images for detailed human attention.
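Two of the task types listed above, therapy critiquing and prescribing decision support, can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the Patient record, the function names, the haemoglobin threshold and the single-entry interaction table are all hypothetical, and a real system would draw its knowledge from a maintained clinical knowledge base and an EMR.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Set

# Hypothetical threshold, for illustration only (not clinical guidance).
TRANSFUSION_HB_THRESHOLD_G_DL = 7.0

# A toy interaction table keyed by unordered drug pairs.
INTERACTIONS: Dict[FrozenSet[str], str] = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
}


@dataclass
class Patient:
    haemoglobin_g_dl: float
    allergies: Set[str] = field(default_factory=set)
    current_drugs: Set[str] = field(default_factory=set)


def critique_transfusion(patient: Patient, indication: Optional[str] = None) -> List[str]:
    """Critique an existing order: ask for a justification, do not plan the therapy."""
    if patient.haemoglobin_g_dl > TRANSFUSION_HB_THRESHOLD_G_DL and not indication:
        return ["Haemoglobin is above the transfusion threshold; please state an indication."]
    return []


def check_prescription(patient: Patient, drug: str) -> List[str]:
    """Check a new prescription against recorded allergies and current medication."""
    warnings: List[str] = []
    if drug in patient.allergies:
        warnings.append(f"Recorded allergy to {drug}.")
    for other in sorted(patient.current_drugs):
        note = INTERACTIONS.get(frozenset({drug, other}))
        if note:
            warnings.append(f"Interaction between {drug} and {other}: {note}.")
    return warnings


if __name__ == "__main__":
    p = Patient(haemoglobin_g_dl=9.5, allergies={"penicillin"}, current_drugs={"warfarin"})
    print(critique_transfusion(p))
    print(critique_transfusion(p, indication="active bleeding"))
    print(check_prescription(p, "aspirin"))
```

Note that the critiquing function only comments on an order that the clinician has already made, whereas the prescribing check depends on patient data (allergies, current drugs) that in practice would have to come from an electronic record.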

There are numerous reasons why more CDSS are not in routine use (Coiera, 1994). Some require the existence of an electronic patient record system to supply their data, and most institutions and practices do not yet have all their working data available electronically. Others suffer from poor human interface design and so do not get used even if they are of benefit.

Much of the initial reluctance to use CDSS simply arose because they did not fit naturally into the process of care, and as a result using them required additional effort from already busy individuals. It may also be true, but it is perhaps dangerous, to ascribe some of the reluctance to use early systems to the technophobia or computer illiteracy of healthcare workers. If a system is perceived by those using it to be beneficial, then it will be used. If not, independent of its true value, it will probably be rejected.

Happily, there are today very many systems that have made it into clinical use (Table 25.1). Many of these are small, but nevertheless make positive contributions to care. Others, like prescribing decision support systems, are in widespread use and for many clinicians form a routine part of their everyday practice.

 

Diagnostic and educational systems

 

In the first decade of AIM, most research systems were developed to assist clinicians in the process of diagnosis, typically with the intention that it would be used during a clinical encounter with a patient. Most of these early systems did not develop further than the research laboratory, partly because they did not gain sufficient support from clinicians to permit their routine introduction.

DXplain is an example of one of these clinical decision support systems, developed at the Massachusetts General Hospital (Barnett et al., 1987). It is used to assist in the process of diagnosis, taking a set of clinical findings including signs, symptoms and laboratory data, and then producing a ranked list of diagnoses. It provides justification for each differential diagnosis, and suggests further investigations. The system contains a database of crude probabilities for over 4,500 clinical manifestations that are associated with over 2,000 different diseases.
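The general idea of turning a set of findings into a ranked list of diagnoses, each with a justification and suggested further questions, can be sketched as follows. This is emphatically not DXplain's actual algorithm or data: the scoring rule (a simple sum of association weights), the function name and the tiny knowledge base with its invented numbers are assumptions made purely for illustration.

```python
from typing import Dict, List, Set

# Invented finding-to-disease association weights, for illustration only.
KNOWLEDGE_BASE: Dict[str, Dict[str, float]] = {
    "pneumonia": {"fever": 0.8, "cough": 0.9, "chest pain": 0.4},
    "pulmonary embolism": {"chest pain": 0.7, "breathlessness": 0.8, "cough": 0.2},
    "influenza": {"fever": 0.9, "cough": 0.6, "myalgia": 0.7},
}


def rank_diagnoses(findings: Set[str],
                   kb: Dict[str, Dict[str, float]] = KNOWLEDGE_BASE) -> List[dict]:
    """Score each disease by the summed weights of the findings that support it."""
    results = []
    for disease, profile in kb.items():
        support = {f: w for f, w in profile.items() if f in findings}
        if not support:
            continue  # no reported finding points to this disease
        results.append({
            "disease": disease,
            "score": round(sum(support.values()), 2),
            # Justification: the reported findings that contributed to the score.
            "justification": sorted(support),
            # Further investigation: associated findings not yet reported.
            "ask_about": sorted(set(profile) - findings),
        })
    return sorted(results, key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    for row in rank_diagnoses({"fever", "cough"}):
        print(row)
```

Even a crude scheme like this shows why such systems can explain themselves: the evidence used to compute each score is available to be shown to the clinician alongside the ranking.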

 

Table 25.1: A wide variety of expert systems have been placed into routine clinical use. These systems are typical examples.

SYSTEM                                              DESCRIPTION

ACUTE CARE SYSTEMS
(Dugas et al., 2002)                                Decision support in hepatic surgery
POEMS (Sawar et al., 1992)                          Post-operative care decision support
VIE-PNN (Miksch et al., 1993)                       Parenteral nutrition planning for neonatal ICU
NéoGanesh (Dojat et al., 1996)                      ICU ventilator management
SETH (Darmoni, 1993)                                Clinical toxicology advisor

LABORATORY SYSTEMS
GERMWATCHER (Kahn et al., 1993)                     Analysis of nosocomial infections
HEPAXPERT I, II (Adlassnig et al., 1991)            Interprets tests for hepatitis A and B
Acid-base expert system (Pince et al., 1990)        Interpretation of acid-base disorders
MICROBIOLOGY/PHARMACY (Morrell et al., 1993)        Monitors renal active antibiotic dosing
PEIRS (Edwards et al., 1993)                        Chemical pathology expert system
PUFF (Snow et al., 1988)                            Interprets pulmonary function tests
Pro.M.D.- CSF Diagnostics (Trendelenburg, 1994)     Interpretation of CSF findings

EDUCATIONAL SYSTEMS
DXPLAIN (Barnett et al., 1987)                      Internal medicine expert system
ILLIAD (Warner et al., 1988)                        Internal medicine expert system
HELP (Kuperman et al., 1991)                        Knowledge-based hospital information system

QUALITY ASSURANCE AND ADMINISTRATION
COLORADO MEDICAID UTILIZATION REVIEW SYSTEM         Quality review of drug prescribing practices
MANAGED SECOND SURGICAL OPINION SYSTEM              Aetna Life and Casualty assessor system

MEDICAL IMAGING
PERFEX (Ezquerra et al., 1992)                      Interprets cardiac SPECT data
(Lindahl et al., 1999)                              Classification of scintigrams

 

DXplain is in routine use at a number of hospitals and medical schools, mostly for clinical education purposes, but is also available for clinical consultation. It also has a role as an electronic medical textbook. It can provide a description of over 2,000 different diseases, emphasising the signs and symptoms that occur in each disease, along with recent references for each specific disease.

Decision support systems need not be ‘stand alone’ but can be deeply integrated into an electronic patient record system. Indeed, such integration reduces the barriers to using such a system, by weaving it more closely into clinical working processes, rather than expecting workers to create new processes to use it.

The HELP system is an example of this type of knowledge-based hospital information system, which began operation in 1980 (Kuperman et al., 1990; Kuperman et al., 1991). It not only supports the routine applications of a hospital information system (HIS) including management of admissions and discharges and order entry, but also provides a decision support function. The decision support system has been actively incorporated into the functions of the routine HIS applications. Decision support provides clinicians with alerts and reminders, data interpretation and patient diagnosis facilities, patient management suggestions and clinical protocols. Activation of the decision support is provided within the applications but can also be triggered automatically as clinical data is entered into the patient's computerised record.

 

Expert laboratory information systems

 

One of the most successful areas in which expert systems are applied is in the clinical laboratory. Practitioners may be unaware that while a pathologist has checked the printed report they receive from a laboratory, the whole report may have been generated by a computer system that has automatically interpreted the test results. Examples of such systems include the following.

·     The PUFF system for automatic interpretation of pulmonary function tests has been sold in its commercial form to hundreds of sites world-wide (Snow et al., 1988). PUFF went into production at Pacific Presbyterian Medical Centre in San Francisco in 1977, making it one of the very earliest medical expert systems in use. Many thousands of cases later, it is still in routine use.

·     A more general example of this type of system is PEIRS (Pathology Expert Interpretative Reporting System) (Edwards et al., 1993). During its period of operation, PEIRS interpreted about 80-100 reports a day with a diagnostic accuracy of about 95%. It accounted for about 20% of all the reports generated by the hospital’s Chemical Pathology Department. PEIRS reported on thyroid function tests, arterial blood gases, urine and plasma catecholamines, hCG (human chorionic gonadotrophin) and AFP (alpha fetoprotein), glucose tolerance tests, cortisol, gastrin, cholinesterase phenotypes and parathyroid hormone related peptide (PTH-RP).

Laboratory expert systems usually do not intrude into clinical practice. Rather, they are embedded within the process of care, and with the exception of laboratory staff, clinicians working with patients do not need to interact with them. For the ordering clinician, the system prints a report with a diagnostic hypothesis for consideration, but does not remove responsibility for information gathering, examination, assessment and treatment. For the pathologist, the system cuts down the workload of generating reports, without removing the need to check and correct reports.

 

25.3    Machine learning systems can create new clinical knowledge

 

Learning is seen to be the quintessential characteristic of an intelligent being. Consequently, one of the driving ambitions of AI has been to develop computers that can learn from experience. The resulting developments in the AI sub-field of machine learning have resulted in a set of techniques that have the potential to alter the way in which knowledge is created.

All scientists are familiar with the statistical approach to data analysis. Given a particular hypothesis, statistical tests are applied to data to see if any relationships can be found between different parameters. Machine learning systems can go much further. They look at raw data and then attempt to hypothesise relationships within the data, and newer learning systems are able to produce quite complex characterisations of those relationships. In other words they attempt to discover humanly understandable concepts.

Learning techniques include neural networks, but encompass a large variety of other methods as well, each with its own characteristic benefits and difficulties. For example, some systems are able to learn decision trees from examples taken from data (Quinlan, 1986). These trees look much like the decision trees discussed in Chapter eight, and can be used to help in diagnosis.
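The core step in learning a decision tree from examples is choosing which attribute to split on. A common criterion, used in the family of algorithms described by Quinlan, is information gain: the reduction in the entropy of the outcome labels once the cases are partitioned by an attribute. The sketch below shows only that step. The function names and the four toy cases are invented for illustration and carry no clinical meaning.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(cases: List[Dict], attribute: str, target: str) -> float:
    """Entropy of the target before the split minus the weighted entropy after it."""
    base = entropy([c[target] for c in cases])
    remainder = 0.0
    for value in {c[attribute] for c in cases}:
        subset = [c[target] for c in cases if c[attribute] == value]
        remainder += len(subset) / len(cases) * entropy(subset)
    return base - remainder


def best_split(cases: List[Dict], attributes: List[str], target: str) -> str:
    """Pick the attribute with the highest information gain as the next tree node."""
    return max(attributes, key=lambda a: information_gain(cases, a, target))


# Invented toy cases: attribute_a separates the outcome perfectly, attribute_b does not.
CASES = [
    {"attribute_a": True, "attribute_b": True, "outcome": "disease"},
    {"attribute_a": True, "attribute_b": False, "outcome": "disease"},
    {"attribute_a": False, "attribute_b": True, "outcome": "healthy"},
    {"attribute_a": False, "attribute_b": False, "outcome": "healthy"},
]

if __name__ == "__main__":
    print(best_split(CASES, ["attribute_a", "attribute_b"], "outcome"))
```

A full tree learner would apply best_split recursively to each resulting subset of cases until the subsets are pure or no attributes remain.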

Healthcare has formed a rich test-bed for machine learning experiments in the past, allowing scientists to develop complex and powerful learning systems. While there has been much practical use of expert systems in routine clinical settings, at present machine learning systems still seem to be used in a more experimental way. There are, however, many situations in which they can make a significant contribution.

·     Machine learning systems can be used to develop the knowledge bases used by expert systems. Given a set of clinical cases that act as examples, a machine learning system can produce a systematic description of those clinical features that uniquely characterise the clinical conditions. This knowledge can be expressed in the form of simple rules, or often as a decision tree. A classic example of this type of system is KARDIO, which was developed to interpret ECGs (Bratko et al., 1989).

·     This approach can be extended to explore poorly understood areas of healthcare, and people now talk of the process of ‘data mining’ and of ‘knowledge discovery’ systems. It is possible, using patient data, to automatically construct pathophysiological models that describe the functional relationships between the various measurements. For example, Hau and Coiera (1997) describe a learning system that takes real-time patient data obtained during cardiac bypass surgery, and then creates models of normal and abnormal cardiac physiology. These models might be used to look for changes in a patient’s condition if used at the time they are created. Alternatively, if used in a research setting, these models can serve as initial hypotheses that can drive further experimentation.

·     One particularly exciting development has been the use of learning systems to discover new drugs. The learning system is given examples of one or more drugs that weakly exhibit a particular activity, and based upon a description of the chemical structure of those compounds, the learning system suggests which of the chemical attributes are necessary for that pharmacological activity. Based upon the new characterisation of chemical structure produced by the learning system, drug designers can try to design a new compound that has those characteristics. Currently, drug designers synthesise a number of analogues of the drug they wish to improve upon, and experiment with these to determine which exhibits the desired activity. By boot-strapping the process using the machine learning approach, the development of new drugs can be speeded up, and the costs significantly reduced. At present statistical analyses of activity are used to assist with analogue development, and machine learning techniques have been shown to at least equal if not outperform them, as well as having the benefit of generating knowledge in a form that is more easily understood by chemists (King et al., 1992). Since such learning experiments are still in their infancy, significant developments can be expected here in the next few years.

·      Machine learning has a potential role to play in the development of clinical guidelines. It is often the case that there are several alternate treatments for a given condition, with slightly different outcomes. It may not be clear however, what features of one particular treatment method are responsible for the better results. If databases are kept of the outcomes of competing treatments, then machine learning systems can be used to identify features that are responsible for different outcomes.

 

25.4   Clinical Decision Support Systems have repeatedly demonstrated their worth when evaluated

 

Many potential benefits from CDSS have been widely reported in the literature (Johnson & Feldman, 1995; Evans, 1996). The claims made fall into 3 broad categories (Sintchenko et al., 2002):

1.     Improved patient safety e.g. through reduced medication errors and adverse events and improved medication and test ordering;

2.     Improved quality of care e.g. by increasing clinicians’ available time for direct patient care, increased application of clinical pathways and guidelines, facilitating the use of up-to-date clinical evidence, improved clinical documentation and patient satisfaction; and

3.     Improved efficiency in health care delivery e.g. by reducing costs through faster order processing, reductions in test duplication, decreased adverse events, and changed patterns of drug prescribing favouring cheaper but equally effective generic brands.

Evaluations of CDSS are often poorly conceptualised and implemented (Cushman, 1997; Heathfield et al., 1998). In a systematic review of 55 CDSS evaluations, Sintchenko et al. (2003) found that less than a quarter involved a randomised controlled trial (Table 25.2).

Table 25.2: Evaluation methodologies used in CDSS evaluation studies (N=55) (Sintchenko et al., 2003).

Evaluation Methodology      %

Before/after sample         27.27%
RCT                         23.64%
Case-control                21.82%
Case study                  16.36%
Qualitative                 5.45%
Not done                    3.64%
Longitudinal study          1.82%

 

 

Table 25.3: Limitations of evaluation components of CDSS studies (Sintchenko et al., 2003).

·        A focus on post-system implementation evaluation of users’ perceptions of systems.

·        A reliance upon retrospective designs which are limited in their ability to determine the extent to which improvements in outcome and process indicators may be causally linked to the CDSS.

·        Rare adoption of a comprehensive approach to evaluation where a multi-method design is used to capture the impact of CDSS on multiple dimensions.

·        Concentration on assessment of technical and functionality issues, which are estimated to explain less than 20% of IT failures. Such evaluations have also failed to determine why useful and useable systems are often unsuccessful.

·        Expectations that improvements will be immediate. In the short term there is likely to be a decrease in productivity. Implementing information systems takes time and measuring their impact is complex, so a long-term evaluation strategy is required but rarely implemented.

·        Almost none use a naturalistic design in routine clinical settings with real patients, and most studies involve doctors and exclude other clinical or managerial staff.

 

Evaluation of CDSS is complex, and there are many challenges in appropriately structuring such studies (Randolph et al., 1999). Consequently many studies fall into traps such as overemphasising user satisfaction as a measure of system success. Some of the most frequent limitations of CDSS studies are listed in Table 25.3. While CDSS are often justified on the basis of clinical benefit, evaluation often focuses on technical issues or on clinical processes.  Measurement of clinical outcomes is still sadly rare amongst evaluation studies, and most studies that do attempt to measure clinical impact do so through process variables (Table 25.4).

Nevertheless, the growing pool of evidence on the impact of CDSS in delivering improvements in the quality, safety and efficiency of health care is promising, mainly in relation to alerts and reminders, and PDSS. The following sections demonstrate not only the value of decision support systems in clinical practice, but also the complexity of the evaluation task, the ongoing gaps in our knowledge about their effectiveness, and the richness and variety of form of decision support.

Table 25.4: Impact measures chosen in CDSS evaluation studies (N=55) (Sintchenko et al., 2003).

 

 

                            Impact measured:            Impact measured:          Impact not measured
                            improvement demonstrated    no significant impact     (% of studies)
                            (no. of studies)            (no. of studies)

Process variables
  Confidence in decision    12                          3                         40 (73%)
  Patterns of care          15                          4                         36 (66%)
  Adherence to protocol     10                          4                         41 (75%)
  Efficiency/Cost           10                          2                         43 (78%)
  Adverse effects           12                          3                         40 (73%)