Guide to Health Informatics 2nd Edition

 

Enrico Coiera

 


 

Chapter 17 - Healthcare terminologies and classification systems

 

The terms disease and remedy were formerly understood and therefore defined quite differently to what they are now; so, likewise, are the meanings and definitions of inflammation, pneumonia, typhus, gout, lithiasis, &c., different from those which were attached to them thirty years ago…It is evident ... that great mischief will in most cases ensue if, in such attempts at definition and explanation, greater importance is attached to a clear and determinate, than to a complete and comprehensive understanding of the objects and questions before us. In a field like ours, clearness can in general be purchased only at the expense of completeness and therefore truth.

 

Oesterlen, Medical Logic, (1855)

 

Coding and classification systems have a long history in medicine. Current systems can trace their origins back to epidemiological lists of the causes of death from the early part of the eighteenth century. François Bossier de Lacroix (1706-1777) is commonly credited with the first attempt to classify diseases systematically (ICD-10, 1993). Better known as Sauvages, he published the work under the title Nosologia Methodica.

Linnaeus (1707-1778), a contemporary of Sauvages, also published his Genera Morborum in that period. By the beginning of the nineteenth century, the Synopsis Nosologiae Methodicae, published in 1785 by William Cullen of Edinburgh (1710-1790), was the classification in most common use.

John Graunt, working about a hundred years earlier, is credited with the first practical attempts to classify disease for statistical purposes. Working on his London Bills of Mortality, he was able to estimate the proportion of deaths in different age groups. For example, he estimated a 36% mortality for liveborn children before the age of 6. He did this by taking all the deaths classified as convulsions, rickets, teeth and worms, thrush, abortives, chrysomes, infants, and livergrown. To these he added half of the deaths classed as smallpox, swinepox, measles, and worms without convulsions. By all accounts his estimate was a good one (ICD-10, 1993).

Only in the last few decades have these terminological systems started to attract widespread attention and resources. The ever-growing need to amass and analyse clinical data, no longer just for epidemiological purposes, has provided considerable incentive and resources for their development. Further, with the development of computer technology, there has been a belief that such widespread collection and analysis of data are now possible. In parallel, the requirement for clinicians to participate in that data collection has meant that they have had more opportunity to work with terminologies, and begin to understand their benefits and limitations.

In the previous chapter, the basic concepts of term, code, and classification were introduced. In this chapter, several of the major coding and classification systems in routine use in healthcare will be introduced, and their features compared. Some specific limitations of each system will be highlighted. In reality there are a large number of such systems in development and use, and they cannot all be identified here. The systems discussed are however representative of most systems in common use, and can serve as an introduction to them. Throughout, a historical perspective will be retained, since in this case the lessons of the past have deep implications for the present. The more general limitations of all terminological systems will be addressed in the following chapter.

 

17.1    The International Classification of Diseases

 

Purpose. The International Classification of Diseases (ICD) is published by the World Health Organisation (WHO). Currently in its tenth revision (ICD-10), its goal is to allow morbidity and mortality data from different countries around the world to be systematically collected and statistically analysed. It is not intended, nor is it suitable, for indexing distinct clinical entities (Gersenovic, 1995). The International Nomenclature of Diseases (IND) provides the set of recommended terms and synonyms that correspond to the entries classified in the ICD codes.

History. The ICD can trace its ancestry to the early days of healthcare terminologies. William Farr (1807-1883) became the first medical statistician for the General Register Office of England and Wales. Upon taking office, he found the Cullen classification in use, but that it had not been updated in accordance with medical advances, nor did it seem suitable for statistical purposes. In his first Annual Report of the Registrar General, he noted:

‘The advantages of a uniform statistical nomenclature, however imperfect, are so obvious, that it is surprising that no attention has been paid to its enforcement in Bills of Mortality. Each disease has, in many instances, been denoted by three or four terms, and each term has been applied to as many different diseases: vague, inconvenient names have been employed, or complications have been registered instead of primary diseases. The nomenclature is of as much importance in this department of enquiry as weights and measures in the physical sciences, and should be settled without delay.’ (ICD-10, 1993)

Farr toiled hard at improving the classification, and in 1855 the International Statistical Congress adopted a classification based on the work of Farr and Marc d’Espine of Geneva. Subsequently steered by Jacques Bertillon, this developed into the International List of Causes of Death. This was adopted in 1893, and continued to develop through the turn of the century and beyond, and ultimately evolved into the current ICD system.

In particular, the system was expanded to include not just causes of death, but diseases resulting in measurable morbidity. This expansion started with the urging of Farr. It was supported by Florence Nightingale, who in 1860 urged the adoption of Farr’s disease classification for the tabulation of hospital morbidity in her paper Proposals for a uniform plan of hospital statistics. In 1900 at the First International Conference to revise the Bertillon Classification, a parallel classification of diseases for use in statistics of sickness was finally adopted.

Level of acceptance and use. The ICD today is used internationally by WHO for comparison of statistical returns. It is also adopted by many individual countries in the preparation of their statistical returns. Most other major classification systems endeavour to make their systems compatible with ICD, so that data coded in these systems can be mapped directly to ICD codes. ICD thus acts as a de facto reference point for many healthcare terminologies.

Classification structure. The ICD-10 is a multiple-axis classification system. At its core, the basic ICD is a single list of three alphanumeric character codes. These are organised by category, from A00 to Z99 (excluding U codes which are reserved for research, and for the provisional assignment of new diseases of uncertain aetiology). This level of detail is the mandatory level for reporting to the WHO mortality database and for general international comparisons.

The classification is structured into 21 chapters, and the first character of the ICD code is a letter associated with a particular chapter (Table 17.1).

 

Table 17.1: The ICD-10 chapter headings (adapted from ICD-10, 1993).

Chapter I       Infectious and parasitic diseases
Chapter II      Neoplasms
Chapter III     Diseases of the blood and blood forming organs and certain disorders affecting the immune mechanism
Chapter IV      Endocrine, nutritional and metabolic diseases
Chapter V       Mental and behavioural disorders
Chapter VI      Diseases of the nervous system
Chapter VII     Diseases of the eye and adnexa
Chapter VIII    Diseases of the ear and mastoid process
Chapter IX      Diseases of the circulatory system
Chapter X       Diseases of the respiratory system
Chapter XI      Diseases of the digestive system
Chapter XII     Diseases of skin and subcutaneous tissue
Chapter XIII    Diseases of musculoskeletal system and connective tissue
Chapter XIV     Diseases of the genitourinary system
Chapter XV      Pregnancy, childbirth and the puerperium
Chapter XVI     Certain conditions originating in the perinatal period
Chapter XVII    Congenital malformations, deformations and chromosomal abnormalities
Chapter XVIII   Symptoms, signs and abnormal clinical and laboratory findings
Chapter XIX     Injuries, poisoning and certain other consequences of external causes
Chapter XX      External causes of morbidity and mortality
Chapter XXI     Factors affecting health status and contact with health services of a person not currently sick

 

 

Within chapters, the three-character codes are divided into homogeneous blocks reflecting different axes of classification. In Chapter I for example, the blocks signify the axes of mode of transmission and of the broad group of the infecting organism. Within Chapter II on neoplasms, the first axis is the behaviour of the neoplasm, and the next is its site. Within all blocks some codes are reserved for conditions not specified elsewhere in the classification.

When more detail is required, each category in ICD can be further subdivided, using a fourth numeric character after a decimal point, creating up to 10 subcategories. This is used, for example, to classify histological varieties of neoplasms. A few ICD chapters adopt five or more characters to allow further subclassification along different axes.
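The code layout described above can be illustrated with a short sketch. It is a minimal example rather than an official validator: the function name parse_icd10 and the Icd10Code record are hypothetical, the sample codes are used only for their format, and the pattern covers only the three- and four-character forms, not the longer chapter-specific extensions.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the core forms described in the text: a letter and two digits
# (e.g. "A15"), optionally followed by a decimal point and one digit ("A15.0").
# Longer, chapter-specific extensions are deliberately not handled here.
_ICD10_PATTERN = re.compile(r"^([A-Z])(\d{2})(?:\.(\d))?$")


class Icd10Code(NamedTuple):
    """Hypothetical record type for the parts of a code."""
    chapter_letter: str          # first character, associated with a chapter
    category: str                # three-character category, e.g. "A15"
    subcategory: Optional[str]   # fourth character if present, else None
    reserved: bool               # True for U codes, which the text says are reserved


def parse_icd10(code: str) -> Icd10Code:
    """Split a three- or four-character ICD-10 code into its parts."""
    match = _ICD10_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a three- or four-character ICD-10 code: {code!r}")
    letter, digits, sub = match.groups()
    return Icd10Code(letter, letter + digits, sub, reserved=(letter == "U"))
```

Calling parse_icd10("A15.0") separates the mandatory three-character category "A15" from the optional subcategory "0", which is the distinction that matters for reporting at the international level.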

Since ICD continues to be used for ever-wider applications beyond its intent, the WHO decided in the 10th revision to develop the concept of a family of related classifications surrounding this core set. This ‘family’ contains lists that have been condensed from the full ICD, and lists expanded for speciality-based adaptations (Figure 17.1). It also contains lists that cover topics beyond morbidity and mortality. For example, there are classifications of medical and surgical procedures, disablement and so forth (Gersenovic, 1995).

 

Figure 17.1: The ICD family of disease and health-related classifications (adapted from ICD-10, 1993).

 

 

The International Classification of Functioning, Disability and Health (ICF) is a more recent member of the ICD ‘family’. While ICD-10 focuses on classifying a patient’s diagnosis, ICF is aimed at capturing a description of their capacity to function. ICF describes how people live with their health condition, covering body functions and structures, activities and participation. The domains are classified from body, individual and societal perspectives. Since an individual's functioning and disability occur in a context, ICF also includes a list of environmental factors. The ICF is intended to assist with measuring health outcomes.

Limitations. The ICD has developed as a practical, rather than theoretically based, classification. There have been compromises between classification based on axes of aetiology, anatomical site and so on. There have also been adjustments made to it to meet the needs of different statistical applications beyond morbidity and mortality, for example social security. As such, the ICD exists as a practical attempt at compromise between various health care needs. Consequently, for many applications, finer levels of detail may still be needed, or other axes of classification required.

 

17.2    Diagnosis Related Groups

 

Purpose. Diagnosis Related Groups (DRGs) relate a patient’s diagnosis and treatment to the cost of their care (Murphy-Muth, 1987; Feinstein, 1988). Developed in the United States by the Health Care Financing Administration, DRGs were designed to support the calculation of federal reimbursement for healthcare delivered through the U.S. Medicare system.

A patient’s principal diagnoses and the procedures they are treated with during hospital admission are used to select the group in the DRG classification that most appropriately describes the overall type of care that has been delivered. Next, the group selected is associated with a typical cost. Specifically, DRG funding requires the use of a cost weighting that is applied by the funding agency to determine the actual amount that should be paid to an institution for treating a patient with a particular DRG. The weightings are determined by a formula that is typically developed on a state or national basis.

DRGs are also used to determine an institution’s overall case-mix. The case-mix index helps to take account of the types of patient an individual institution sees, and estimates their severity of illness. Thus a hospital seeing the same proportion of patients as another, but dealing with more severe illness, will have a higher case-mix index. An institution’s case-mix index can then be used in the formula that determines reimbursement per individual DRG. Unsurprisingly, different versions of the reimbursement formula favour different types of institution, and case-mix represents an area for ongoing debate and research.
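The relationship between cost weights, payment and case-mix can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the DRG labels and weights are invented, the payment is taken to be a base rate multiplied by the cost weight, and the case-mix index is taken to be the average cost weight across an institution's discharges. Actual formulas vary between funding agencies.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical cost weights per DRG; real weights are set by the funding agency.
COST_WEIGHTS: Dict[str, float] = {"DRG-A": 0.5, "DRG-B": 1.0, "DRG-C": 2.5}


def case_mix_index(discharges: List[str]) -> float:
    """Average cost weight over an institution's discharges (assumed definition)."""
    return mean(COST_WEIGHTS[drg] for drg in discharges)


def payment(drg: str, base_rate: float) -> float:
    """Payment for one episode: base rate scaled by the DRG's cost weight (assumed)."""
    return base_rate * COST_WEIGHTS[drg]


# Two hypothetical hospitals with the same number of patients.
# The second treats cases that fall into more heavily weighted groups.
hospital_1 = ["DRG-A", "DRG-A", "DRG-B", "DRG-B"]
hospital_2 = ["DRG-B", "DRG-B", "DRG-C", "DRG-C"]

cmi_1 = case_mix_index(hospital_1)  # 0.75
cmi_2 = case_mix_index(hospital_2)  # 1.75
```

Under these assumptions the second hospital has the higher case-mix index even though both see four patients, which is the effect described above.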

History. In the mid 1970s the Centre for Health Studies at Yale University began work on a system for monitoring hospital utilisation review (Rothwell, 1987). Following a 1976 trial of a DRG system, it was decided to base the final system on the ICD-9-CM which would provide the basic diagnostic categories. The ICD-9-CM (clinical modification) classification was developed from the ICD-9 by the American Commission on Professional and Hospital Activities. It contains finer-grained clinical detail than the old ICD-9, and along with its successors developed in various countries for ICD-10, is intended for healthcare review and reimbursement use.

Level of acceptance and use. DRGs are used routinely in the United States for management review and payment for Medicare and Medicaid patients. Given the importance of reimbursement world-wide, DRGs have undergone ongoing development, and have been adopted in one form or another in many countries outside the USA, including Australia (AR-DRG), Canada (CMG) and countries of Europe and Asia.

Classification structure. Patients are initially assigned a code from ICD-9-CM or a clinical modification of ICD-10. ICD clinical modifications are multiaxial systems closely based on the ICD structure. Diagnoses are then partitioned into one of about 23 Major Diagnostic Categories (MDCs) according to body organ system or disease. The aim of this step is to group codes into similar categories that reflect consumption of resources and treatment (Figure 10.1). The categories are next partitioned based upon the performance of procedures, and on other variables such as the presence of complications and co-morbidities, patient age, and length of stay, before a DRG is finally assigned (Rothwell, 1987). There is thus a process of category reduction at each stage, starting from the many thousands of ICD codes to the few hundred DRGs:

 

ICD ⇒ MDC ⇒ DRG
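The staged reduction from diagnosis codes to MDCs to DRGs can be sketched as two table lookups. This is a toy illustration, not a real grouper: every code, category name and group label below is an invented placeholder, and only two of the partitioning variables mentioned above (procedures and complications or co-morbidities) are modelled.

```python
from typing import Dict, Optional, Tuple

# Stage 1: many diagnosis codes map onto a few Major Diagnostic Categories.
# All identifiers are invented placeholders, not real ICD or MDC codes.
DIAGNOSIS_TO_MDC: Dict[str, str] = {
    "DX-001": "MDC-RESPIRATORY",
    "DX-002": "MDC-RESPIRATORY",
    "DX-003": "MDC-CIRCULATORY",
}

# Stage 2: each MDC is partitioned by whether a procedure was performed
# and whether complications or co-morbidities (CC) are present.
# Key: (MDC, had_procedure, has_complications)
MDC_TO_DRG: Dict[Tuple[str, bool, bool], str] = {
    ("MDC-RESPIRATORY", False, False): "DRG-RESP-MEDICAL",
    ("MDC-RESPIRATORY", False, True): "DRG-RESP-MEDICAL-CC",
    ("MDC-RESPIRATORY", True, False): "DRG-RESP-SURGICAL",
    ("MDC-RESPIRATORY", True, True): "DRG-RESP-SURGICAL-CC",
    ("MDC-CIRCULATORY", False, False): "DRG-CIRC-MEDICAL",
    ("MDC-CIRCULATORY", False, True): "DRG-CIRC-MEDICAL-CC",
    ("MDC-CIRCULATORY", True, False): "DRG-CIRC-SURGICAL",
    ("MDC-CIRCULATORY", True, True): "DRG-CIRC-SURGICAL-CC",
}


def assign_drg(diagnosis: str, procedure: Optional[str], has_complications: bool) -> str:
    """Assign a (hypothetical) DRG label via the diagnosis -> MDC -> DRG stages."""
    mdc = DIAGNOSIS_TO_MDC[diagnosis]
    return MDC_TO_DRG[(mdc, procedure is not None, has_complications)]
```

Two different diagnosis codes in the same MDC, with the same procedure and complication status, end in the same group, which is the category reduction the text describes.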

 

Limitations. Given the local variations in clinical practice, disease incidence, patient selection, procedures performed, and resources, DRGs and case-mix indices will always only give approximate estimates of the true resource utilisation. For example, should a hospital that is developing new and expensive procedures be paid the same amount as an institution that treats the same type of patient with a more common and cheaper procedure? Should quality of care be reflected in a DRG? For example, if a hospital delivers good quality of care that results in better patient outcomes, should it be paid the same as a hospital that performs more poorly for the same type of patient?

As importantly, those institutions that are best able to create DRGs accurately are more likely to receive reimbursement in line with their true expenditure on care. There is thus an implication in the DRG model that an institution actually has the ability to accurately assemble information to derive DRGs and a case-mix index. Given local and national variations in information systems and coding practice, it is likely that institutions with poor information systems will be disadvantaged, unless the information infrastructure across a region is a ‘level playing field’.

Developments. DRGs are designed for use with inpatients. Accordingly, other systems have been developed for other areas of healthcare. Systems such as Ambulatory Visit Groups (AVGs) and Ambulatory Payment Classifications (APCs) have been developed for outpatient or ambulatory care in the primary sector. These are based upon a patient’s diagnosis, intervention, visit status and physician time. Given the increasing age of the population in western nations, there is a tremendous ongoing cost that comes from the chronic care needed by the elderly. Consequently, systems such as Resource Utilisation Groups (RUGs) and the Australian National Sub-Acute and Non-Acute Patient Classification (AN-SNAP) have been developed to help determine the usage of sub-acute and long-term care resources. RUGs are based upon the time spent by nursing home staff when caring for a patient. AN-SNAP includes measures of functional ability.

 

17.3    The Read codes

 

Purpose. The Read codes (now simply called the Clinical Terms in the UK) are produced for clinicians, initially in primary care, who wish to audit the process of care. The Clinical Terms Version 3 (CTV3) is intended, like SNOMED International, to code events in the electronic patient record (O’Neil et al., 1995).

History. The Read codes were introduced in the UK in 1986 to generate computer summaries of patient care in primary care. In the subsequent revision, Version 2, their structure was changed and based upon ICD-9 and OPCS-4, the Classification of Surgical Operations and Procedures. As Version 2 became increasingly inadequate, the UK’s Conference of Medical Royal Colleges and the government’s National Health Service (NHS) established a joint Clinical Terms Project, comprising some 40 working groups representing the different specialities. This was subsequently joined by groups representing nurses and allied health professionals. Version 3 of the Read codes was created in response to the output of the Clinical Terms Project.

Level of acceptance and use. Use of the Read codes is not mandatory in the UK. However, in 1994 it was recommended by the medical and nursing professional bodies as the preferred dictionary for clinical information systems. The Read codes have been purchased by the UK government and made Crown Copyright.

Classification structure. The Read codes have undergone substantive changes through their various revisions, altering not just the classification and terminological content, but also their structure. In Versions 1 and 2, Read was a strictly hierarchical classification system.

Read Version 3 was released in two stages and is a ‘superset’ of all previous releases, containing all previous terms, to allow backward compatibility with past versions. Version 3.0 is a kind of compositional classification system. Like SNOMED, a term can appear in several different ‘hierarchical structures’, classified against different axes. Unlike ICD or SNOMED, the codes themselves do not reflect a given hierarchy. They simply act as a unique identifier for a clinical concept. The ‘hierarchy’ exists as a set of links between concepts. Terms can inherit properties across these links. For example, ‘pulmonary tuberculosis’ may naturally inherit from a parent respiratory disorder or a parent infection term.
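A minimal sketch of this design keeps the codes as opaque identifiers and stores the ‘hierarchy’ separately as links. The identifiers C1 to C4, the root term ‘disorder’ and the function names are all invented for illustration and are not taken from the Read codes themselves.

```python
from typing import Dict, List, Set

# Codes are opaque identifiers; they are invented here and carry no
# hierarchical meaning, mirroring the design described in the text.
TERMS: Dict[str, str] = {
    "C1": "disorder",
    "C2": "respiratory disorder",
    "C3": "infection",
    "C4": "pulmonary tuberculosis",
}

# The 'hierarchy' is stored separately, as links from a child to its parents.
# A concept may have more than one parent.
PARENTS: Dict[str, List[str]] = {
    "C2": ["C1"],
    "C3": ["C1"],
    "C4": ["C2", "C3"],
}


def ancestors(code: str) -> Set[str]:
    """Collect every concept reachable by following parent links upwards."""
    found: Set[str] = set()
    stack = list(PARENTS.get(code, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS.get(current, []))
    return found


def is_a(code: str, ancestor: str) -> bool:
    """True if `code` inherits from `ancestor` through any chain of links."""
    return ancestor in ancestors(code)
```

Here is_a("C4", "C2") and is_a("C4", "C3") both hold, so pulmonary tuberculosis is found whether a query starts from respiratory disorders or from infections, without the code "C4" itself encoding either path.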

In Version 3.1, a set of qualifier terms such as anatomical site was added that can be combined with existing terms. When terms are composed, these composites exist outside of any strict hierarchy. To help in the combination of qualifiers with terms, they are grouped into templates. These capture some rules that help describe the range of possible qualifiers that a term in Read can take (Table 17.2).

Table 17.2: Example Read Version 3.1 template showing allowable combinations of terms with qualifier attributes, and attribute values (adapted from O’Neil et al., 1995).

 

Object                                           Applicable attribute   Applicable values
Bone operation                                   Site                   Bone, Part of Bone
Fixation of fracture                             Reduction method       Percutaneous, open, closed
Fixation of fracture using intramedullary nail   Reaming method         Hand, powered rigid, powered flexible, etc.
Fixation of fracture using intramedullary nail   Nail type              Flexible, Locking, Rigid, etc.
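One way to picture how such a template constrains composition is as a lookup from an object and attribute to its permitted values. The sketch below is an assumption about representation only, not the Read file format. It uses the entries of Table 17.2 whose values are simple literals, omits the ‘Bone operation’ row because its values are classes of term rather than literals, and treats the open-ended ‘etc.’ lists as if they were complete.

```python
from typing import Dict, Set

# Template entries taken from Table 17.2; values stored in lower case.
# The 'etc.' in the table means the real value lists are longer than these.
TEMPLATES: Dict[str, Dict[str, Set[str]]] = {
    "fixation of fracture": {
        "reduction method": {"percutaneous", "open", "closed"},
    },
    "fixation of fracture using intramedullary nail": {
        "reaming method": {"hand", "powered rigid", "powered flexible"},
        "nail type": {"flexible", "locking", "rigid"},
    },
}


def is_allowed(obj: str, attribute: str, value: str) -> bool:
    """Check whether a qualifier value may be combined with a term."""
    attributes = TEMPLATES.get(obj.lower(), {})
    return value.lower() in attributes.get(attribute.lower(), set())
```

With this table, is_allowed("Fixation of fracture", "Reduction method", "Open") is accepted, while a nail type cannot be attached to a plain ‘Fixation of fracture’ because that template lists no such attribute.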

 

The Read Codes Drug and Appliance Dictionary is part of the Clinical Terms and covers medicinal products, appliances, special foods, reagents and dressings. The dictionary is designed for use in software that requires capture of medication and treatment data such as electronic patient records and prescribing systems.

Like other major systems, Read offers mapping to ICD-9 codes to permit international reporting, and in some cases also provides ICD-10 mapping. A set of Quality Assurance (QA) Rules has been developed for the Clinical Terms. These rules are designed to check the clinical, drug and cross-mapping domains between the current and previous versions of the terms and other major terminologies like ICD-10, and to check areas of overlap between the domains themselves (Schulz et al., 1998). Each QA rule is written to interrogate the various files that make up the Read Code releases and is designed to identify those concepts or terms that violate the basic structure of the Read Codes.

Although Read Version 3 does not overtly emphasise axes of classification like SNOMED, both systems allow terms to be linked to each other and to inherit properties across those links. Therefore the underlying potential for expressiveness is the same at the structural level. Differences in the number and type of terms, and the richness of interconnections between them are probably greater determinants of difference between these coding systems, than any underlying structural difference. The presence of a fixed hierarchy, as we find with ICD or SNOMED, carries certain benefits of regularity when exploring the system. It also imposes greater constraints when it is necessary to alter the system because of changes to the terminology. In Read, this burden of regularity begins to be shifted to the rules guiding the composition of terms.

Limitations. The Read templates for term composition are limited in their ability to control combination. A much richer language and knowledge base would be needed to regulate term combination (Rector et al., 1995).

 

17.4    SNOMED

 

Purpose. The Systematized Nomenclature of Medicine (SNOMED) is intended to be a general-purpose, comprehensive and computer-processable terminology that, according to its creators, can represent and index “virtually all of the events found in the medical record” (Côté et al., 1993).

History. SNOMED was derived from the 1968 edition of the Manual of tumour nomenclature and coding (MOTNAC) and the Systematized nomenclature of pathology (SNOP). SNOMED International (or SNOMED III) is a development of the second edition of SNOMED, published in 1979 by the College of American Pathologists (CAP).

Level of acceptance and use. SNOMED is reportedly used in over 40 countries, presumably largely in laboratories for the coding of reports to generate statistics and facilitate data retrieval. Although CAP is a not-for-profit organisation, in the past SNOMED licence fees have often been significant and may have impeded its more widespread adoption.

Classification structure. SNOMED is a hierarchical, multi-axial classification system. Terms are assigned to one of eleven independent systematised modules, corresponding to different axes of classification (Table 17.3). Each term is placed into a hierarchy within one of these modules, and assigned a five or six digit alphanumeric code (Figure 17.2).

 

Table 17.3: The SNOMED International modules (or axes).

Module designator

Topography (T)

Morphology (M)

Function (F)

Diseases/Diagnoses (D)

Procedures (P)

Occupations (J)

Living Organisms (L)

Chemicals, Drugs & Biological Products (C)

Physical Agents, Forces & Activities (A)

Social Context (S)

General Linkage-Modifiers (G)

 

Terms can also be cross-referenced across these modules. Each code carries with it a packet of information about the terms it designates, giving some notion of the clinical context of that code (Table 17.4).

Figure 17.2: SNOMED Codes are hierarchically structured. Implicit in the code, tuberculosis is an infectious bacterial disease.

 


SNOMED also allows the composition of complex terms from simpler terms, and is thus partially compositional. SNOMED International incorporates virtually all of the ICD-9-CM terms and codes, allowing reports to be generated in this format if necessary.
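Composition across axes can be sketched using only the module designators of Table 17.3 and the terms of the tuberculosis example in Table 17.4. The function name compose and the "axis:term" output notation are invented for illustration, and no actual SNOMED codes are used.

```python
from typing import Dict, List, Tuple

# Module designators as listed in Table 17.3.
MODULES: Dict[str, str] = {
    "T": "Topography",
    "M": "Morphology",
    "F": "Function",
    "D": "Diseases/Diagnoses",
    "P": "Procedures",
    "J": "Occupations",
    "L": "Living Organisms",
    "C": "Chemicals, Drugs & Biological Products",
    "A": "Physical Agents, Forces & Activities",
    "S": "Social Context",
    "G": "General Linkage-Modifiers",
}


def compose(parts: List[Tuple[str, str]]) -> str:
    """Join (axis, term) pairs into one composite expression.

    Raises ValueError if an axis is not a known module designator.
    The "axis:term" notation is an illustrative convention only.
    """
    for axis, _term in parts:
        if axis not in MODULES:
            raise ValueError(f"unknown module designator: {axis!r}")
    return " + ".join(f"{axis}:{term}" for axis, term in parts)


# The simpler terms that Table 17.4 cross-references to the D-axis
# term 'Tuberculosis'.
tuberculosis_parts = [
    ("T", "Lung"),
    ("M", "Granuloma"),
    ("L", "M. tuberculosis"),
    ("F", "Fever"),
]
expression = compose(tuberculosis_parts)
```

The composite built from the four simpler terms describes the same clinical picture that the single D-axis term ‘Tuberculosis’ names directly.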

Table 17.4: An example of SNOMED’s nomenclature and classification. Some terms (e.g. Tuberculosis) can be cross-referenced to others, to give the term a richer clinical context (adapted from Rothwell, 1995).

 

 

 

 

Nomenclature

Classification

Axis

T

+ M

+ L

+ F

= D

Term

Lung

+ Granuloma

+ M. tuberculosis

+ Fever

= Tuberculosis

Code