Introduction

The incidence of traumatic spinal cord injury (SCI) varies from 9.2 to 57.8 per million.1 This variation is partly due to differences in definition, classification and procedures of patient identification. The International Classification of Diseases (ICD) has become the standard diagnostic classification for epidemiological and health management purposes, and has been subject to continuous update and revision. The current version, ICD-10, was introduced in 1993 (ref. 2) and has been used for somatic diseases at our hospital since 1998. The identification of traumatic SCI has been evaluated for ICD-8 (refs 3, 4) and ICD-9 (refs 5, 6), but less so for ICD-10.6

Although patient records should be the gold standard for accurate clinical information, administrative databases are widely searched, quickly and inexpensively, to obtain information for research as well as daily practice.7 The reliability of information from such administrative databases can be questioned because of inaccurate disease coding.

Validity of the data is crucial for epidemiological research. In epidemiology, comparisons between time periods are of interest for identifying the factors that determine the outcome. This paper illustrates how a changing diagnostic coding system influences the results, using ICD-8, ICD-9 and ICD-10 to identify traumatic SCI. Such a diagnostic validity test may have direct implications for the planning of future research and for the clinical and administrative use of data from electronic databases.

Materials and methods

Definition of spinal cord injury

Traumatic SCI was defined in accordance with Kraus et al.8 as an acute, traumatic lesion of the spinal cord with varying degrees of motor and/or sensory deficit or paralysis. Although injuries of cauda equina were included, the definition excluded isolated injuries of other nerve roots.9 Transient paresis or impermanent deficits lasting less than 1 week were not included.

Inclusion criteria

Search codes describing traumatic spinal cord injury, both acute and sequelae, from ICD-8 and ICD-9 were included.3, 4, 5, 6, 8, 10 The codes from ICD-10 included codes that indicate a fracture or dislocation of the spinal column, concussion and edema of the spinal cord, trauma with SCI, other unspecified injuries of the spinal cord, injury of the cauda equina nerves and spinal cord, and sequelae of such injuries. Thus, we included 22 codes from ICD-10 to minimize possible under-reporting. All codes are listed in Table 1.

Table 1 ICD codes used in the electronic case identification

Haukeland University Hospital serves three counties with a population of approximately 1 million. Most of the patients were admitted immediately after injury, but some were initially treated in local hospitals. Persons with minimal neurologic impairment on admission were included if their length of stay was greater than 1 week,11, 12 or greater than 1 day in the case of patients staying in the Department of Neurosurgery. There is only one Department of Neurosurgery in the region. Some patients were treated at other hospitals shortly after surgery. Both primary and later stays were included.

Hospital discharge data

The electronic discharge registry contained personal information about each patient, including an 11-digit national identification number, dates of admission and discharge, codes for one primary diagnosis and unlimited secondary diagnoses and codes for all surgical procedures performed. The clinical diagnoses and procedures with corresponding codes are made by the attending physician and written in the case summary prepared routinely at the end of the hospitalization. Local medical secretaries transfer this information to the electronic discharge registry. Data in the hospital discharge registry were reported using ICD-8 in the period 1982–1986, ICD-9 in the period 1987–1998 and ICD-10 since August 1998.

Case finding and ascertainment

A total of 1080 patients with a diagnostic code suggesting traumatic SCI were discharged from Haukeland University Hospital in the period 1982–2001. They were identified by computer search of the discharge registry using SQL-Database version 7.3, Microsoft Access 2003. Data were analyzed using SPSS release 13.0.1 and STATA 9.0 (College Station, TX, USA). One author (EMH) reviewed all the complete patient records, and the other authors were consulted in all cases of doubt to reach a consensus.
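As an illustration of this kind of electronic case finding, the sketch below flags hospital stays whose discharge codes match a list of search patterns, where a trailing ".x" stands for all subcodes. It is not the SQL/Access query used in the study. The record layout, the function names and the sample stays are hypothetical, and the patterns are the reduced code combinations reported in the Results rather than the full list in Table 1.

```python
from typing import Iterable

# Search patterns per ICD revision; a trailing ".x" means "all subcodes".
# These are the reduced combinations named in the Results (assumption:
# the full Table 1 list would be substituted in a real search).
SEARCH_CODES = {
    "ICD-8": ["806.x", "958.x"],
    "ICD-9": ["806.x", "907.2", "952.x"],
    "ICD-10": ["S14.0", "S14.1", "S24.0", "S24.1", "S34.1", "S34.3", "T91.3"],
}


def code_matches(code: str, pattern: str) -> bool:
    """Return True if a discharge code matches a search pattern."""
    code = code.strip().upper()
    pattern = pattern.strip().upper()
    if pattern.endswith(".X"):
        stem = pattern[:-2]  # "806.X" -> "806"
        return code == stem or code.startswith(stem + ".")
    return code == pattern


def stay_is_flagged(codes: Iterable[str], revision: str) -> bool:
    """Flag a stay if any of its diagnosis codes matches any search pattern."""
    patterns = SEARCH_CODES[revision]
    return any(code_matches(c, p) for c in codes for p in patterns)


# Hypothetical discharge records: (patient id, ICD revision, diagnosis codes).
stays = [
    ("A", "ICD-9", ["806.4", "E888"]),
    ("B", "ICD-9", ["805.2"]),
    ("C", "ICD-10", ["S24.1", "S22.0"]),
    ("D", "ICD-10", ["G82.2"]),
]

flagged = sorted({pid for pid, rev, codes in stays if stay_is_flagged(codes, rev)})
print(flagged)  # ['A', 'C']
```

A flagged stay is only a candidate: as in the study, each candidate would still need to be verified against the complete patient record.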

To detect any missing records, we compared the electronic searches with data from a local database at the Department of Neurology covering the period 1952–2001.13 This database was extracted from a manual card-based system of discharge diagnoses at the Department. Most patients with persisting deficits would be transferred to the Department of Neurology, and the local database could act as a validation register.

Statistical analysis

Sensitivity was defined as the proportion of positives that were correctly identified, and specificity as the proportion of negatives that were correctly identified. The positive predictive value (PPV) was defined as the proportion of patients with a positive test result who were correctly diagnosed. The likelihood ratio is the ratio of the probability of a specific test result in people who have the disease to the probability of that result in people who do not; for a positive test result, LR+ = sensitivity/(1−specificity). We assessed the sensitivity, specificity, PPV and LR+ for each ICD code. Data were analyzed using the statistical software STATA 9.0. The Western Norway Regional Committee for Medical Research Ethics, the Data Inspectorate and the Directorate for Health and Social Affairs approved the study.
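The four measures defined above follow directly from a 2×2 table of search-code result against verified diagnosis. The following minimal sketch shows the arithmetic; the class name, field names and counts are illustrative assumptions and are not taken from Table 2 or from the STATA analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # code present, SCI confirmed in the patient record
    fp: int  # code present, no SCI
    fn: int  # code absent, SCI confirmed
    tn: int  # code absent, no SCI

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def lr_positive(self) -> float:
        # LR+ = sensitivity / (1 - specificity); returned as infinity
        # when there are no false positives.
        false_positive_rate = 1.0 - self.specificity
        if false_positive_rate == 0.0:
            return float("inf")
        return self.sensitivity / false_positive_rate


# Made-up counts for a single code, for illustration only.
table = TwoByTwo(tp=40, fp=10, fn=60, tn=890)
print(round(table.sensitivity, 3))  # 0.4
print(round(table.specificity, 3))  # 0.989
print(round(table.ppv, 3))          # 0.8
print(round(table.lr_positive, 1))  # 36.0
```

The example illustrates the pattern described for several individual codes: a code can miss most true cases (low sensitivity) and still have a high PPV and LR+ when it is rarely assigned to patients without the condition.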

Results

During the study period, 1080 patients were discharged from Haukeland University Hospital with one or more of the search codes (Table 1). On review of the complete patient records, only 260 of these patients (24.1%) were verified as having a traumatic SCI. The confirmed SCI patients had a total of 1107 hospital stays, of which 240 were primary admissions and 867 were follow-ups. The median number of stays was 2.5.

Twenty-three patients with traumatic SCI included in the local database at the Department of Neurology were not found in the electronic ICD-based search using multiple search codes. Of these, 22 did not have any relevant ICD search code, and one SCI patient was not registered in the hospital database at all.

The proportion of patients verified with traumatic SCI was 77.5% during the period with ICD-8, 28.5% during the period with ICD-9 and 19.5% during the period with ICD-10, as shown in Table 2. Only one code was used at discharge in 99.6%, 92.4% and 76.5% of cases using ICD-8, ICD-9 and ICD-10, respectively, and two codes in 0.4%, 6.6% and 19.4%. Three codes were used only for a minority of stays in the last two revisions, for 1.0% and 4.1% of stays, respectively.

Table 2 Sensitivity, specificity, positive predictive value and likelihood ratio of search codes from ICD-8, ICD-9 and ICD-10

Table 2 shows the sensitivity, specificity, PPV and LR+ for each electronic search code for ICD-8, ICD-9 and ICD-10. The code 806.x (all subcodes included) from ICD-8 had the highest sensitivity and a high PPV. To achieve a higher specificity, the number of search codes from each ICD version was reduced to the codes most specific for traumatic SCI. Using the codes 806.x, 907.2 and 952.x from ICD-9, the proportion of verified patients increased to 34.7%. Using a combination of seven codes from ICD-10, the proportion of verified patients increased to 88% (Table 2). The code 907.2 from ICD-9 had a low sensitivity, a high specificity, high PPV and high LR+. The seven selected codes from ICD-10 all had a low sensitivity but a high specificity, high PPV and high LR+. The LR+ for each single code varied from 0.39 to 65.98. The highest ratio was for code 907.2 from ICD-9. From ICD-10 the codes with highest LR+ were S14.0, S24.0, S24.1, S34.3 and T91.3 (Table 2).

Combining two search codes in ICD-8 (806.x, 958.x) identified all patients with traumatic SCI. The combination of the three ICD-9 codes 806.x, 907.2 and 952.x failed to identify 9.4% (14/149) of the patients with traumatic SCI. The combination of seven ICD-10 codes (S14.0, S14.1, S24.0, S24.1, S34.1, S34.3, T91.3) failed to identify 16.2% (17/105) of the patients with traumatic SCI. Using the two codes from ICD-8, 20 of 89 possible patients (22.5%) did not have a traumatic SCI.

With the three-code combination from ICD-9 and the seven-code combination from ICD-10, 65.3% and 12.0% of the included patients, respectively, did not have a traumatic SCI. The most frequent diagnoses of the non-SCI patients were nervous system injuries, cerebral palsy, whiplash associated disorders, traumatic brain injury and spine fractures without nervous system deficit. A combination of seven codes from ICD-10 (S14.0, S14.1, S24.0, S24.1, S34.1, S34.3 and T91.3) gave the optimal result regarding sensitivity, specificity, PPV and LR+ and better results than any ICD-8 and ICD-9 combinations.
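The percentages reported for the code combinations can be re-derived from the counts given in the two preceding paragraphs. The sketch below repeats that arithmetic; the helper name is ours, and the complementary figures (patients retained by the reduced search, and flagged patients verified) are simple complements of the reported proportions, not values read from Table 2.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# Counts stated in the text.
missed_icd9, sci_icd9 = 14, 149      # ICD-9: 806.x, 907.2, 952.x
missed_icd10, sci_icd10 = 17, 105    # ICD-10: seven-code combination
non_sci_icd8, flagged_icd8 = 20, 89  # ICD-8: 806.x, 958.x

print(pct(missed_icd9, sci_icd9))       # 9.4
print(pct(missed_icd10, sci_icd10))     # 16.2
print(pct(non_sci_icd8, flagged_icd8))  # 22.5

# Complements: share of verified SCI patients still found by the reduced
# combination, and share of ICD-8 flagged patients with verified SCI.
print(pct(sci_icd9 - missed_icd9, sci_icd9))           # 90.6
print(pct(sci_icd10 - missed_icd10, sci_icd10))        # 83.8
print(pct(flagged_icd8 - non_sci_icd8, flagged_icd8))  # 77.5
```

The last figure agrees with the 77.5% of patients verified during the ICD-8 period.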

Discussion

This study shows a low diagnostic accuracy for traumatic SCI using searches of discharge diagnoses in ICD-8, 9 and 10. Only 260 of 1080 potential traumatic SCI patients (24.1%) were confirmed as having a traumatic SCI after review of the patient records. The proportion of verified traumatic SCI was 77.5% for ICD-8, 28.5% for ICD-9 and 19.5% for ICD-10. Reducing the number of search codes from each ICD-version to the codes most specific for traumatic SCI increased the specificity as expected but decreased the sensitivity.

There are substantial differences between ICD-8, ICD-9 and ICD-10. Because all three versions were used during our study period, we compared them.

A systematic review of 21 studies comparing the routine discharge diagnoses with the original medical record in Great Britain found that the median coding accuracy declined from ICD-7 (96.5%) to ICD-8 (87%) and ICD-9 (77%).14 Stroke coding accuracy was equally good with ICD-9 (90% correct) and ICD-10 (92% correct) in a study from Canada.15 In the study from Great Britain, the coding accuracy was higher for high-prevalence conditions (median 97%) than for low-prevalence conditions (median 91%).14 Traumatic SCI is a low-prevalence condition.

The code 806.x in ICD-8 had high sensitivity and PPV. Combining code 806.x and 958.x further increased the PPV. A study from the USA3 included four codes: 344, 805, 806 and 958. It found that the codes 806 and 958 had the highest validity. With the two codes combined, 23.1% of SCI patients were not identified. Because the codes 344 and 805 formed a large part of the data set, a later study restricted the search to the codes 806 and 958.4 Only one of the 22 patients not found in the electronic search had been discharged with code 344 using only ICD-8. Adding this patient to the population, 1 of 70 (1.4%) of the SCI patients was not identified using 806.x and 958.x.

The PPV was very low (0.35) using a combination of three codes from ICD-9 (806.x, 952.x, 907.2) because of a higher number of patients with other diseases. A study from the USA5 found a higher overall PPV of 0.61 when using the two codes 806.x and 952.x. In our population, the code 806.x had the lowest specificity. PPV was lower for ICD-9 than for ICD-8. In UK hospital statistics, diagnostic accuracy for SCI decreased from ICD-7 to ICD-8 and to ICD-9, probably because of more complex diagnostic classification.14 In Finland, more than half of the ICD-9 codes seemed to be misleading when compared with the medical records.6 When ICD-9 was used, some patients were coded with SCI although they had only fractures of the spine without any neurological deficits. The Finnish study also found that some cases with SCI were missed after ICD-10 was introduced.

Epidemiological studies relying only on ICD-9 codes will significantly overestimate traumatic SCI. In our study, the overestimation using ICD-9 was mainly caused by incorrect use of code 806.x; that is, the code for fracture of vertebral column with spinal cord lesion was given to patients with fractures but no spinal cord lesion. Such miscoding makes the data unreliable for epidemiological research and health care planning.

We found that only 105 of the 538 patients who had one or more of the search codes from ICD-10 actually had a traumatic SCI. Our selected ICD-10 discharge codes therefore grossly overestimate traumatic SCI, as did the codes from ICD-8 and ICD-9. Most individual codes from ICD-10 had low sensitivity but high specificity, PPV and LR+. ICD-10 differs substantially from previous ICD versions, and requires the use of multiple codes. These changes have influenced the proportion of patients verified. The codes indicate the anatomical location (cervical, thoracic and lumbar), and different codes for acute injury and sequelae have been introduced.

Our search covered many codes from ICD-10. The retrospective register-based epidemiological study from Finland6 used 11 ICD-10 codes (S14.0, S14.1, S14.2, S24.0, S24.1, S24.2, S34.0, S34.1, S34.2, S34.3 and T91.3), and estimated the SCI prevalence rate to be 28 per 100 000. The aim of the study was to identify all adult citizens (18 years or more) of Helsinki who had permanent sensory or motor deficits because of traumatic SCI. Cases were identified using the registers of the Kapyla Rehabilitation Centre, Helsinki University Central Hospital and the local organization for the disabled. They found that ICD-10 lacked sensitivity as 17 out of 152 patients were not identified.6 In our study, we did not identify 16.2% of traumatic SCI patients using a combination of seven codes.

An Australian study assessed the quality of ICD-10 coding in routinely collected hospital discharge data. Agreement of the principal diagnosis code was 85% at the 3-digit level in 1998–1999, improving to 87% in 2000–2001. The code for paraplegia had a sensitivity of 82% and a PPV of 85%.16 However, the ICD diagnosis G82 (paraplegia and tetraplegia) is a general diagnosis describing a clinical picture; it is not specific for traumatic SCI and not suitable for detecting patients. In our study, this code had low sensitivity and PPV. This clearly shows that the use of multiple codes is necessary to detect all patients.

In our study, the codes S24.0 (Concussion and edema of thoracic spinal cord) and S24.1 (Other and unspecified injuries of thoracic spinal cord) had the highest LR+, 45.35 and 65.98, respectively. Injuries of the thoracic spinal cord are less frequent than injuries of the cervical and lumbar parts. These diagnoses have high specificity. When reducing the number of ICD-10 codes to seven by only selecting codes with high specificity, the proportion of patients with traumatic SCI verified increased to 88%, at the expense of losing 16.2% of the patients with traumatic SCI identified by the broad search. Thus a combination of seven codes from ICD-10 gave a higher specificity, PPV and LR+ compared to a combination of three codes from ICD-9.

One of the major advantages of ICD-10 is that it is far more detailed than the previous ICD versions. There are a total of 12 420 codes in ICD-10 compared with 6969 in ICD-9. This permits more detailed clinical information. However, ICD-10 differs substantially from the previous versions, and the possibilities for incorrect coding will affect the statistics. In 2006, an international consortium presented a list of high-priority methodological areas for researchers using health administrative data. The four most highly ranked priorities were the documentation of data fields in each country's hospital administrative data, the translation of patient safety indicators from ICD-9 to ICD-10, the development and validation of algorithms to verify the logic and internal consistency of coding in hospital abstract data, and interventional studies to enhance coding quality.17 These measures may improve the quality of the databases in the future.

The first diagnostic ICD-code should always describe the primary cause of the hospital stay. Incomplete coding of secondary diagnoses leads to under-reporting of all chronic disorders.18 Many patients will have acute conditions or complications that take precedence in coding over chronic diseases. This may explain some missing patients also in our study. Causes of coding errors include lack of training and differences in coding tradition between departments. Other possibilities are ambiguity of terms, inaccurate or missing information in the patient record and variation and errors in clinical diagnoses. Local lists of limited code subsets may have been used, rather than the complete classification. The coding may also be biased by a tendency to repeat the codes used at previous stays because of convenience. Coding errors can occur during transcription by the physician or secretary.19 Physicians in several specialties are involved in treating traumatic SCI. S-codes describe injuries related to a single body region, whereas T-codes include injuries to multiple or unspecified body regions.

The latest ICD version proved to be the most reliable for identifying patients with traumatic SCI. However, ICD data cannot be trusted for research, health planning or administrative purposes without extensive validity checks.