Estimating the distribution of external causes in hospital data from injury diagnosis

Published September 2, 2008, in Accident Analysis and Prevention


Research into a novel application of Bayesian inference shows that the method can successfully estimate the number of hospital admissions due to external causes from injury diagnoses. The study, Estimating the distribution of external causes in hospital data from injury diagnosis, found that for many external causes a Bayesian approach is a significant improvement over the more traditional method of age-sex proportional redistribution when generating estimates of incidence. The work was done in collaboration with scientists at Harvard University.

Research findings

Comparing the Bayesian methodology with age-sex proportional redistribution, researchers found that the Bayesian methodology was considerably more successful at estimating external causes that result from markedly different underlying injuries. It was able to discriminate among fires; poisonings; drownings; and poisons, venoms, and bites. However, when the underlying injuries were similar (e.g., falls and subcategories of road traffic injuries, which are characterized by head and lower limb injuries), the method performed comparatively poorly.
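For readers unfamiliar with the comparison method, the following is a minimal Python sketch of one common form of age-sex proportional redistribution: cases with an unknown external cause are spread across causes in proportion to the known-cause cases in the same age-sex group. The record fields (age_group, sex, cause) and the function name are hypothetical, and the study's exact procedure may differ.

```python
from collections import Counter, defaultdict

def redistribute_by_age_sex(records):
    """Return estimated case counts by cause.

    records: dicts with hypothetical keys 'age_group', 'sex' and 'cause'
    (cause is None when the external cause is missing).
    """
    known = defaultdict(Counter)   # stratum -> counts of known causes
    unknown = Counter()            # stratum -> number of unknown-cause cases
    for r in records:
        stratum = (r["age_group"], r["sex"])
        if r["cause"] is None:
            unknown[stratum] += 1
        else:
            known[stratum][r["cause"]] += 1

    estimate = Counter()
    # Strata with no known-cause cases are skipped in this sketch.
    for stratum, counts in known.items():
        n_known = sum(counts.values())
        for cause, n in counts.items():
            # Known cases plus this cause's proportional share of unknowns.
            estimate[cause] += n + unknown[stratum] * n / n_known
    return dict(estimate)
```

Because this approach uses only age and sex, two patients in the same group receive the same cause distribution regardless of their injuries, which is the information the Bayesian method adds.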

The Bayesian algorithm assigned 75% of known fire cases to fires. Similarly, it correctly predicted 79% of poisons, venoms, and bites, 64% of falls, and 61% of drownings. In comparison, the algorithm was less successful in predicting car occupant and firearm fatalities, correctly estimating only 17% and 19% of cases, respectively. This suggests that the types of injuries sustained do not carry sufficient information to differentiate between the different types of road traffic injuries and falls.
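Percentages of this kind can be computed on a validation set as the share of cases of each known cause that the algorithm assigns back to that cause. The Python sketch below is illustrative only: the function name and toy labels are hypothetical, and it assumes each case receives a single predicted cause, which this summary does not specify.

```python
from collections import Counter

def fraction_correct_by_cause(true_causes, predicted_causes):
    """For each true cause, return the fraction of its cases predicted correctly."""
    totals = Counter(true_causes)
    hits = Counter(t for t, p in zip(true_causes, predicted_causes) if t == p)
    return {cause: hits[cause] / totals[cause] for cause in totals}

# Toy labels, not data from the study.
true_causes = ["fire", "fire", "fire", "fire", "fall", "fall"]
predicted   = ["fire", "fire", "fire", "fall", "fall", "road"]
print(fraction_correct_by_cause(true_causes, predicted))
```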

Analytical approach

Researchers used two patient-level datasets containing hospital discharge records in Mexico coded using International Classification of Diseases codes. They reduced the number of external cause codes by aggregating the various external causes into a smaller number of cause categories.
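The aggregation step can be pictured as a lookup from detailed codes to broad categories. The Python sketch below assumes ICD-10 external cause codes and uses a small, hypothetical set of categories. The summary does not state which ICD revision or which category definitions the researchers used, so the table is an illustration and not the study's mapping.

```python
# Illustrative ICD-10 code blocks (assumption: the study's categories may differ).
CATEGORY_RANGES = [
    ("W00", "W19", "falls"),
    ("W65", "W74", "drowning"),
    ("X00", "X09", "fires"),
    ("X40", "X49", "poisonings"),
]

def cause_category(icd_code):
    """Map a detailed external cause code to a broad category."""
    # Compare on the three-character block, e.g. "X45.1" -> "X45".
    block = icd_code.strip().upper()[:3]
    for low, high, category in CATEGORY_RANGES:
        if low <= block <= high:
            return category
    return "other"
```

Collapsing codes this way leaves more known-cause cases in each category, which makes the probabilities estimated per category more stable.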

For the analysis, the researchers started with a prior probability distribution of external causes for each case (based on the victim’s age and sex) and used Bayesian inference to update the probabilities based on the victim’s injury diagnoses. They then validated the method on a trial dataset in which both external causes and injury diagnoses were known.
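The update described above follows Bayes' rule: the probability of a cause given a diagnosis is proportional to the age-sex prior for that cause multiplied by the probability of the diagnosis given the cause. The Python sketch below shows the update for a single diagnosis. All names and numbers are hypothetical, and how the study estimated the likelihoods or combined multiple diagnoses is not described in this summary.

```python
def posterior(prior, likelihood, diagnosis):
    """Update cause probabilities for one case given its injury diagnosis.

    prior:      {cause: P(cause | age, sex)}
    likelihood: {cause: {diagnosis: P(diagnosis | cause)}}
    """
    unnormalized = {c: prior[c] * likelihood[c][diagnosis] for c in prior}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}

# Made-up numbers for one age-sex group (not taken from the study).
prior = {"fall": 0.6, "fire": 0.1, "road": 0.3}
likelihood = {
    "fall": {"burn": 0.01},
    "fire": {"burn": 0.90},
    "road": {"burn": 0.02},
}
print(posterior(prior, likelihood, "burn"))  # fire dominates despite its low prior
```

In this toy case a distinctive diagnosis overrides the prior, while a diagnosis that is about equally likely under several causes would leave the posterior close to the age-sex prior.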

Research objective

Hospital discharge datasets are often the only source for estimating the incidence of non-fatal injuries in a population. External causes, which are one of the most important variables for guiding injury prevention priorities, are often poorly coded or missing in hospital discharge data. The nature of injuries sustained by the victim, however, is well documented.

Analytical methods that can map the nature of injuries to the underlying external causes, like those used in this study, can allow estimation of the distribution of hospital visits by external causes, enhancing the usefulness of hospital records for setting priorities and planning injury prevention programs. IHME is dedicated to identifying what makes people sick or injured, in order to help guide the design of appropriate prevention programs.

Recommendations for future work

The results of this study highlight the need for hospitals to incorporate accurate external cause coding in routine record keeping. While the Bayesian method used in this study yielded encouraging results, further improvements are possible. For instance, the researchers ignored other variables available in the dataset, including district of residence, urban or rural residence, and insurance status.

Future work should focus on optimal use of all the information available on individual records. Alongside these efforts to improve hospital data quality, analytical tools need to be developed to make the best possible use of the existing data systems.


Bhalla K, Shahraz S, Naghavi M, Lozano R, Murray CJL. Estimating the distribution of external causes in hospital data from injury diagnosis. Accident Analysis and Prevention. 2008 Nov; 40(6):1822-1829.