The traditional method of assigning causes of death to verbal autopsies (VAs), physician-certified verbal autopsy (PCVA), has been shown to have varying accuracy. Computer-coded verbal autopsy (CCVA) is a promising alternative to the standard approach of PCVA because of its high speed, low cost, and reliability. An innovative method of CCVA, the Random Forest (RF) method from machine learning, was found to outperform PCVA in almost all settings, according to a study by researchers from IHME and the Bill & Melinda Gates Foundation as part of the Population Health Metrics Research Consortium (PHMRC).

In CCVA, VA interviews are analyzed by computer to estimate causes of death in individuals and the distribution of causes of death over a population. This study, "Random forests for verbal autopsy analysis: multisite validation study using clinical diagnostic gold standards," evaluates the accuracy of the RF method against a dataset of deaths with known causes and compares it with PCVA, a widely used VA method.

Research objective

Machine learning methods are computer algorithms that infer patterns from examples. This study used a machine learning method that combines several simple methods, and it assessed the performance of this new RF method using validated, gold standard deaths collected as part of the PHMRC gold standard verbal autopsy validation study. The PHMRC has undertaken the five-year study to develop a range of new analytical methods for VA and to test these methods using data collected at six sites in four countries (Mexico, Tanzania, India, and the Philippines).
This study was designed to assess the accuracy of the RF method for adult, child, and neonatal VAs and compare the quality of the RF method with PCVA. It is part of ongoing work by IHME to develop the most accurate and efficient methods of predicting causes of death from verbal autopsies.
The Random Forest method was tested both with and without household recall of health care experience. Household recall of health care experience includes any information the caretaker has about the patient's medical treatment, including whether health workers provided documentation for the cause of hospitalization or cause of death.
The RF method was as good as or better than PCVA, both in correctly determining cause of death at the individual level and in accurately estimating the cause-specific mortality fraction at the population level. The only exception was cause-specific mortality fraction accuracy for neonates with health care experience information included, where the RF method scored lower than PCVA, though not significantly so.
With health care experience, accuracy of the RF method in determining individual cause of death (called chance-corrected concordance) was 3.4 percentage points higher for adults, 3.2 percentage points higher for children, and 1.6 percentage points higher for neonates, compared to PCVA. The cause-specific mortality fraction accuracy was 0.097 higher for adults and children and 0.007 lower for neonates. Without health care experience, chance-corrected concordance with the RF method was 8.1 percentage points higher than PCVA for adults, 10.2 percentage points higher for children, and 5.9 percentage points higher for neonates. The cause-specific mortality fraction accuracy was higher for the RF method compared to PCVA by 0.102 for adults, 0.131 for children, and 0.025 for neonates.
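The two metrics reported above can be made concrete with a short sketch. This summary does not spell out the formulas, so the definitions below are an assumption based on how these metrics are commonly defined in the PHMRC validation work: chance-corrected concordance for a cause is its sensitivity rescaled so that random guessing among N causes scores 0, and cause-specific mortality fraction (CSMF) accuracy is 1 minus the total absolute error in the predicted fractions, divided by the largest error possible. The function names and the toy cause labels are hypothetical.

```python
from collections import Counter


def chance_corrected_concordance(true_causes, predicted_causes):
    """Average over causes of (sensitivity - 1/N) / (1 - 1/N).

    Assumption: N is the number of distinct causes among the true labels,
    and there are at least two of them.
    """
    causes = sorted(set(true_causes))
    n = len(causes)
    per_cause = []
    for cause in causes:
        # Predictions made for deaths whose true cause is `cause`.
        preds = [p for t, p in zip(true_causes, predicted_causes) if t == cause]
        sensitivity = sum(p == cause for p in preds) / len(preds)
        per_cause.append((sensitivity - 1 / n) / (1 - 1 / n))
    return sum(per_cause) / n


def csmf_accuracy(true_causes, predicted_causes):
    """1 - sum |true fraction - predicted fraction| / (2 * (1 - min true fraction))."""
    total = len(true_causes)
    true_frac = {c: k / total for c, k in Counter(true_causes).items()}
    pred_frac = {c: k / total for c, k in Counter(predicted_causes).items()}
    causes = set(true_frac) | set(pred_frac)
    abs_error = sum(abs(true_frac.get(c, 0) - pred_frac.get(c, 0)) for c in causes)
    return 1 - abs_error / (2 * (1 - min(true_frac.values())))


# Toy example with three hypothetical causes, not data from the study.
true_toy = ["A", "A", "B", "B", "C", "C"]
pred_toy = ["A", "B", "B", "B", "C", "A"]
ccc_toy = chance_corrected_concordance(true_toy, pred_toy)
csmf_toy = csmf_accuracy(true_toy, pred_toy)
```

Under these definitions a method that assigns every death correctly scores 1 on both metrics. The differences quoted above (for example, 3.4 percentage points of chance-corrected concordance) are gaps between the RF and PCVA scores on this kind of scale.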

Analytical approach

The RF method is based on a "decision tree," a structure for representing a complex function as branching decisions. The decision between two possibilities is made by starting from the top level and progressing to the next level, following the branch to the right if a symptom is endorsed and to the left if not. In the RF method, the decision trees are generated automatically from the training dataset without guidance from human experts.
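The idea of branching on endorsed symptoms, and of growing many trees automatically from training data, can be illustrated with a minimal sketch. This is a generic random forest written for illustration, not the implementation used in the study: the split criterion (Gini impurity), the number of trees, the symptom names, and the cause labels are all assumptions.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def gini(labels):
    n = len(labels)
    return 1 - sum((k / n) ** 2 for k in Counter(labels).values())


def grow_tree(rows, labels, rng, n_candidates=2):
    """rows: list of dicts mapping symptom name -> True if endorsed."""
    if len(set(labels)) == 1:
        return labels[0]  # leaf: every remaining example has the same cause
    symptoms = sorted(rows[0])
    best = None
    # Consider only a random subset of symptoms at each split.
    for s in rng.sample(symptoms, min(n_candidates, len(symptoms))):
        yes = [l for r, l in zip(rows, labels) if r[s]]
        no = [l for r, l in zip(rows, labels) if not r[s]]
        if not yes or not no:
            continue  # this symptom does not separate the examples
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best is None or score < best[0]:
            best = (score, s)
    if best is None:
        return majority(labels)
    s = best[1]
    yes_i = [i for i, r in enumerate(rows) if r[s]]
    no_i = [i for i, r in enumerate(rows) if not r[s]]
    return {
        "symptom": s,
        "endorsed": grow_tree([rows[i] for i in yes_i], [labels[i] for i in yes_i], rng, n_candidates),
        "not_endorsed": grow_tree([rows[i] for i in no_i], [labels[i] for i in no_i], rng, n_candidates),
    }


def classify(tree, row):
    """Walk from the top of the tree, branching on whether each symptom is endorsed."""
    while isinstance(tree, dict):
        tree = tree["endorsed"] if row[tree["symptom"]] else tree["not_endorsed"]
    return tree


def grow_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Each tree is trained on a bootstrap resample of the training data.
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def forest_classify(forest, row):
    """Each tree votes; the most common vote is the assigned cause."""
    return majority([classify(tree, row) for tree in forest])


# Hypothetical training interviews, not data from the study.
ROWS = [
    {"fever": True, "cough": True, "injury": False},
    {"fever": True, "cough": False, "injury": False},
    {"fever": False, "cough": False, "injury": True},
    {"fever": True, "cough": True, "injury": False},
    {"fever": False, "cough": False, "injury": True},
    {"fever": True, "cough": False, "injury": False},
]
LABELS = ["Pneumonia", "Malaria", "Road traffic", "Pneumonia", "Road traffic", "Malaria"]
forest = grow_forest(ROWS, LABELS)
assigned = forest_classify(forest, {"fever": True, "cough": True, "injury": False})
```

Because the random number generator is seeded, growing the forest again from the same training data yields the same trees, and a fixed forest always assigns the same cause to the same interview responses.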
This study used a multisite sample of 12,542 VAs collected as part of the PHMRC gold standard VA validation study. The causes assigned by the RF method were compared to the gold standard cause of death assignments in this dataset, which were based on strictly defined clinical diagnostic criteria, as well as to the results from PCVA on the same dataset.

Policy implications

It may take days for a team of physicians to complete a VA survey analysis, while a computer approach requires only seconds of processing on affordable, currently available hardware. Results from this study show that the RF method outperforms PCVA and is preferable in terms of time and cost.
Using machine learning also improves reliability, since the same interview responses will lead to the same cause of death assignment every time. This is an important advantage over PCVA, which can produce results of widely varying quality among different physicians, depending on their training and experience. The authors recommend RF as the technique of choice for analyzing past and current VAs.

Flaxman AD, Vahdatpour A, Green S, James SL, Murray CJL, the Population Health Metrics Research Consortium (PHMRC). Random Forests for verbal autopsy analysis: multisite validation study using clinical diagnostic gold standards. Population Health Metrics. 2011; 9:29.