Choosing the best method for verbal autopsy (VA) requires the appropriate metrics to assess a given method’s performance, and researchers from IHME and the University of Queensland undertook a study to determine these metrics.
Because VA can be performed in many different ways, including physician review and computer-automated methods, a common way to evaluate the performance of each method is needed. The metrics currently used to assess the performance of VA methods, such as sensitivity, specificity, and cause-specific mortality fraction error, do not provide a robust basis for comparing methods. This study was designed to test metrics that can compare the performance of all types of VA methods and so help identify the best methods for estimating causes of death in a population.
The authors show that VA methods need to be evaluated across a set of test datasets with widely varying cause-specific mortality fraction composition. They propose two metrics for assessing the performance of a given VA method. For assessing how well a method performs at individual cause-of-death assignment, they recommend the average chance-corrected concordance. For assessing how well a method performs across a population, they recommend measuring the accuracy of the estimated cause-specific mortality fractions. A method's performance in estimating the cause-specific mortality fraction for each cause can be determined by examining the relationship between the fraction estimated by the VA method and the true fraction in the population.
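To make the two recommended metrics concrete, here is a minimal sketch of how they could be computed. The summary above does not state the formulas, so the definitions used here are assumptions based on the commonly cited formulation: chance-corrected concordance for a cause is its sensitivity rescaled so that random assignment among N causes scores 0, and cause-specific mortality fraction accuracy is 1 minus the total absolute error in the estimated fractions divided by its worst possible value. The function and variable names are illustrative only.

```python
from collections import Counter


def chance_corrected_concordance(true_causes, predicted_causes, cause_list):
    """Return (per-cause CCC, unweighted average CCC).

    Assumed definition: CCC_j = (sensitivity_j - 1/N) / (1 - 1/N),
    where N is the number of causes on the cause list.
    """
    chance = 1.0 / len(cause_list)
    per_cause = {}
    for cause in cause_list:
        # Predictions made for deaths whose true cause is `cause`.
        assigned = [p for t, p in zip(true_causes, predicted_causes) if t == cause]
        if not assigned:
            continue  # cause absent from this test dataset; leave it out
        sensitivity = sum(p == cause for p in assigned) / len(assigned)
        per_cause[cause] = (sensitivity - chance) / (1.0 - chance)
    average = sum(per_cause.values()) / len(per_cause)
    return per_cause, average


def csmf_accuracy(true_causes, predicted_causes, cause_list):
    """Return cause-specific mortality fraction (CSMF) accuracy.

    Assumed definition:
    1 - sum_j |true_j - estimated_j| / (2 * (1 - min_j true_j)).
    """
    n = len(true_causes)
    true_counts = Counter(true_causes)
    pred_counts = Counter(predicted_causes)
    true_csmf = {c: true_counts[c] / n for c in cause_list}
    pred_csmf = {c: pred_counts[c] / n for c in cause_list}
    total_abs_error = sum(abs(true_csmf[c] - pred_csmf[c]) for c in cause_list)
    worst_case = 2.0 * (1.0 - min(true_csmf.values()))
    return 1.0 - total_abs_error / worst_case


# Toy data: three causes, ten deaths, three individual misassignments.
causes = ["A", "B", "C"]
true = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"]
pred = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "A"]

per_cause_ccc, average_ccc = chance_corrected_concordance(true, pred, causes)
accuracy = csmf_accuracy(true, pred, causes)
```

In this toy example the misassignments cancel out at the population level, so the estimated fractions match the true ones exactly even though individual assignment is imperfect. That is one reason the two levels of performance are worth measuring separately.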
The authors performed simple simulations of populations to demonstrate that most metrics used in VA validation studies are sensitive to the cause-specific mortality fraction composition of the dataset used to test a VA method. They also conducted simulations to show that the composition of the test dataset must be varied: when methods are compared on only one test dataset, an inferior VA method can produce better results than a superior one.
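The effect described above can be illustrated with a small sketch. This is not the authors' simulation. It assumes two hypothetical methods described only by fixed per-cause sensitivities, a two-cause list, and test dataset compositions drawn from an uninformative Dirichlet distribution (one plausible way to vary composition). "Superior" here simply means a higher average sensitivity across causes.

```python
import random

# Hypothetical per-cause sensitivities for a two-cause list.
# Method A is strong on cause 1 and weak on cause 2 (average 0.6).
# Method B is uniformly good (average 0.7), so it is the better method
# once every cause is given equal weight.
METHOD_A = [0.9, 0.3]
METHOD_B = [0.7, 0.7]


def expected_concordance(csmf, sensitivity):
    """Expected share of deaths assigned correctly in a dataset whose
    cause composition is `csmf` (fractions summing to 1)."""
    return sum(f * s for f, s in zip(csmf, sensitivity))


def dirichlet_draw(n_causes, rng):
    """Draw one cause composition from a Dirichlet with all alphas = 1,
    built from normalized gamma draws."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(n_causes)]
    total = sum(draws)
    return [d / total for d in draws]


def fraction_of_datasets_a_wins(n_datasets=1000, seed=0):
    """Share of randomly composed test datasets on which the weaker
    method A shows higher overall concordance than method B."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_datasets):
        csmf = dirichlet_draw(2, rng)
        if expected_concordance(csmf, METHOD_A) > expected_concordance(csmf, METHOD_B):
            wins += 1
    return wins / n_datasets


# A single test dataset dominated by cause 1 flatters method A.
skewed = [0.9, 0.1]
a_on_skewed = expected_concordance(skewed, METHOD_A)  # 0.9*0.9 + 0.1*0.3
b_on_skewed = expected_concordance(skewed, METHOD_B)  # 0.9*0.7 + 0.1*0.7
share_a_wins = fraction_of_datasets_a_wins()
```

On the skewed dataset, method A looks better than method B, while across many differently composed datasets it wins only a minority of the time. A comparison based on a single test dataset can therefore rank the methods the wrong way round.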
The metrics examined in this study are crucial for objective assessment of the range of VA methods currently in use. Using the most robust metrics available will make it possible to identify the best methods and so determine causes of death most accurately in populations without vital registration systems. These metrics provide a standardized basis of comparison, so that different VA methods can be compared across studies and populations. They will also facilitate the development of new methods by giving a clear answer as to whether a new method improves performance in assigning causes of death through VA.