Analysis of Correlated ROC Areas in Diagnostic Testing
Hae Hiang Song
Biometrics, Vol. 53, No. 1 (Mar., 1997), pp. 370-382
Published by: International Biometric Society
Stable URL: http://www.jstor.org/stable/2533123
Page Count: 13
This paper focuses on methods for analyzing areas under receiver operating characteristic (ROC) curves. Analysis of ROC areas should account for the correlation structure of repeated measurements taken on the same set of cases and for the paucity of measurements per treatment that results from summarizing the cases into a few area measures of diagnostic accuracy. The repeated nature of ROC data has been taken into consideration in the analysis methods previously suggested by Swets and Pickett (1982, Evaluation of Diagnostic Systems: Methods from Signal Detection Theory), Hanley and McNeil (1983, Radiology 148, 839-843), and DeLong, DeLong, and Clarke-Pearson (1988, Biometrics 44, 837-845). DeLong et al.'s procedure is extended to a Wald test for general situations of diagnostic testing. The method of analyzing jackknife pseudovalues by treating them as data is extremely useful when the number of area measures to be tested is quite small. The Wald test based on covariances of multivariate multisample U-statistics is compared with two approaches to analyzing pseudovalues: the univariate mixed-model analysis of variance (ANOVA) for repeated measurements and the three-way factorial ANOVA. Monte Carlo simulations demonstrate that the three tests approximate the nominal size well at the 5% level for large sample sizes, but the paired t-test using ROC areas as data lacks the power of the other three tests, and Hanley and McNeil's method is inappropriate for testing diagnostic accuracies. The Wald statistic performs better than the ANOVAs of pseudovalues. Multiple-deletion jackknifing schemes that account for the different structures of the normal and diseased distributions appear to perform slightly better than simple multiple-deletion schemes, but no appreciable power difference is apparent, and deleting too many cases at a time may sacrifice power. These methods have important applications in diagnostic testing in ROC studies of radiology and of medicine in general.
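To make the idea of a Wald test on correlated ROC areas concrete, the following is a minimal sketch of the simplest special case: two diagnostic tests read on the same cases, with the covariance of the two areas estimated from the structural components of the Mann-Whitney U-statistic in the style of DeLong et al. (1988). It is not the paper's general procedure or its code. The function names (`psi`, `delong_components`, `compare_two_aucs`) and the toy scores are assumptions introduced purely for illustration, and the contrast is fixed at the difference of two areas, so the statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math

def psi(x, y):
    """Mann-Whitney kernel: 1 if the diseased score exceeds the normal score, 0.5 on a tie."""
    return 1.0 if x > y else (0.5 if x == y else 0.0)

def delong_components(diseased, normal):
    """Return the empirical AUC and its per-case structural components."""
    m, n = len(diseased), len(normal)
    v10 = [sum(psi(x, y) for y in normal) / n for x in diseased]  # one value per diseased case
    v01 = [sum(psi(x, y) for x in diseased) / m for y in normal]  # one value per normal case
    return sum(v10) / m, v10, v01

def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    k = len(a)
    mean_a, mean_b = sum(a) / k, sum(b) / k
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (k - 1)

def compare_two_aucs(dis_a, norm_a, dis_b, norm_b):
    """Wald test of equal AUCs for two tests scored on the SAME cases.

    dis_a[i] and dis_b[i] must be the two tests' scores for diseased case i,
    and likewise for the normal cases; the pairing carries the correlation.
    """
    auc_a, v10_a, v01_a = delong_components(dis_a, norm_a)
    auc_b, v10_b, v01_b = delong_components(dis_b, norm_b)
    m, n = len(dis_a), len(norm_a)

    def cov(v10_1, v01_1, v10_2, v01_2):
        # Covariance of two AUC estimates: S10 / m + S01 / n
        return sample_cov(v10_1, v10_2) / m + sample_cov(v01_1, v01_2) / n

    var_a = cov(v10_a, v01_a, v10_a, v01_a)
    var_b = cov(v10_b, v01_b, v10_b, v01_b)
    cov_ab = cov(v10_a, v01_a, v10_b, v01_b)

    # Variance of the contrast L = (1, -1) applied to (auc_a, auc_b)
    var_diff = var_a + var_b - 2.0 * cov_ab
    if var_diff <= 0.0:
        raise ValueError("estimated variance of the AUC difference is not positive")

    wald = (auc_a - auc_b) ** 2 / var_diff
    # Upper tail of a chi-square with 1 df: P(X > w) = erfc(sqrt(w / 2))
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return auc_a, auc_b, wald, p_value

# Made-up scores for five diseased and five normal cases under two tests.
dis_a, norm_a = [0.9, 0.8, 0.7, 0.6, 0.55], [0.5, 0.4, 0.65, 0.3, 0.2]
dis_b, norm_b = [0.7, 0.4, 0.8, 0.5, 0.6], [0.6, 0.3, 0.5, 0.45, 0.2]
print(compare_two_aucs(dis_a, norm_a, dis_b, norm_b))
```

With more than two area measures, the same ingredients would give a statistic of the form (Lθ)'(LSL')⁻¹(Lθ) for a contrast matrix L and estimated covariance matrix S. The two-test version above avoids the matrix inverse so that it stays within the standard library.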
Biometrics © 1997 International Biometric Society