Procedures for the Identification of Multiple Outliers in Linear Models
Ali S. Hadi and Jeffrey S. Simonoff
Journal of the American Statistical Association
Vol. 88, No. 424 (Dec., 1993), pp. 1264-1272
Stable URL: http://www.jstor.org/stable/2291266
Page Count: 9
We consider the problem of identifying and testing multiple outliers in linear models. The available outlier identification methods often do not succeed in detecting multiple outliers because they are affected by the observations they are supposed to identify. We introduce two test procedures for the detection of multiple outliers that appear to be less sensitive to this problem. Both procedures attempt to separate the data into a set of "clean" data points and a set of points that contain the potential outliers. The potential outliers are then tested to see how extreme they are relative to the clean subset, using an appropriately scaled version of the prediction error. The procedures are illustrated and compared to various existing methods, using several data sets known to contain multiple outliers. Also, the performances of both procedures are investigated by a Monte Carlo study. The data sets and the Monte Carlo study indicate that both procedures are effective in the detection of multiple outliers in linear models and are superior to other methods, including methods based on robust fits (e.g., least median of squares residuals). In particular, the methods do not require presetting the number of outliers to test for, do not require specifying the efficiency level of an estimator, do not require Monte Carlo simulation to determine cutoff values, are not highly computationally intensive, and are relatively resistant to both masking and swamping effects.
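As a rough illustration of the idea of testing points against a clean subset, the following minimal sketch fits ordinary least squares to a given clean subset and computes a scaled prediction error for every remaining point. It is not the authors' procedure: the abstract does not say how the clean subset is constructed or which cutoff is used, so the function name `scaled_prediction_errors`, the argument `clean_idx`, and the particular scaling (residual standard deviation of the clean fit times the square root of one plus the leverage-like term) are assumptions made for illustration only.

```python
import numpy as np

def scaled_prediction_errors(X, y, clean_idx):
    """Illustrative only: scaled prediction errors of the points
    outside an assumed 'clean' subset, relative to an OLS fit on it.

    X must already contain an intercept column if one is wanted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Boolean mask marking the assumed clean observations.
    clean = np.zeros(len(y), dtype=bool)
    clean[clean_idx] = True
    Xc, yc = X[clean], y[clean]

    # Least-squares fit using the clean subset only.
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    dof = Xc.shape[0] - Xc.shape[1]
    sigma = np.sqrt(resid @ resid / dof)

    # Leverage-like term x_i' (Xc' Xc)^{-1} x_i for each remaining point.
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    Xo = X[~clean]
    h = np.einsum("ij,jk,ik->i", Xo, XtX_inv, Xo)

    # Prediction error scaled by its estimated standard deviation.
    d = (y[~clean] - Xo @ beta) / (sigma * np.sqrt(1.0 + h))
    return np.flatnonzero(~clean), d
```

Points whose scaled prediction error is large in absolute value are the candidates for being declared outliers. How the clean subset is chosen and what cutoff is applied are exactly the parts the paper itself specifies, and they are deliberately left open here.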
Journal of the American Statistical Association © 1993 American Statistical Association