Sliced Inverse Regression for Dimension Reduction

Ker-Chau Li
Journal of the American Statistical Association
Vol. 86, No. 414 (Jun., 1991), pp. 316-327
DOI: 10.2307/2290563
Stable URL: http://www.jstor.org/stable/2290563
Page Count: 12

Abstract

Modern advances in computing power have greatly widened scientists' scope in gathering and investigating information from many variables, information which might have been ignored in the past. Yet to effectively scan a large pool of variables is not an easy task, although our ability to interact with data has been much enhanced by recent innovations in dynamic graphics. In this article, we propose a novel data-analytic tool, sliced inverse regression (SIR), for reducing the dimension of the input variable x without going through any parametric or nonparametric model-fitting process. This method explores the simplicity of the inverse view of regression; that is, instead of regressing the univariate output variable y against the multivariate x, we regress x against y. Forward regression and inverse regression are connected by a theorem that motivates this method. The theoretical properties of SIR are investigated under a model of the form y = f(β_1 x, ..., β_K x, ε), where the β_k's are the unknown row vectors. This model looks like a nonlinear regression, except for the crucial difference that the functional form of f is completely unknown. For effectively reducing the dimension, we need only to estimate the space [effective dimension reduction (e.d.r.) space] generated by the β_k's. This makes our goal different from the usual one in regression analysis, the estimation of all the regression coefficients. In fact, the β_k's themselves are not identifiable without a specific structural form on f. Our main theorem shows that under a suitable condition, if the distribution of x has been standardized to have the zero mean and the identity covariance, the inverse regression curve, E(x ∣ y), will fall into the e.d.r. space. Hence a principal component analysis on the covariance matrix for the estimated inverse regression curve can be conducted to locate its main orientation, yielding our estimates for e.d.r. directions.
Furthermore, we use a simple step function to estimate the inverse regression curve. No complicated smoothing is needed. SIR can be easily implemented on personal computers. By simulation, we demonstrate how SIR can effectively reduce the dimension of the input variable from, say, 10 to K = 2 for a data set with 400 observations. The spin-plot of y against the two projected variables obtained by SIR is found to mimic the spin-plot of y against the true directions very well. A chi-squared statistic is proposed to address the issue of whether or not a direction found by SIR is spurious.
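
The steps the abstract outlines (standardize x, estimate the inverse regression curve E(x ∣ y) with a step function over slices of y, then run a principal component analysis on the covariance of that estimate) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's own code: the function name `sir_directions`, the parameters `n_slices` and `n_directions`, the use of equal-count slices, and the particular f in the demo are all choices made here for the example. The chi-squared check for spurious directions is not included.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=2):
    """Illustrative SIR sketch: returns (directions, eigenvalues).

    directions has shape (p, n_directions); its columns estimate
    e.d.r. directions on the original scale of x.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize x to zero mean and identity covariance.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T  # cov^(-1/2)
    Z = (X - mean) @ inv_sqrt

    # Slice the observations by sorted y (assumption: equal-count slices).
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Step-function estimate of E(z | y): one mean per slice, combined
    # into a covariance matrix weighted by slice proportions.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Principal component analysis of the slice-mean covariance.
    w, v = np.linalg.eigh(M)
    top = np.argsort(w)[::-1][:n_directions]

    # Map the leading eigenvectors back to the original x scale.
    directions = inv_sqrt @ v[:, top]
    return directions, w[top]


# Demo with the sizes mentioned in the abstract (10 inputs, 400
# observations, K = 2). The function f below is an arbitrary choice
# for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 10))
y_demo = (X_demo[:, 0] / (0.5 + (X_demo[:, 1] + 1.5) ** 2)
          + 0.5 * rng.normal(size=400))
dirs_demo, vals_demo = sir_directions(X_demo, y_demo)
print("leading eigenvalues:", vals_demo)
print("estimated directions (columns):")
print(np.round(dirs_demo, 2))
```

Only the span of the returned columns is meaningful, since the abstract notes that the individual β_k's are not identifiable; comparing the estimate with a known subspace therefore calls for projections or angles rather than coefficient-by-coefficient comparison.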
