A High Dimensional Two Sample Significance Test

A. P. Dempster
The Annals of Mathematical Statistics
Vol. 29, No. 4 (Dec., 1958), pp. 995-1010
Stable URL: http://www.jstor.org/stable/2236942
Page Count: 16

Abstract

The classical multivariate two-sample significance test based on Hotelling's $T^2$ is undefined when the number $k$ of variables exceeds the number of within-sample degrees of freedom available for estimation of variances and covariances. Addition of an a priori Euclidean metric to the affine $k$-space assumed by the classical method leads to an alternative approach to the same problem. A test statistic $F$, the ratio of two mean square distances, is proposed, and three methods of attaching a significance level to $F$ are described. The third method is considered in detail and leads to a "non-exact" significance test in which the null-hypothesis distribution of $F$ depends, approximately, on a single unknown parameter $r$ for which an estimate must be substituted. Approximate distribution theory yields two independent estimates of $r$ based on nearly sufficient statistics, and these may be combined into a single estimate. A test of $F$ nominally at the 5% level, but based on an estimate of $r$ rather than on $r$ itself, has a true significance level which is a function of $r$. This function is investigated and shown to be quite near 5%. The sensitivity of the test to a parameter measuring statistical distance between population means is discussed, and it is shown that arbitrarily small differences in each individual variable can result in a detectable overall difference, provided the number of variables (or, more precisely, $r$) can be made sufficiently large. This sensitivity discussion has stated implications for the a priori choice of metric in $k$-space. Finally, a geometrical description of the case of large $r$ is presented.

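The statistic described in the abstract lends itself to a short numerical sketch. The following Python illustration assumes the commonly cited form of Dempster's proposal: the numerator is the scaled squared distance between sample means, the denominator is the trace of the pooled within-sample covariance, the null distribution of their ratio is approximated by $F(r, (n_1 + n_2 - 2)r)$, and $r$ is estimated here by the naive $(\operatorname{tr} S)^2 / \operatorname{tr}(S^2)$ rather than by the paper's combined estimate based on two nearly sufficient statistics. The function name dempster_two_sample and the use of numpy/scipy are illustrative conventions, not part of the paper.

import numpy as np
from scipy.stats import f as f_dist

def dempster_two_sample(x, y):
    """Dempster-style two-sample test; x is (n1, k), y is (n2, k)."""
    n1, k = x.shape
    n2, _ = y.shape
    n = n1 + n2

    # Numerator: squared distance between sample means, scaled so that its
    # null expectation equals tr(Sigma).
    diff = x.mean(axis=0) - y.mean(axis=0)
    q1 = (n1 * n2 / n) * diff @ diff

    # Denominator: trace of the pooled within-sample covariance.  The matrix
    # may be singular when k exceeds n - 2, but its trace is still defined.
    s_pooled = ((n1 - 1) * np.cov(x, rowvar=False)
                + (n2 - 1) * np.cov(y, rowvar=False)) / (n - 2)
    q2 = np.trace(s_pooled)

    f_stat = q1 / q2

    # Naive estimate of the effective degrees-of-freedom parameter r
    # (the paper instead combines two estimates into one).
    r_hat = np.trace(s_pooled) ** 2 / np.trace(s_pooled @ s_pooled)

    # Approximate null distribution: F with (r, (n - 2) r) degrees of freedom.
    p_value = f_dist.sf(f_stat, r_hat, (n - 2) * r_hat)
    return f_stat, p_value

# Example: 50 variables but only 20 within-sample degrees of freedom,
# so Hotelling's T^2 would be undefined.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 50))
y = rng.normal(size=(12, 50)) + 0.5   # small shift in every coordinate
f_stat, p = dempster_two_sample(x, y)
print(f"F = {f_stat:.3f}, approximate p-value = {p:.4f}")

Because only the trace of the pooled covariance enters the denominator, the ratio remains defined when the number of variables exceeds the within-sample degrees of freedom, which is exactly the regime the paper addresses.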