A High Dimensional Two Sample Significance Test

A. P. Dempster
The Annals of Mathematical Statistics
Vol. 29, No. 4 (Dec., 1958), pp. 995-1010
Stable URL: http://www.jstor.org/stable/2236942
Page Count: 16

Abstract

The classical multivariate 2 sample significance test based on Hotelling's $T^2$ is undefined when the number $k$ of variables exceeds the number of within sample degrees of freedom available for estimation of variances and covariances. Addition of an a priori Euclidean metric to the affine $k$-space assumed by the classical method leads to an alternative approach to the same problem. A test statistic $F$ which is the ratio of 2 mean square distances is proposed and 3 methods of attaching a significance level to $F$ are described. The third method is considered in detail and leads to a "non-exact" significance test where the null hypothesis distribution of $F$ depends, in approximation, on a single unknown parameter $r$ for which an estimate must be substituted. Approximate distribution theory leads to 2 independent estimates of $r$ based on nearly sufficient statistics and these may be combined to yield a single estimate. A test of $F$ nominally at the 5% level but based on an estimate of $r$ rather than $r$ itself has a true significance level which is a function of $r$. This function is investigated and shown to be quite near 5%. The sensitivity of the test to a parameter measuring statistical distance between population means is discussed and it is shown that arbitrarily small differences in each individual variable can result in a detectable overall difference provided the number of variables (or, more precisely, $r$) can be made sufficiently large. This sensitivity discussion has stated implications for the a priori choice of metric in $k$-space. Finally a geometrical description of the case of large $r$ is presented.
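The abstract does not state the formula for $F$, so the following is only a minimal sketch of one plausible reading of "the ratio of 2 mean square distances" under the ordinary Euclidean metric: a scaled squared distance between the two sample means, divided by the pooled within-sample squared distance per degree of freedom. The helper name `mean_square_distance_ratio`, the scaling factor $n_1 n_2/(n_1+n_2)$, and the simulated data are assumptions made for illustration and are not taken from the paper. The sketch also checks that the pooled covariance matrix is rank deficient when $k$ exceeds the within-sample degrees of freedom, which is why Hotelling's $T^2$ cannot be computed in that setting. It does not attach a significance level, since that requires the parameter $r$ and its estimates, which the abstract only names.

```python
import numpy as np


def mean_square_distance_ratio(x, y):
    """Hypothetical helper: between-sample over within-sample mean square distance.

    x has shape (n1, k) and y has shape (n2, k); rows are observations.
    The Euclidean metric on k-space is assumed.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]
    dof = n1 + n2 - 2  # within-sample degrees of freedom

    # Squared distance between sample means, scaled (an assumption) so that
    # numerator and denominator are on the same scale under equal means.
    diff = x.mean(axis=0) - y.mean(axis=0)
    between = (n1 * n2 / (n1 + n2)) * (diff @ diff)

    # Pooled squared distances of observations from their own sample mean,
    # divided by the within-sample degrees of freedom.
    within_x = ((x - x.mean(axis=0)) ** 2).sum()
    within_y = ((y - y.mean(axis=0)) ** 2).sum()
    within = (within_x + within_y) / dof

    return between / within


# Simulated data with far more variables than observations.
rng = np.random.default_rng(0)
k, n1, n2 = 50, 6, 7
x = rng.normal(size=(n1, k))
y = rng.normal(size=(n2, k))

# The pooled covariance matrix has rank at most n1 + n2 - 2 < k, so it cannot
# be inverted and Hotelling's T^2 is undefined here.
pooled = (
    np.cov(x, rowvar=False) * (n1 - 1) + np.cov(y, rowvar=False) * (n2 - 1)
) / (n1 + n2 - 2)
pooled_rank = np.linalg.matrix_rank(pooled)

# The ratio is still computable, and it grows when every variable is shifted.
f_equal_means = mean_square_distance_ratio(x, y)
f_shifted = mean_square_distance_ratio(x + 1.0, y)
```

Shifting every coordinate of one sample leaves the denominator unchanged and inflates the numerator. This is the mechanism behind the abstract's remark that small differences in each variable can add up to a detectable overall difference when the number of variables is large.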
