Jackknifing and Bootstrapping Goodness-of-Fit Statistics in Sparse Multinomials

Jeffrey S. Simonoff
Journal of the American Statistical Association
Vol. 81, No. 396 (Dec., 1986), pp. 1005-1011
DOI: 10.2307/2289075
Stable URL: http://www.jstor.org/stable/2289075
Page Count: 7

Abstract

Many goodness-of-fit problems can be put in the following form: $n = (n_1, \ldots, n_K)$ is a multinomial $(N, \pi)$ vector, and it is desired to test the composite null hypothesis $H_0: \pi = p(\theta)$ against all possible alternatives. The usual tests used are Pearson's (1900) statistic $X^2 = \sum_{i=1}^{K} [n_i - Np_i(\hat{\theta})]^2 / [Np_i(\hat{\theta})]$ or the likelihood ratio statistic (Neyman and Pearson 1928) $G^2 = 2\sum_{i=1}^{K} n_i \log(n_i / [Np_i(\hat{\theta})])$. Cressie and Read (1984) pointed out that both of these statistics are in the power family of statistics $2NI^{\lambda} = (2/[\lambda(\lambda+1)]) \sum_{i=1}^{K} n_i [(n_i / [Np_i(\hat{\theta})])^{\lambda} - 1]$, with $\lambda = 1$ and $\lambda = 0$, respectively; they suggested an alternative statistic with $\lambda = 2/3$. Although all of these statistics are asymptotically $\chi^2$ in the usual situation of $K$ fixed and $N \to \infty$, this is not the case if the multinomial is sparse; specifically, Morris (1975) showed that, under certain regularity conditions with $K$ and $N \to \infty$, both $X^2$ and $G^2$ are asymptotically normal (with different mean and variance) under a simple null hypothesis. Cressie and Read (1984) extended these results to the general $2NI^{\lambda}$ family. Although these results have not been proven for composite nulls, it is certainly reasonable to expect that they continue to hold. Clearly, testing would require an estimate of the variance of the statistic that is valid under composite hypotheses in the sparse situation. In this article the use of nonparametric techniques to estimate these variances is examined. Simulations indicate (and heuristic arguments support) that although the bootstrap (Efron 1979) does not lead to a consistent variance estimate, the parametric bootstrap, the jackknife (Miller 1974), and a "categorical jackknife" (in which cells are deleted rather than observations) each lead to a consistent estimate. Simulations indicate that the jackknife is the nonparametric estimator of choice, and that it is superior to the usual asymptotic formula for sparse data. Although these comparisons are based on the unconditional variance of the statistics, it is shown that the unconditional variance and the variance conditional on fitted parameter estimates are asymptotically equal if the underlying probability vector is from the general exponential family. Simulations also indicate that the jackknife estimate of variance is the estimator of choice in general parametric models for multinomial data.
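To make the two central objects of the abstract concrete, here is a minimal Python sketch of the power-divergence statistic $2NI^{\lambda}$ and of an observation-level (delete-one) jackknife variance estimate for it. This is not the paper's code: the function names `power_divergence` and `jackknife_variance`, the refitting callable `fit`, and the equiprobable simple null in the usage example are all illustrative assumptions.

```python
import numpy as np

def power_divergence(n, p_hat, lam):
    """Cressie-Read statistic 2N*I^lambda for observed counts n and
    fitted cell probabilities p_hat.

    lam = 1   -> Pearson's X^2
    lam = 0   -> likelihood-ratio G^2 (taken as the lambda -> 0 limit)
    lam = 2/3 -> the Cressie and Read (1984) recommendation
    """
    n = np.asarray(n, dtype=float)
    N = n.sum()
    e = N * np.asarray(p_hat, dtype=float)   # expected counts N * p_i(theta_hat)
    if lam == 0:                             # G^2 limit; empty cells contribute 0
        mask = n > 0
        return 2.0 * np.sum(n[mask] * np.log(n[mask] / e[mask]))
    return (2.0 / (lam * (lam + 1.0))) * np.sum(n * ((n / e) ** lam - 1.0))

def jackknife_variance(n, fit, lam):
    """Delete-one-observation jackknife variance of 2N*I^lambda.

    `fit` maps a count vector to fitted probabilities p(theta_hat) and is
    re-applied to every leave-one-out sample, as a composite null requires.
    Only K distinct leave-one-out statistics exist (one per nonempty cell),
    each occurring n_i times, so we compute each once and weight by n_i.
    """
    n = np.asarray(n, dtype=float)
    N = n.sum()
    loo = np.zeros_like(n)
    for i in np.nonzero(n)[0]:
        m = n.copy()
        m[i] -= 1.0                          # remove one observation from cell i
        loo[i] = power_divergence(m, fit(m), lam)
    t_bar = np.sum(n * loo) / N              # mean over all N deleted samples
    return (N - 1.0) / N * np.sum(n * (loo - t_bar) ** 2)

# Usage (hypothetical): an equiprobable simple null in a sparse table, N/K = 2,
# so `fit` ignores the counts and returns the fixed cell probabilities.
rng = np.random.default_rng(0)
K, N = 50, 100
counts = rng.multinomial(N, np.full(K, 1.0 / K))
fit = lambda m: np.full(K, 1.0 / K)
stat = power_divergence(counts, fit(counts), 2 / 3)
var_hat = jackknife_variance(counts, fit, 2 / 3)
```

Under a composite null, `fit` would instead re-estimate $\theta$ on each leave-one-out sample; the "categorical jackknife" mentioned in the abstract would delete whole cells rather than single observations.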
