Jackknifing and Bootstrapping Goodness-of-Fit Statistics in Sparse Multinomials
Jeffrey S. Simonoff
Journal of the American Statistical Association
Vol. 81, No. 396 (Dec., 1986), pp. 1005-1011
Stable URL: http://www.jstor.org/stable/2289075
Page Count: 7
Many goodness-of-fit problems can be put in the following form: $\mathbf{n} = (n_1, \ldots, n_K)$ is a multinomial $(N, \boldsymbol{\pi})$ vector, and it is desired to test the composite null hypothesis $H_0\colon \boldsymbol{\pi} = \mathbf{p}(\theta)$ against all possible alternatives. The usual tests are Pearson's (1900) statistic $X^2 = \sum_{i=1}^{K} [n_i - Np_i(\hat\theta)]^2 / [Np_i(\hat\theta)]$ and the likelihood ratio statistic (Neyman and Pearson 1928) $G^2 = 2\sum_{i=1}^{K} n_i \log(n_i/[Np_i(\hat\theta)])$. Cressie and Read (1984) pointed out that both of these statistics belong to the power family $2NI^\lambda = (2/[\lambda(\lambda+1)]) \sum_{i=1}^{K} n_i[(n_i/[Np_i(\hat\theta)])^\lambda - 1]$, with $\lambda = 1$ and the limit $\lambda \to 0$, respectively; they suggested an alternative statistic with $\lambda = 2/3$. Although all of these statistics are asymptotically $\chi^2$ in the usual situation of $K$ fixed and $N \to \infty$, this is not the case if the multinomial is sparse; specifically, Morris (1975) showed that, under certain regularity conditions with both $K \to \infty$ and $N \to \infty$, $X^2$ and $G^2$ are asymptotically normal (with different means and variances) under a simple null hypothesis. Cressie and Read (1984) extended these results to the general $2NI^\lambda$ family. Although these results have not been proved for composite nulls, it is reasonable to expect that they continue to hold. Testing then requires an estimate of the variance of the statistic that is valid under composite hypotheses in the sparse situation. This article examines the use of nonparametric techniques to estimate these variances. Simulations indicate (and heuristic arguments support) that although the bootstrap (Efron 1979) does not lead to a consistent variance estimate, the parametric bootstrap, the jackknife (Miller 1974), and a "categorical jackknife" (in which cells, rather than observations, are deleted) each lead to a consistent estimate. Simulations indicate that the jackknife is the nonparametric estimator of choice, and that it is superior to the usual asymptotic formula for sparse data.
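The power-divergence family described above is straightforward to evaluate directly. The sketch below is a hypothetical implementation, not code from the paper (the function name and signature are assumptions): it computes $2NI^\lambda$ from observed counts and fitted cell probabilities, recovering Pearson's $X^2$ at $\lambda = 1$ and $G^2$ in the $\lambda \to 0$ limit.

```python
import numpy as np

def power_divergence(n, p_hat, lam):
    """Cressie-Read power-divergence statistic 2NI^lambda for observed
    counts n and fitted cell probabilities p_hat.  lam=1 gives Pearson's
    X^2, lam=0 the likelihood ratio G^2 (as the lambda -> 0 limit), and
    lam=2/3 the Cressie-Read recommendation."""
    n = np.asarray(n, dtype=float)
    N = n.sum()
    e = N * np.asarray(p_hat, dtype=float)  # expected counts N * p_i(theta_hat)
    if lam == 0:
        # lambda -> 0 limit: G^2 = 2 * sum n_i log(n_i / e_i), with 0 log 0 = 0
        mask = n > 0
        return 2.0 * np.sum(n[mask] * np.log(n[mask] / e[mask]))
    return (2.0 / (lam * (lam + 1.0))) * np.sum(n * ((n / e) ** lam - 1.0))
```

For $\lambda = 1$ the expression collapses to $\sum n_i^2/e_i - N$, which is algebraically identical to Pearson's $X^2$; the `lam == 0` branch supplies the limiting $G^2$ form, since the general formula is indeterminate there.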
Although these comparisons are based on the unconditional variance of the statistics, it is shown that the unconditional variance and the variance conditional on fitted parameter estimates are asymptotically equal if the underlying probability vector is from the general exponential family. Simulations also indicate that the jackknife estimate of variance is the estimator of choice in general parametric models for multinomial data.
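To illustrate the delete-one-observation jackknife for such statistics, here is a minimal sketch (assumed names and interfaces, not the author's code). Because observations in the same cell are exchangeable, the $N$ leave-one-out values collapse to at most $K$ distinct recomputations, one per nonempty cell, weighted by the cell counts.

```python
import numpy as np

def jackknife_variance(n, fit, stat):
    """Delete-one-observation jackknife variance of a goodness-of-fit
    statistic on multinomial counts n.  `fit(m)` returns fitted cell
    probabilities p_i(theta_hat) for counts m; `stat(m, p)` returns the
    statistic.  Deleting any one of the n_i observations in cell i gives
    the same leave-one-out value, so only K refits are needed."""
    n = np.asarray(n, dtype=float)
    N = int(n.sum())
    loo = np.zeros(len(n))
    for i in range(len(n)):
        if n[i] == 0:
            continue
        m = n.copy()
        m[i] -= 1  # remove one observation from cell i and refit
        loo[i] = stat(m, fit(m))
    tbar = np.sum(n * loo) / N  # mean over all N leave-one-out values
    return (N - 1) / N * np.sum(n * (loo - tbar) ** 2)
```

The weights $n_i$ turn the usual sum over $N$ pseudo-values into a sum over cells, so the cost is $K$ refits rather than $N$, which matters precisely in the sparse setting where $K$ is large.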
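The parametric bootstrap variance estimate mentioned above can be sketched similarly (again with assumed names, not the paper's code): replicate counts are drawn from the multinomial at the *fitted* null probabilities, rather than at the empirical cell frequencies used by the ordinary bootstrap, which the abstract notes does not yield a consistent estimate here.

```python
import numpy as np

def parametric_bootstrap_variance(n, fit, stat, B=1000, seed=0):
    """Parametric bootstrap variance estimate for a goodness-of-fit
    statistic: draw B samples from Multinomial(N, p(theta_hat)), refit
    the model on each replicate, and return the sample variance of the
    statistic across replicates."""
    rng = np.random.default_rng(seed)
    n = np.asarray(n, dtype=float)
    N = int(n.sum())
    p_hat = np.asarray(fit(n), dtype=float)  # fitted null probabilities
    reps = np.empty(B)
    for b in range(B):
        m = rng.multinomial(N, p_hat)  # resample under the fitted null
        reps[b] = stat(m, fit(m))
    return reps.var(ddof=1)
```

Refitting inside the loop (rather than reusing `p_hat`) is what makes the estimate appropriate under a composite hypothesis, at the cost of `B` model fits.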
Journal of the American Statistical Association © 1986 American Statistical Association