A Bayesian Analysis of Some Nonparametric Problems

Thomas S. Ferguson
The Annals of Statistics
Vol. 1, No. 2 (Mar., 1973), pp. 209-230
Stable URL: http://www.jstor.org/stable/2958008
Page Count: 22

Abstract

The Bayesian approach to statistical problems, though fruitful in many ways, has been rather unsuccessful in treating nonparametric problems. This is due primarily to the difficulty in finding workable prior distributions on the parameter space, which in nonparametric problems is taken to be a set of probability distributions on a given sample space. There are two desirable properties of a prior distribution for nonparametric problems. (I) The support of the prior distribution should be large with respect to some suitable topology on the space of probability distributions on the sample space. (II) Posterior distributions given a sample of observations from the true probability distribution should be manageable analytically. These properties are antagonistic in the sense that one may be obtained at the expense of the other.

This paper presents a class of prior distributions, called Dirichlet process priors, broad in the sense of (I), for which (II) is realized, and for which treatment of many nonparametric statistical problems may be carried out, yielding results that are comparable to the classical theory.

In Section 2, we review the properties of the Dirichlet distribution needed for the description of the Dirichlet process given in Section 3. Briefly, this process may be described as follows. Let $\mathcal{X}$ be a space and $\mathcal{A}$ a σ-field of subsets, and let $\alpha$ be a finite non-null measure on $(\mathcal{X}, \mathcal{A})$. Then a stochastic process $P$, indexed by the elements $A$ of $\mathcal{A}$, is said to be a Dirichlet process on $(\mathcal{X}, \mathcal{A})$ with parameter $\alpha$ if for any measurable partition $(A_1, \ldots, A_k)$ of $\mathcal{X}$, the random vector $(P(A_1), \ldots, P(A_k))$ has a Dirichlet distribution with parameter $(\alpha(A_1), \ldots, \alpha(A_k))$.
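The defining property above can be checked numerically on a single partition. The following is a minimal sketch, not the paper's construction: it assumes the sample space is [0, 1], that the parameter is a hypothetical `total_mass` times Lebesgue measure, and that a Dirichlet vector is drawn by normalizing independent gamma variables. The function names are illustrative only.

```python
import random

def sample_dirichlet(params, rng):
    """Draw one Dirichlet vector by normalizing independent Gamma(a, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(gammas)
    return [g / total for g in gammas]

def partition_masses(cut_points, total_mass):
    """Return alpha(A_j) for the intervals of [0, 1] cut at cut_points.

    Assumption: alpha = total_mass * Lebesgue measure on [0, 1].
    """
    edges = [0.0] + sorted(cut_points) + [1.0]
    return [total_mass * (b - a) for a, b in zip(edges, edges[1:])]

rng = random.Random(0)

# Partition [0, 1] into [0, 0.2), [0.2, 0.5), [0.5, 1] with alpha([0, 1]) = 4.
alpha_masses = partition_masses([0.2, 0.5], total_mass=4.0)

# Each draw is one realization of (P(A_1), P(A_2), P(A_3)).
draws = [sample_dirichlet(alpha_masses, rng) for _ in range(20000)]

# The average of P(A_j) should be close to alpha(A_j) / alpha([0, 1]).
mean_masses = [sum(d[j] for d in draws) / len(draws) for j in range(3)]
```

Each draw sums to one, and `mean_masses` should sit near the normalized parameter (0.2, 0.3, 0.5), since the mean of a Dirichlet component is its parameter divided by the total.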
$P$ may be considered a random probability measure on $(\mathcal{X}, \mathcal{A})$. The main theorem states that if $P$ is a Dirichlet process on $(\mathcal{X}, \mathcal{A})$ with parameter $\alpha$, and if $X_1, \ldots, X_n$ is a sample from $P$, then the posterior distribution of $P$ given $X_1, \ldots, X_n$ is also a Dirichlet process on $(\mathcal{X}, \mathcal{A})$, with parameter $\alpha + \sum_{i=1}^{n} \delta_{X_i}$, where $\delta_x$ denotes the measure giving mass one to the point $x$.

In Section 4, an alternative definition of the Dirichlet process is given. This definition exhibits a version of the Dirichlet process that gives probability one to the set of discrete probability measures on $(\mathcal{X}, \mathcal{A})$. This is in contrast to Dubins and Freedman [2], whose methods for choosing a distribution function on the interval $[0, 1]$ lead with probability one to singular continuous distributions. Methods of choosing a distribution function on $[0, 1]$ that with probability one is absolutely continuous have been described by Kraft [7]. The general method of choosing a distribution function on $[0, 1]$, described in Section 2 of Kraft and van Eeden [10], can of course be used to define the Dirichlet process on $[0, 1]$.

Special mention must be made of the papers of Freedman and Fabius. Freedman [5] defines a notion of tailfree for a distribution on the set of all probability measures on a countable space $\mathcal{X}$. For a tailfree prior, the posterior distribution given a sample from the true probability measure may be fairly easily computed. Fabius [3] extends the notion of tailfree to the case where $\mathcal{X}$ is the unit interval $[0, 1]$, but it is clear his extension may be made to cover quite general $\mathcal{X}$. With such an extension, the Dirichlet process would be a special case of a tailfree distribution for which the posterior distribution has a particularly simple form.

There are disadvantages to the fact that $P$ chosen by a Dirichlet process is discrete with probability one. These appear mainly because in sampling from a $P$ chosen by a Dirichlet process, we expect eventually to see one observation exactly equal to another.
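The repeated values can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not a procedure taken from the paper: it combines the main theorem with the standard fact that one observation from a Dirichlet process has marginal distribution given by the normalized parameter, so that the next observation, given the first i, is drawn from the normalized posterior parameter. The base measure is assumed to be a hypothetical `total_mass` times the uniform distribution on [0, 1], and `sample_sequence` is an illustrative name.

```python
import random

def sample_sequence(n, total_mass, rng):
    """Draw X_1, ..., X_n one at a time from the predictive distributions.

    Assumption: alpha = total_mass * Uniform[0, 1].
    """
    xs = []
    for i in range(n):
        # After i observations the posterior parameter is alpha plus i point
        # masses, so its total mass is total_mass + i.
        if rng.random() < total_mass / (total_mass + i):
            xs.append(rng.random())    # fresh value from the normalized alpha
        else:
            xs.append(rng.choice(xs))  # repeat an earlier value: an exact tie
    return xs

xs = sample_sequence(200, 1.0, random.Random(1))
n_distinct = len(set(xs))  # far fewer than 200 distinct values are expected
```

With `total_mass = 1.0` the chance that the observation at index `i` is new is `1 / (1 + i)`, so the number of distinct values grows only slowly with the sample size and ties dominate.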
For example, consider the goodness-of-fit problem of testing the hypothesis $H_0$ that a distribution on the interval $[0, 1]$ is uniform. If on the alternative hypothesis we place a Dirichlet process prior with parameter $\alpha$, itself a uniform measure on $[0, 1]$, and if we are given a sample of size $n \geq 2$, the only nontrivial nonrandomized Bayes rule is to reject $H_0$ if and only if two or more of the observations are exactly equal. This is really a test of the hypothesis that a distribution is continuous against the hypothesis that it is discrete. Thus, there is still a need for a prior that chooses a continuous distribution with probability one and yet satisfies properties (I) and (II).

Some applications in which the possible doubling up of the values of the observations plays no essential role are presented in Section 5. These include the estimation of a distribution function, of a mean, of quantiles, of a variance and of a covariance. A two-sample problem is considered in which the Mann-Whitney statistic, equivalent to the rank-sum statistic, appears naturally. A decision-theoretic upper tolerance limit for a quantile is also treated. Finally, a hypothesis testing problem concerning a quantile is shown to yield the sign test. In each of these problems, useful ways of combining prior information with the statistical observations appear.

Other applications exist. In his Ph.D. dissertation [1], Charles Antoniak finds a need to consider mixtures of Dirichlet processes. He treats several problems, including the estimation of a mixing distribution, bio-assay, empirical Bayes problems, and discrimination problems.
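As an illustration of how prior information and observations combine in the estimation of a distribution function, the sketch below computes the posterior mean of F(t) = P((-∞, t]). It relies only on the posterior parameter stated in the main theorem and on the mean of a Dirichlet component; the squared-error loss, the uniform prior guess `uniform_cdf`, and the name `total_mass` for the total mass of the parameter are assumptions made for the example.

```python
def uniform_cdf(t):
    """Prior guess at the distribution function: uniform on [0, 1] (assumption)."""
    return min(max(t, 0.0), 1.0)

def posterior_mean_cdf(t, data, total_mass, prior_cdf):
    """Posterior mean of F(t) under a Dirichlet process prior.

    Equals (alpha((-inf, t]) + #{x_i <= t}) / (total_mass + n), written here
    as a weighted average of the prior guess and the empirical distribution.
    """
    n = len(data)
    empirical = sum(1 for x in data if x <= t) / n if n else 0.0
    weight_on_prior = total_mass / (total_mass + n)
    return weight_on_prior * prior_cdf(t) + (1.0 - weight_on_prior) * empirical

data = [0.1, 0.4, 0.4, 0.9]
estimate = posterior_mean_cdf(0.5, data, total_mass=2.0, prior_cdf=uniform_cdf)
# weight on prior = 2/6; prior guess 0.5; empirical 3/4; estimate = 2/3
```

The total mass of the parameter acts like a prior sample size: a large `total_mass` keeps the estimate near the prior guess, while a small one lets the empirical distribution function dominate.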
