The Art of Data Augmentation

David A. van Dyk and Xiao-Li Meng
Journal of Computational and Graphical Statistics
Vol. 10, No. 1 (Mar., 2001), pp. 1-50
Stable URL: http://www.jstor.org/stable/1391021
Page Count: 50

Abstract

The term data augmentation refers to methods for constructing iterative optimization or sampling algorithms via the introduction of unobserved data or latent variables. For deterministic algorithms, the method was popularized in the general statistical community by the seminal article by Dempster, Laird, and Rubin on the EM algorithm for maximizing a likelihood function or, more generally, a posterior density. For stochastic algorithms, the method was popularized in the statistical literature by Tanner and Wong's Data Augmentation algorithm for posterior sampling and in the physics literature by Swendsen and Wang's algorithm for sampling from the Ising and Potts models and their generalizations; in the physics literature, the method of data augmentation is referred to as the method of auxiliary variables. Data augmentation schemes were used by Tanner and Wong to make simulation feasible and simple, while auxiliary variables were adopted by Swendsen and Wang to improve the speed of iterative simulation. In general, however, constructing data augmentation schemes that result in both simple and fast algorithms is a matter of art in that successful strategies vary greatly with the (observed-data) models being considered. After an overview of data augmentation/auxiliary variables and some recent developments in methods for constructing such efficient data augmentation schemes, we introduce an effective search strategy that combines the ideas of marginal augmentation and conditional augmentation, together with a deterministic approximation method for selecting good augmentation schemes. We then apply this strategy to three common classes of models (specifically, multivariate t, probit regression, and mixed-effects models) to obtain efficient Markov chain Monte Carlo algorithms for posterior sampling. 
We provide theoretical and empirical evidence that the resulting algorithms, while requiring similar programming effort, can show dramatic improvement over the Gibbs samplers commonly used for these models in practice. A key feature of all these new algorithms is that they are positive recurrent subchains of nonpositive recurrent Markov chains constructed in larger spaces.
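
To make the idea of sampling "via the introduction of unobserved data or latent variables" concrete, the following is a minimal sketch of a standard two-step data augmentation sampler for a univariate t location model. It is not the marginal or conditional augmentation algorithm developed in the article. It assumes a known scale `sigma`, known degrees of freedom `nu`, and a flat prior on the location `mu`. The function names `da_sampler_t_location` and `simulate_t` are hypothetical. The augmentation writes each t observation as a normal draw whose variance is divided by a latent gamma-distributed weight `q_i`.

```python
import math
import random


def da_sampler_t_location(y, nu, sigma, n_iter=2000, seed=0):
    """Sketch of a data augmentation sampler for mu in y_i ~ t_nu(mu, sigma^2).

    Assumptions (for illustration only): nu and sigma are known, and the
    prior on mu is flat. Latent weights q_i satisfy
        y_i | q_i, mu ~ N(mu, sigma^2 / q_i),   q_i ~ Gamma(nu/2, rate=nu/2).
    """
    rng = random.Random(seed)
    mu = sum(y) / len(y)  # start the chain at the sample mean
    draws = []
    for _ in range(n_iter):
        # Step 1: draw the latent weights given mu.
        # q_i | y_i, mu ~ Gamma(shape=(nu+1)/2, rate=(nu + z_i^2)/2);
        # random.gammavariate takes a SCALE parameter, so pass 1/rate.
        q = [
            rng.gammavariate((nu + 1) / 2.0,
                             2.0 / (nu + ((yi - mu) / sigma) ** 2))
            for yi in y
        ]
        # Step 2: draw mu given the weights.
        # mu | y, q ~ N(weighted mean of y, sigma^2 / sum(q)).
        q_sum = sum(q)
        weighted_mean = sum(qi * yi for qi, yi in zip(q, y)) / q_sum
        mu = rng.gauss(weighted_mean, sigma / math.sqrt(q_sum))
        draws.append(mu)
    return draws


def simulate_t(n, nu, mu, sigma, seed=1):
    """Simulate t_nu(mu, sigma^2) data as a normal scale mixture."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.f.
        data.append(mu + sigma * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu))
    return data


if __name__ == "__main__":
    y = simulate_t(n=200, nu=4.0, mu=3.0, sigma=1.0)
    draws = da_sampler_t_location(y, nu=4.0, sigma=1.0, n_iter=1000)
    kept = draws[200:]  # discard an arbitrary burn-in
    print("posterior mean of mu:", sum(kept) / len(kept))
```

Both conditional draws are from standard distributions, which is the sense in which augmentation makes simulation "feasible and simple". The speed of convergence of such a chain depends on how the latent variables are introduced, which is the question the abstract describes as a matter of art.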
