
# Bayesian Methods for Statistical Analysis (Open Access)

BOREK PUZA
Stable URL: http://www.jstor.org/stable/j.ctt1bgzbn2

1. Front Matter (pp. i-iv)
3. Abstract (pp. ix-x)
4. Acknowledgements (pp. xi-xii)
5. Preface (pp. xiii-xiv)
6. Overview (pp. xv-xviii)
7. *Bayesian methods* is a term which may be used to refer to any mathematical tools that are useful and relevant in some way to *Bayesian inference*, an approach to statistics based on the work of Thomas Bayes (1701–1761). Bayes was an English mathematician and Presbyterian minister who is best known for having formulated a basic version of the well-known *Bayes' Theorem*.

Figure 1.1 (page 3) shows part of the Wikipedia article for Thomas Bayes. Bayes’ ideas were later developed and generalised by many others, most notably the French mathematician Pierre-Simon Laplace (1749–1827) and the British astronomer Harold Jeffreys...

8. Consider a Bayesian model defined by a likelihood $f(y|\theta)$ and a prior $f(\theta)$, leading to the posterior

$f\left ( \theta |y \right ) = \frac{f\left ( \theta \right )f\left ( y|\theta \right )}{f\left ( y \right )}.$

Suppose that we choose to perform inference on θ by constructing a point estimate $\hat{\theta}$ (such as the posterior mean, mode or median) and a (1−α)-level interval estimate $I = (L, U)$ (such as the CPDR or HPDR).

Then $\hat{\theta}$, $I$, $L$ and $U$ are functions of the data $y$ and may be written $\hat{\theta}(y)$, $I(y)$, $L(y)$ and $U(y)$. Once these functions are defined, the estimates which they define stand on their own, so to speak, and may be studied from...
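Such point and interval estimates are easy to compute from a posterior sample. The book's own code is in R; the sketch below uses Python, with a Beta(6, 2) posterior chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior used purely for illustration: (theta | y) ~ Beta(6, 2).
theta = rng.beta(6, 2, size=100_000)

theta_hat = theta.mean()                   # point estimate: the posterior mean
L, U = np.quantile(theta, [0.025, 0.975])  # 0.95-level central interval (CPDR)

print(round(theta_hat, 2), round(float(L), 2), round(float(U), 2))
```

Here $\hat{\theta}(y)$, $L(y)$ and $U(y)$ are realised as summaries of the simulated posterior draws; the HPDR would require a slightly more involved search over intervals of equal density.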

9. Sometimes we observe a *function* of the data rather than the data itself. In such cases the function typically *degrades* the information available in some way. An example is *censoring*, where we observe a value only if that value is less than some cut-off point (right censoring) or greater than some cut-off value (left censoring). It is also possible to have censoring on the left and right simultaneously. Another example is *rounding*, where we only observe values to the nearest multiple of 0.1, 1 or 5, etc.

Each light bulb of a certain type has a life which is conditionally...
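The censoring and rounding mechanisms just described can be illustrated with a short simulation. This Python sketch (the book's examples use R) assumes exponential lifetimes with mean 1000 hours and a cut-off of 1200 hours, both invented numbers:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical lifetimes (exponential with mean 1000 hours, an assumed model),
# observed under right censoring at an assumed cut-off c = 1200 hours:
# we record min(y, c) together with a flag saying whether censoring occurred.
y = rng.exponential(scale=1000.0, size=10)
c = 1200.0
observed = np.minimum(y, c)   # what we actually see
censored = y > c              # True where we only learn that y exceeds c

# Rounding degrades the data differently: values to the nearest multiple of 5.
rounded = 5 * np.round(y / 5)

print(observed.round(1), int(censored.sum()))
```

For the censored observations, the likelihood contribution becomes the survival probability $P(Y > c)$ rather than the density at the observed value.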

10. In most of the Bayesian models so far examined, the calculations required could be done analytically. For example, the model given by:

(Y|θ) ~ Binomial(5, θ)

θ ~ U(0, 1),

together with data y = 5, implies the posterior (θ|y) ~ Beta(6, 1). So θ has posterior pdf $f(\theta|y) = 6\theta^{5}$ and posterior cdf $F(\theta|y) = \theta^{6}$. Then, setting $F(\theta|y) = 1/2$ yields the posterior median, $\theta = (1/2)^{1/6} = 0.8909$.

But what if the equation $F(\theta|y) = 1/2$ were not so easy to solve? In that case we could employ a number of strategies. One of these is *trial and error*, and another is via special functions in software packages,...
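Because the equation $\theta^{6} = 1/2$ can also be solved exactly, it makes a convenient check for a numerical strategy. The sketch below uses simple bisection in Python; a root-finder such as `scipy.optimize.brentq` would serve equally well:

```python
# Posterior cdf from the model above: F(theta | y) = theta**6 on (0, 1).
def posterior_cdf(theta):
    return theta ** 6

# Bisection: repeatedly halve a bracket known to contain the median.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if posterior_cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid

median = (lo + hi) / 2
print(round(median, 4))   # the analytic value is (1/2)**(1/6) = 0.8909
```

The same loop works for any monotone posterior cdf, which is what makes the numerical route attractive when no closed form exists.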

The term *Monte Carlo (MC) methods* refers to a broad collection of tools that are useful for approximating quantities based on artificially generated random samples. These include *Monte Carlo integration* (for estimating an integral using such a sample), the *inversion technique* (for generating the required sample), and *Markov chain Monte Carlo* methods (an advanced topic in Chapter 6). In principle, the approximation can be made as good as required simply by making the Monte Carlo sample size sufficiently large. As will be seen, Monte Carlo methods are a very useful tool in Bayesian inference.

To illustrate the...
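As a minimal sketch of the first two ideas together (in Python; the book's own code is in R): the Beta(6, 1) posterior met earlier has cdf $F(\theta) = \theta^{6}$, so the inversion technique turns Uniform(0, 1) draws into posterior draws via $F^{-1}(u) = u^{1/6}$, and Monte Carlo integration then estimates the posterior mean $E(\theta|y) = 6/7$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Inversion technique: F(theta) = theta**6, so F^{-1}(u) = u**(1/6)
# maps Uniform(0, 1) draws to draws from the Beta(6, 1) posterior.
u = rng.uniform(size=200_000)
theta = u ** (1 / 6)

# Monte Carlo integration: the sample mean estimates E(theta | y) = 6/7.
estimate = theta.mean()
print(round(estimate, 3))
```

Doubling the sample size shrinks the Monte Carlo standard error by a factor of $\sqrt{2}$, which is the sense in which the approximation can be made as good as required.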

12. Monte Carlo methods were introduced in the last chapter. These included basic techniques for generating a random sample and methods for using such a sample to estimate quantities such as difficult integrals. This chapter will focus on advanced techniques for generating a random sample, in particular the class of techniques known as *Markov chain Monte Carlo* (MCMC) methods. Applying an MCMC method involves designing a suitable Markov chain, running that chain for a burn-in period until stochastic convergence is reached, and then making appropriate use of the values generated after the burn-in period.

Like other iterative techniques such as the...
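The recipe just described (design a chain, run it through a burn-in period, then use the subsequent values) can be sketched with a random-walk Metropolis algorithm targeting the Beta(6, 1) posterior met earlier; this Python sketch assumes a proposal step of 0.1 as its tuning constant:

```python
import numpy as np

rng = np.random.default_rng(3)

# Target density, known only up to proportionality: f(theta|y) ∝ theta**5 on (0, 1).
def target(theta):
    return theta ** 5 if 0 < theta < 1 else 0.0

theta = 0.5                    # arbitrary starting value
chain = []
for _ in range(30_000):
    proposal = theta + rng.normal(scale=0.1)   # assumed tuning constant 0.1
    # Accept with probability min(1, target(proposal) / target(theta)).
    if rng.uniform() < target(proposal) / target(theta):
        theta = proposal
    chain.append(theta)

draws = np.array(chain[5_000:])   # discard a burn-in period of 5,000 values
print(round(draws.mean(), 2))     # true posterior mean is 6/7 = 0.857
```

Note that the normalising constant of the posterior is never needed, since it cancels in the acceptance ratio; this is the key practical advantage of the Metropolis algorithm.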

13. In the last chapter we introduced a set of very powerful tools for generating the samples required for Bayesian Monte Carlo inference, namely Markov chain Monte Carlo (MCMC) methods. The topics we covered included the Metropolis algorithm, the Metropolis-Hastings algorithm and the Gibbs sampler.

We now present one more topic, stochastic data augmentation, and provide some further exercises in MCMC. These exercises will illustrate how many statistical problems can be cast in the Bayesian framework and how easily inference can then proceed relative to the classical framework.

The examples below include simple linear regression, logistic regression (an example of generalised...
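As a small sketch of the Gibbs sampler mentioned above (in Python; the chapter's examples use R), consider a normal model with both mean and variance unknown under the standard noninformative prior $f(\mu, \sigma^{2}) \propto 1/\sigma^{2}$, whose full conditional distributions are available in closed form:

```python
import numpy as np

rng = np.random.default_rng(4)

# Gibbs sampler for y_i ~ N(mu, sigma2) with prior f(mu, sigma2) ∝ 1/sigma2.
# Full conditionals:
#   mu     | sigma2, y ~ N(ybar, sigma2 / n)
#   sigma2 | mu,     y ~ Inverse-Gamma(n / 2, sum((y_i - mu)**2) / 2)
y = rng.normal(loc=10.0, scale=2.0, size=50)   # simulated data (assumed values)
n, ybar = len(y), y.mean()

mu, sigma2 = ybar, y.var()                     # starting values
mus = []
for _ in range(10_000):
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    # Inverse-gamma draw via the reciprocal of a gamma draw.
    sigma2 = 1 / rng.gamma(n / 2, 2 / np.sum((y - mu) ** 2))
    mus.append(mu)

mu_draws = np.array(mus[1_000:])               # values after burn-in
print(round(mu_draws.mean(), 1))
```

Each sweep updates one parameter from its full conditional given the current value of the other, and no tuning constants are required, which is the Gibbs sampler's practical appeal over the Metropolis-Hastings algorithm.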

14. We have illustrated the usefulness of MCMC methods by applying them to a variety of statistical contexts. In each case, specialised R code was used to implement the chosen method. Writing such code is typically time consuming and requires a great deal of attention to details such as choosing suitable tuning constants in the Metropolis-Hastings algorithm.

A software package which can greatly assist with the application of MCMC methods is WinBUGS. This stands for:

Bayesian Inference Using Gibbs Sampling for Microsoft Windows.

The BUGS Project was started in 1989 by a team of statisticians in the UK (at the Medical...

15. In this chapter we will focus on the topic of Bayesian methods for finite population inference in the sample survey context. We have previously touched on this topic when considering posterior predictive inference of ‘future’ values in the context of the normal-normal-gamma model. The topic will now be treated more generally and systematically.

There are many and various ways in which Bayesian finite population inference can be categorised, for example:

- situations with and without prior information being available
- sampling with and without replacement
- Monte Carlo based methods versus deterministic (or 'exact') methods
- situations with and without auxiliary information being available...

16. Consider a finite population of $N$ values $y_{1},\ldots,y_{N}$ from the normal distribution with unknown mean μ and known variance $\sigma^{2}$. Assume we have prior information about μ which may be expressed in terms of a normal distribution with mean $\mu_{0}$ and variance $\sigma_{0}^{2}$.

Suppose that we are interested in the finite population mean, namely $\bar{y} = (y_{1} + \cdots + y_{N})/N$, and wish to perform inference on $\bar{y}$ based on the observed values in a sample of size $n$ taken from this finite population via simple random sampling without replacement (SRSWOR).

For convenience, we will in what follows label (or rather relabel) the $n$...
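The flavour of such inference can be previewed with a small Python simulation (all the specific numbers in it are assumptions, not the book's): given the $n$ sampled values, draw μ from its normal posterior, impute the $N - n$ nonsampled values, and read off the induced distribution of $\bar{y}$:

```python
import numpy as np

rng = np.random.default_rng(5)

# Assumed settings: population size, sample size, known sigma, normal prior on mu.
N, n = 1000, 40
sigma, mu0, sigma0 = 2.0, 0.0, 10.0
ys = rng.normal(5.0, sigma, size=n)            # the n observed sample values

# Normal-normal posterior for mu (precision-weighted combination).
prec = 1 / sigma0**2 + n / sigma**2
mu_post = (mu0 / sigma0**2 + n * ys.mean() / sigma**2) / prec

# Posterior predictive draws of the N - n nonsampled values, hence of ybar.
draws = []
for _ in range(5_000):
    mu = rng.normal(mu_post, np.sqrt(1 / prec))
    yr = rng.normal(mu, sigma, size=N - n)     # imputed nonsampled values
    draws.append((ys.sum() + yr.sum()) / N)

ybar_draws = np.array(draws)
L, U = np.quantile(ybar_draws, [0.025, 0.975])
print(round(ybar_draws.mean(), 2), round(float(L), 2), round(float(U), 2))
```

Because the $n$ sampled values are held fixed in every draw, the interval for $\bar{y}$ is narrower than the corresponding interval for μ would suggest, reflecting that part of the finite population is already known.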

17. So far, in the context of Bayesian finite population models specified by:

$f(\xi|y,\theta)$, where ξ is $s$ or $I$ or $L$ (as discussed earlier)

$f(y|\theta)$, where $y = (y_{s}, y_{r}) = ((y_{1},\ldots,y_{n}), (y_{n+1},\ldots,y_{N})) = (y_{1},\ldots,y_{N})$

$f(\theta)$, where $\theta = (\theta_{1},\ldots,\theta_{q})$,

we have been focusing primarily on two finite population quantities, the finite population total $y_{T} = y_{1} + \cdots + y_{N}$ and the finite population mean $\bar{y} = (y_{1} + \cdots + y_{N})/N = y_{T}/N$.

These are special cases of the class of linear combinations of the N population values

$\tilde{y} = c_{0} + c_{1}y_{1} + \cdots + c_{N}y_{N},$

for which inference is often straightforward, such as in the context of the general normal-normal-gamma finite population model.

We will...

18. We have already discussed the topic of ignorable and nonignorable sampling in the context of Bayesian finite population models. To be definite, let us now focus on the model defined by:

$f(s|y,\theta)$ (the probability of obtaining sample $s$ for given values of $y$ and θ)

$f(y|\theta)$ (the model density of the finite population vector)

$f(\theta)$ (the prior density of the parameter),

where the data is $D = (s, y_{s})$ and the quantity of interest is some functional $\psi = g(\theta, y)$, e.g. a function of two components of θ or a function of $y$ only, etc.

We say that the sampling mechanism...

19. Bibliography (pp. 677-679)