# The Effectiveness of Adjustment by Subclassification in Removing Bias in Observational Studies

W. G. Cochran
Biometrics
Vol. 24, No. 2 (Jun., 1968), pp. 295-313
DOI: 10.2307/2528036
Stable URL: http://www.jstor.org/stable/2528036
Page Count: 19

## Abstract

In some investigations, comparison of the means of a variate y in two study groups may be biased because y is related to a variable x whose distribution differs in the two groups. A frequently used device for trying to remove this bias is adjustment by subclassification. The range of x is divided into c subclasses. Weighted means of the subclass means of y are compared, using the same weights for each study group. The effectiveness of this procedure in removing bias depends on several factors, but for monotonic relations between y and x, an analytical approach suggests that for c = 2, 3, 4, 5, and 6 the percentages of bias removed are roughly 64%, 79%, 86%, 90%, and 92%, respectively. These figures should also serve as a guide when x is an ordered classification (e.g., none, slight, moderate, severe) that can be regarded as a grouping of an underlying continuous variable. The extent to which adjustment reduces the sampling error of the estimated difference between the y means is also examined. An interesting side result is that for x normal, the percentage reduction in the bias of $\bar x_2 - \bar x_1$ due to adjustment equals the percentage reduction in its variance. Under a simple mathematical model, errors of measurement in x reduce the amount of bias removed to a fraction 1/(1 + h) of its value, where h is the ratio of the variance of the errors of measurement to the variance of the correct measurements. Since ordered classifications are often used because x is difficult to measure, h may be substantial in such cases, though more information is needed on the values of h that are typical in practice.
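The percentages quoted in the abstract can be approximately reproduced with a short calculation. The sketch below is not taken from the paper; it rests on several assumptions chosen for illustration: x is standard normal in one group and shifted by a small amount in the other, y is linear in x, and the c subclasses are equal-sized (cut at the quantiles of x). Under those assumptions the fraction of bias removed is the between-subclass share of the variance of x, which is consistent with the side result that the reductions in bias and in variance coincide. The function names `fraction_bias_removed` and `attenuated_fraction` are hypothetical.

```python
from statistics import NormalDist


def fraction_bias_removed(c: int) -> float:
    """Approximate fraction of bias removed by c equal-sized subclasses.

    Assumptions (illustrative, not from the source): x ~ N(0, 1), a small
    shift between the two groups, and y linear in x. The result is the
    between-subclass variance of x as a share of its total variance (1).
    """
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Interior cut points at the j/c quantiles, j = 1, ..., c - 1.
    cuts = [std.inv_cdf(j / c) for j in range(1, c)]
    # Density at each boundary; it is 0 at the two infinite end points.
    dens = [0.0] + [std.pdf(a) for a in cuts] + [0.0]
    p = 1.0 / c  # each subclass holds probability 1/c
    # The mean of x within subclass j is (dens[j] - dens[j + 1]) / p, so the
    # between-subclass variance is the sum of p * mean_j ** 2.
    return sum((dens[j] - dens[j + 1]) ** 2 / p for j in range(c))


def attenuated_fraction(fraction: float, h: float) -> float:
    """Scale the fraction removed by 1 / (1 + h), as stated in the abstract.

    h is the ratio of measurement-error variance to true-value variance.
    """
    return fraction / (1.0 + h)


for c in range(2, 7):
    pct = 100 * fraction_bias_removed(c)
    print(f"c = {c}: about {pct:.1f}% of bias removed")
```

For example, with h = 0.25 (a value picked only for illustration), `attenuated_fraction(fraction_bias_removed(5), 0.25)` scales the five-subclass figure down by a factor of 0.8.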
