Journal Article

Classification and Regression Trees: A Powerful Yet Simple Technique for Ecological Data Analysis

Glenn De'ath and Katharina E. Fabricius
Ecology
Vol. 81, No. 11 (Nov., 2000), pp. 3178-3192
Published by: Wiley on behalf of the Ecological Society of America
DOI: 10.2307/177409
Stable URL: http://www.jstor.org/stable/177409
Page Count: 15

Abstract

Classification and regression trees are ideally suited for the analysis of complex ecological data. For such data, we require flexible and robust analytical methods, which can deal with nonlinear relationships, high-order interactions, and missing values. Despite such difficulties, the methods should be simple to understand and give easily interpretable results. Trees explain variation of a single response variable by repeatedly splitting the data into more homogeneous groups, using combinations of explanatory variables that may be categorical and/or numeric. Each group is characterized by a typical value of the response variable, the number of observations in the group, and the values of the explanatory variables that define it. The tree is represented graphically, and this aids exploration and understanding. Trees can be used for interactive exploration and for description and prediction of patterns and processes. Advantages of trees include: (1) the flexibility to handle a broad range of response types, including numeric, categorical, ratings, and survival data; (2) invariance to monotonic transformations of the explanatory variables; (3) ease and robustness of construction; (4) ease of interpretation; and (5) the ability to handle missing values in both response and explanatory variables. Thus, trees complement or represent an alternative to many traditional statistical techniques, including multiple regression, analysis of variance, logistic regression, log-linear models, linear discriminant analysis, and survival models. We use classification and regression trees to analyze survey data from the Australian central Great Barrier Reef, comprising abundances of soft coral taxa (Cnidaria: Octocorallia) and physical and spatial environmental information. Regression tree analyses showed that dense aggregations, typically formed by three taxa, were restricted to distinct habitat types, each of which was defined by combinations of 3-4 environmental variables. 
The habitat definitions were consistent with known experimental findings on the nutrition of these taxa. When used separately, physical and spatial variables were similarly strong predictors of abundances and lost little in comparison with their joint use. The spatial variables are thus effective surrogates for the physical variables in this extensive reef complex, where information on the physical environment is often not available. Finally, we compare the use of regression trees and linear models for the analysis of these data and show how linear models fail to find patterns uncovered by the trees.
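
The splitting idea described in the abstract can be illustrated with a minimal sketch. It assumes a single numeric explanatory variable and uses the within-group sum of squares as the measure of homogeneity; that criterion, the function names (`sse`, `best_split`, `grow`), the stopping rules, and the toy data are illustrative assumptions, not details taken from the paper. The last lines check the invariance property mentioned in advantage (2): a log transform of the explanatory variable leaves the grouping of observations unchanged.

```python
from math import log

def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y, min_size=1):
    """Find the binary split of x that minimizes the summed SSE of y.

    Returns (threshold, total_sse, left_count), or None if no split
    leaves at least min_size observations on each side.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_size, len(pairs) - min_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # tied x values cannot be separated by a threshold
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, total, i)
    return best

def grow(x, y, min_size=2, max_depth=3):
    """Recursively split the data; each node records its size and mean response."""
    node = {"n": len(y), "mean": sum(y) / len(y)}
    split = best_split(x, y, min_size) if max_depth > 0 else None
    if split is None or split[1] >= sse(y):
        return node  # leaf: no admissible split improves homogeneity
    threshold = split[0]
    left = [(a, b) for a, b in zip(x, y) if a < threshold]
    right = [(a, b) for a, b in zip(x, y) if a >= threshold]
    node["threshold"] = threshold
    node["left"] = grow([a for a, _ in left], [b for _, b in left],
                        min_size, max_depth - 1)
    node["right"] = grow([a for a, _ in right], [b for _, b in right],
                         min_size, max_depth - 1)
    return node

# Hypothetical data: one explanatory variable and one numeric response.
x = [1, 2, 3, 10, 11, 12]
y = [5.0, 6.0, 5.5, 20.0, 21.0, 19.5]
tree = grow(x, y)

# Invariance check: splitting on log(x) puts the same observations on each side.
same_groups = best_split(x, y)[2] == best_split([log(v) for v in x], y)[2]
```

Each leaf of `tree` carries the two summaries the abstract mentions, a typical response value (`mean`) and the number of observations (`n`), while the chain of `threshold` values leading to it gives the explanatory-variable conditions that define the group. A real analysis would handle several explanatory variables, categorical predictors, missing values, and tree-size selection, none of which this sketch attempts.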
