Model Selection and the Principle of Minimum Description Length

Mark H. Hansen and Bin Yu
Journal of the American Statistical Association
Vol. 96, No. 454 (Jun., 2001), pp. 746-774
Stable URL: http://www.jstor.org/stable/2670311
Page Count: 29

Abstract

This article reviews the principle of minimum description length (MDL) for problems of model selection. By viewing statistical modeling as a means of generating descriptions of observed data, the MDL framework discriminates between competing models based on the complexity of each description. This approach began with Kolmogorov's theory of algorithmic complexity, matured in the literature on information theory, and has recently received renewed attention within the statistics community. Here we review both the practical and the theoretical aspects of MDL as a tool for model selection, emphasizing the rich connections between information theory and statistics. At the boundary between these two disciplines we find many interesting interpretations of popular frequentist and Bayesian procedures. As we show, MDL provides an objective umbrella under which rather disparate approaches to statistical modeling can coexist and be compared. We illustrate the MDL principle by considering problems in regression, nonparametric curve estimation, cluster analysis, and time series analysis. Because model selection in linear regression is an extremely common problem that arises in many applications, we present detailed derivations of several MDL criteria in this context and discuss their properties through a number of examples. Our emphasis is on the practical application of MDL, and hence we make extensive use of real datasets. In writing this review, we tried to make the descriptive philosophy of MDL natural to a statistics audience by examining classical problems in model selection. In the engineering literature, however, MDL is being applied to ever more exotic modeling situations. As a principle for statistical modeling in general, one strength of MDL is that it can be intuitively extended to provide useful tools for new problems.
