On the Out‐of‐Sample Predictability of Stock Market Returns*

Hui Guo  

Federal Reserve Bank of St. Louis

In this paper, I provide new evidence of the out‐of‐sample predictability of stock returns. In particular, I find that the consumption‐wealth ratio in conjunction with a measure of aggregate stock market volatility exhibits substantial out‐of‐sample forecasting power for excess stock market returns. Also, simple trading strategies based on the documented predictability generate returns of higher mean and lower volatility than the buy‐and‐hold strategy does, and this difference is economically important.

There is an ongoing debate about stock return predictability in time‐series data. Campbell (1987) and Fama and French (1989), among many others, find that macro variables such as the dividend yield, the default premium, the term premium, and the short‐term interest rate forecast excess stock market returns. However, Bossaerts and Hillion (1999), Ang and Bekaert (2001), and Goyal and Welch (2003) cast doubt on the in‐sample evidence documented by the early authors by showing that these variables have negligible out‐of‐sample predictive power.

In this paper, I provide new evidence of the out‐of‐sample predictability of stock returns. In particular, I find that the consumption‐wealth ratio (cay) of Lettau and Ludvigson (2001), the error term from the cointegration relation among consumption, wealth, and labor income, exhibits substantial out‐of‐sample forecasting ability for stock returns if augmented by a measure of aggregate stock market volatility, the realized stock market variance. More important, the improvement of the forecast model of cay augmented by realized variance over the model of cay by itself is statistically significant. My results reflect a classic omitted‐variable problem: While cay and realized variance are negatively related to one another, they are both positively correlated with future stock returns.1

For robustness, I also investigate whether one can use simple trading strategies to exploit the predictability documented in this paper. As suggested by Leitch and Tanner (1991), this evaluation criterion is potentially more sensible than its statistical counterpart. I consider two widely used and relatively naive portfolio strategies. In the first, following Breen, Glosten, and Jagannathan (1989), among others, I hold stocks if the predicted excess return is positive and hold bonds otherwise. In the second strategy, which has been used by Johannes, Polson, and Stroud (2002), among others, I allocate wealth between stocks and bonds according to the formula of the static capital asset pricing model (CAPM). I find that the managed portfolio generates higher mean returns with lower volatility than the market portfolio, and this difference is economically important. For example, the certainty equivalence calculation suggests that an investor would agree to pay annual fees of 2%–3% to hold the managed portfolio rather than the market portfolio over the period 1968:Q2–2002:Q4. Also, neither the CAPM nor the Fama and French (1993) three‐factor model can explain returns on the managed portfolio, and I reject the null hypothesis of no market timing ability using Cumby and Modest’s (1987) test. Moreover, my trading strategies require relatively infrequent rebalancing of portfolios, and therefore, these results are robust to adjustment for reasonable transaction costs. Interestingly, consistent with Pesaran and Timmermann (1995), I find substantial variations in the profitability of trading strategies through time.

My results are in sharp contrast with those of Bossaerts and Hillion (1999), Ang and Bekaert (2001), and Goyal and Welch (2003), as mentioned above. This difference is explained by the fact that my forecasting variables drive out most variables used by the early authors, including the dividend yield, the default premium, and the term premium. There is one exception. The stochastically detrended risk‐free rate (rrel) used by Campbell, Lo, and MacKinlay (1997), among others, provides information beyond cay and realized variance about future stock returns in the in‐sample regression over the post–World War II period, although it becomes insignificant after 1980.2 I also find mixed evidence of its out‐of‐sample forecast performance.

My forecasting variables are motivated by those in the paper by Guo (2004), who shows that, in addition to the risk premium as stressed by standard models, investors also require a liquidity premium on stocks because of limited stock market participation. Therefore, realized variance and cay forecast stock returns because they proxy for the risk and liquidity premiums, respectively.3 Moreover, Guo shows that, although the two variables are both positively related to future stock returns, they could be negatively correlated with one another, as observed in the data.

The paper is organized as follows. I discuss the data in Section I and report the out‐of‐sample forecasting exercises in Section II. Some simple trading strategies are analyzed in Section III, and Section IV offers some concluding remarks.

I. Data

The consumption, net worth, and labor income data, together with the generated variable cay, over the period 1952:Q2–2002:Q3 are obtained from Martin Lettau at New York University. I use the value‐weighted stock market return obtained from the Center for Research in Security Prices (CRSP) as a measure of market returns. The risk‐free rate obtained from CRSP is used to construct excess stock returns. As in Merton (1980) and many others, I construct the realized stock market variance using the daily stock market return data, which are obtained from Schwert (1990) before July 1962 and from CRSP thereafter. Following Campbell et al. (2001), I adjust downward the realized stock market variance for 1987:Q4, on which the 1987 stock market crash has confounding effects. The stochastically detrended risk‐free rate, rrel, is the difference between the nominal risk‐free rate and its last four‐quarter average.
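The two constructed series described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact procedure: realized variance is taken as the plain sum of squared daily returns within a quarter, the 1987:Q4 adjustment is omitted, and the four‐quarter average in rrel is assumed to exclude the current quarter. The function names and input layouts are hypothetical.

```python
from collections import defaultdict


def realized_variance(daily_returns):
    """Sum of squared daily returns within each quarter.

    daily_returns: iterable of (quarter_label, daily_return) pairs,
    for example ("1987Q4", -0.02). Assumption: no autocovariance
    correction and no adjustment for the 1987 crash.
    """
    rv = defaultdict(float)
    for quarter, r in daily_returns:
        rv[quarter] += r * r
    return dict(rv)


def rrel(risk_free, window=4):
    """Risk-free rate minus the average of its previous `window` values.

    Assumption: the average excludes the current quarter. Quarters
    without a full history are returned as None.
    """
    out = []
    for t, rate in enumerate(risk_free):
        if t < window:
            out.append(None)
        else:
            out.append(rate - sum(risk_free[t - window:t]) / window)
    return out
```

If the average is instead meant to include the current quarter, only the slice bounds in `rrel` change.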

Table 1, which includes the full sample and two subsample periods, presents summary statistics of the excess stock market return and the forecasting variables used in this paper. It should be noted that the autocorrelation coefficients of the forecasting variables are less than 0.90 in both the full sample and the subsamples. There are some differences between the two subsamples. First, cay is more negatively related to realized variance in the second half (panel C) than in the first half (panel B) of the sample. Second, while is negatively related to rrel in panel B, the two are slightly positively related in panel C. Third, the excess stock market return is more negatively related to rrel in panel B than in panel C.

Table 1 Summary Statistics

Figures 1–3 plot the forecasting variables through time. While cay (fig. 1) fell sharply, realized variance (fig. 2) rose dramatically during the second half of the 1990s. This pattern explains the strong negative relation between the two variables as reported in table 1. Also, rrel (fig. 3) fell steeply when the stock market “bubble” burst in 2001–2. As I show below, this episode weakens the forecasting ability of realized variance and rrel for stock market returns. However, the stock market correction in 2001–2 reinforces the forecasting ability of cay, which has been below its historical average since 1997. Nevertheless, my main results are not sensitive to whether I include these two years in the sample.

Fig. 1. Consumption‐wealth ratio

Fig. 2. Realized stock market variance

Fig. 3. Stochastically detrended risk‐free rate

I first discuss the in‐sample regression results. As argued by Inoue and Kilian (2002), while out‐of‐sample tests are not necessarily more reliable than in‐sample tests, in‐sample tests are more powerful than out‐of‐sample tests, even asymptotically. Table 2 presents the ordinary least squares estimation results, with heteroskedasticity‐ and autocorrelation‐corrected t‐statistics in parentheses. It should be noted that I construct cay using the full sample, even in the subsample analysis.

Table 2 Forecasting One‐Quarter‐Ahead Excess Stock Market Returns

Panel A is the full sample spanning from 1952:Q3 to 2002:Q4. Row 1 confirms the result of Lettau and Ludvigson (2001) that cay is a strong predictor of stock returns, with an adjusted R² of 8.2%. Row 2 shows that realized variance by itself has negligible forecasting power for stock returns.4 However, realized variance becomes highly significant if cay is also included in the forecasting equation, with an adjusted R² of 14.7%, as shown in row 3. It should also be noted that, in the augmented model (row 3), the adjusted R² and the point estimates of cay and realized variance are much higher than their counterparts in rows 1 and 2. These results reflect a classic omitted‐variable problem in rows 1 and 2: Although both cay and realized variance are positively related to future stock returns, they are negatively correlated with one another, as shown in table 1. Finally, row 4 shows that rrel provides additional information beyond cay and realized variance about future stock returns, and I find very similar results using two‐period‐lagged cay in row 5.5
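The omitted‐variable argument can be made explicit with the textbook bias formula. The symbols below are illustrative notation, not taken from the source: $\sigma^2_{m,t}$ stands for realized variance, $\beta_1$ and $\beta_2$ for the slopes in the two‐variable forecasting regression, and $\hat{b}_1$ for the slope when cay is the only regressor.

```latex
\begin{align*}
r_{t+1} &= \alpha + \beta_1\, cay_t + \beta_2\, \sigma^2_{m,t} + \varepsilon_{t+1}, \\
\operatorname{plim}\, \hat{b}_1 &= \beta_1
  + \beta_2\, \frac{\operatorname{Cov}(cay_t, \sigma^2_{m,t})}{\operatorname{Var}(cay_t)} .
\end{align*}
```

With $\beta_2 > 0$ and a negative covariance, the second term is negative, so the single‐regressor slope understates $\beta_1$. The same reasoning, with the roles of the two variables swapped, applies to the regression on realized variance alone.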

I report the estimation results using two subsample periods (1952:Q3–1977:Q4 and 1978:Q1–2002:Q4) in panels B and C, respectively. In general, the results are very similar to those reported in panel A. For example, the forecasting ability improves substantially if I include both cay and realized variance in the forecasting equation, as shown in rows 8 and 13. It should also be noted that their point estimates are strikingly similar to their full‐sample counterparts in row 3, indicating a stable forecasting relation over time. This pattern explains their strong out‐of‐sample forecasting power presented in the next section. There are, however, some noticeable differences between the two subsamples. First, the predictability is substantially weaker in the second than in the first subsample. Second, while realized variance by itself is statistically significant in the first subsample (row 7), it is insignificant in the second subsample (row 12). Third, although rrel is statistically significant in the first subsample, it is insignificant in the second subsample. However, the two latter results are sensitive to the inclusion of observations from 2001–2 for the reasons mentioned above.

II. Out‐of‐Sample Forecasts

This section presents the analysis of the out‐of‐sample performance of various forecast models. I consider two cases. First, investors are assumed to know the cointegration parameters of cay, which I estimate using the full sample. They also observe consumption, labor income, and net worth without delay. This scenario is consistent with rational expectations models, in which agents have full information about the economy.6 Second, I estimate recursively the cointegration parameters using only information available at the time of the forecast. Moreover, I lag cay twice, given that consumption and labor income data are available with a one‐quarter delay. This scenario has appeal to practitioners, who must rely on the real‐time data.7

Figure 4 plots the recursively estimated coefficients on labor income (solid line) and net worth (dashed line). As in Lettau and Ludvigson (2001), I estimate the cointegration parameters using dynamic least squares with eight leads and lags. The point estimates show large variations until the 1990s because a relatively large number of observations are required to consistently estimate the cointegration parameters. Therefore, it should not be a surprise that the forecasting ability of cay deteriorates significantly if the cointegration parameters are estimated recursively relative to the fixed parameters using the full sample, especially during the early period. It should also be noted that the test in the second scenario is likely to be more stringent than investors would encounter in real time, given that investors may have fairly accurate estimates of the cointegration parameters. With these caveats in mind, I report the out‐of‐sample forecast exercises below.

Fig. 4. Parameters of labor income (solid line) and net worth (dashed line)

A. Fixed Cointegration Parameters

Table 3 reports the out‐of‐sample regression results using the fixed cointegration parameters obtained from the full sample. I analyze four forecast models: (1) a benchmark model of constant excess returns, (2) the model using only cay, (3) the model of cay augmented by realized variance, and (4) the model of cay augmented by realized variance and rrel. Throughout the paper, I denote the model of cay augmented by realized variance, which is the main focus of the analysis, as augmented cay. I report five commonly used forecast evaluation statistics: (1) the root mean squared error (RMSE); (2) the mean absolute error (MAE); (3) the correlation between the forecast and the actual value (CORR); (4) the percentage of times when the forecast and the actual value have the same sign (sign); and (5) the pseudo R², one minus the ratio of the mean squared error of a forecast model to that of the benchmark model of constant excess returns. I highlight the best forecast model for each criterion with an asterisk.
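The five evaluation statistics listed above can be computed as in the following sketch. The function name and the dictionary keys are hypothetical. Two conventions are assumptions rather than details from the source: the sign statistic is returned as a fraction instead of a percentage, and a zero value is treated as not positive.

```python
import math


def forecast_stats(actual, forecast, benchmark):
    """RMSE, MAE, CORR, sign agreement, and pseudo R2 for one forecast model.

    actual, forecast, benchmark: equal-length lists of realized excess
    returns, model forecasts, and constant-return benchmark forecasts.
    """
    n = len(actual)
    err = [a - f for a, f in zip(actual, forecast)]
    err_b = [a - b for a, b in zip(actual, benchmark)]
    mse = sum(e * e for e in err) / n
    mse_b = sum(e * e for e in err_b) / n

    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    # The correlation is undefined when either series is constant.
    corr = cov / math.sqrt(var_a * var_f) if var_a > 0 and var_f > 0 else float("nan")

    return {
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in err) / n,
        "CORR": corr,
        "sign": sum((a > 0) == (f > 0) for a, f in zip(actual, forecast)) / n,
        "pseudo_R2": 1.0 - mse / mse_b,
    }
```

A positive pseudo R² means the model has a smaller mean squared error than the constant‐return benchmark over the evaluation period.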

Table 3 Out‐of‐Sample Forecast: Fixed Parameters

Panel A is the sample from 1968:Q2 to 2002:Q4, which is similar to the sample analyzed by Lettau and Ludvigson (2001). In the out‐of‐sample forecast, I first run an in‐sample regression using data from 1952:Q2 to 1968:Q1 and make a forecast for 1968:Q2. Then I update the sample to 1968:Q2 and make a forecast for 1968:Q3 and so forth. Consistent with Lettau and Ludvigson, cay (col. 2) exhibits some out‐of‐sample forecasting power; for example, it has a smaller RMSE than the benchmark model of constant returns (col. 1). Consistent with the in‐sample regression results in table 2, its forecasting power improves dramatically by all the criteria if realized variance is added to the forecasting equation (col. 3). Adding rrel to augmented cay (col. 4), however, does not provide discernible improvement in forecast performance: Overall, cay augmented by realized variance has the best out‐of‐sample performance.
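The expanding‐window procedure described above can be sketched as follows. This is a minimal single‐regressor illustration (the model of cay by itself), with the benchmark taken as the historical mean over the same estimation sample; the augmented models would need a multiple regression in place of `ols_1`. All names are hypothetical.

```python
def ols_1(x, y):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def recursive_forecasts(predictor, excess_return, first):
    """Expanding-window one-quarter-ahead forecasts.

    predictor[t] is observed at the end of quarter t and is used to
    forecast excess_return[t + 1]. `first` is the index of the first
    quarter to be forecast (at least 3, so the regression has two points).
    Returns (model forecasts, benchmark forecasts, realized values).
    """
    model, bench, actual = [], [], []
    for t in range(first, len(excess_return)):
        # Estimation pairs (predictor[s], excess_return[s + 1]) end at s + 1 = t - 1,
        # so nothing dated t or later enters the regression.
        x = predictor[:t - 1]
        y = excess_return[1:t]
        a, b = ols_1(x, y)
        model.append(a + b * predictor[t - 1])
        bench.append(sum(y) / len(y))  # constant-return benchmark
        actual.append(excess_return[t])
    return model, bench, actual
```

Each pass through the loop adds one more quarter to the estimation sample, which mirrors the updating from 1968:Q2 to 1968:Q3 and onward in the text.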

Panel B is the subsample from 1976:Q1 to 2002:Q4. Consistent with Brennan and Xia (2002), cay (col. 2) has a larger RMSE than the benchmark model of constant returns (col. 1) over this period. However, this result is completely reversed if I augment cay with realized variance (col. 3): Again, augmented cay beats the other models by all criteria.

To check the robustness of the results, figure 5 plots the recursive RMSE ratio of augmented cay (col. 3 of table 3) to the benchmark model of constant returns (col. 1; solid line) and to the model of cay by itself (col. 2; dashed line) through time. The horizontal axis denotes the starting forecast date. For example, the value corresponding to June 1968 is the RMSE ratio over the forecast period from 1968:Q2 to 2002:Q4. I choose the range 1968:Q2–1996:Q4 for the starting forecast date; therefore, the out‐of‐sample test utilizes at least 25 observations. The two ratios are always smaller than one in figure 5, indicating that (1) adding realized variance to the forecasting equation substantially improves the forecasting ability of cay, and (2) augmented cay has substantial out‐of‐sample predictive power. In contrast, the model of cay by itself does not always outperform the benchmark model of constant returns since the solid line is above the dashed line over various periods.
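The last inference follows from dividing one plotted ratio by the other. In the sketch below the subscripts are illustrative labels for the three models (augmented cay, the constant‐return benchmark, and cay by itself), not notation from the source.

```latex
\[
\frac{\text{solid line}}{\text{dashed line}}
= \frac{RMSE_{\text{aug}} / RMSE_{\text{bench}}}{RMSE_{\text{aug}} / RMSE_{cay}}
= \frac{RMSE_{cay}}{RMSE_{\text{bench}}}
\]
```

Whenever the solid line lies above the dashed line, this quotient exceeds one, so the model of cay by itself has a larger RMSE than the benchmark over the corresponding forecast period.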

Fig. 5. RMSE ratio of augmented cay to benchmark model (solid line) and to model of cay (dashed line): fixed parameters.

B. Recursively Estimated Cointegration Parameters

Table 4 reports the out‐of‐sample performance using recursively estimated cay. The exercise is the same as in the case of the fixed parameters except that the cointegration parameters are estimated recursively using only information available at the time of the forecast. It should be noted that consumption, labor income, and net worth are available with a one‐quarter delay. For example, I first estimate the cointegration relation among consumption, net worth, and labor income and obtain the fitted cay using data from 1952:Q2 to 1967:Q4. Then I run an in‐sample forecasting regression using data from 1952:Q2 to 1968:Q1 (cay is lagged two periods) and make a forecast for 1968:Q2. Then I update the sample to 1968:Q2 and make a forecast for 1968:Q3 and so forth. In general, the results are consistent with those in table 3. However, the forecasting ability of all models is substantially weaker in table 4 than in table 3, as expected.

Table 4 Out‐of‐Sample Forecast: Recursive Parameters

In particular, for the period from 1968:Q2 to 2002:Q4, the augmented model of cay (col. 3) performs better than the benchmark model (col. 1) and the model of cay by itself (col. 2). Interestingly, inclusion of rrel (col. 4) improves the forecasting performance of augmented cay: Overall, it has the best forecasting performance among all four models.8 For the period 1976:Q1–2002:Q4, the benchmark model of constant returns has the smallest RMSE. Figure 6 plots the recursive RMSE ratio of augmented cay (col. 3 of table 4) to the benchmark model of constant returns (col. 1; solid line) and to the model of cay by itself (col. 2; dashed line) through time. The solid line remains below one after 1990, when the recursively estimated cointegration parameters become relatively stable, as shown in figure 4. Therefore, the poor out‐of‐sample performance of augmented cay is mainly attributed to the large estimation errors in the cointegration parameters. Moreover, the dashed line is always below one, indicating that adding realized variance to the forecasting equation substantially improves the forecasting ability of cay. It should also be noted that the solid line is always above the dashed line, indicating that the model of cay by itself has negligible out‐of‐sample predictive power if the cointegration parameters are estimated recursively.

Fig. 6. RMSE ratio of augmented cay to benchmark model (solid line) and to model of cay (dashed line): recursive parameters.

C. Testing Nested Forecast Models

In this subsection, I provide two formal out‐of‐sample tests for nested forecast models. The first is the encompassing test ENC‐NEW proposed by Clark and McCracken (1999). It tests the null hypothesis that the benchmark model incorporates all the information about the next quarter’s excess stock market return against the alternative hypothesis that past variance provides additional information. The second is the equal forecast accuracy test MSE‐F developed by McCracken (1999). Its null hypothesis is that the benchmark model has a mean squared forecasting error less than or equal to that of the model augmented by past variance; the alternative is that the augmented model has a smaller mean squared forecasting error. These two tests have also been used in Lettau and Ludvigson (2001), and Clark and McCracken (1999) find that they have the best overall power and size properties among a variety of tests proposed in the literature.
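For one‐step‐ahead forecasts, both statistics are simple functions of the two models' out‐of‐sample forecast errors. The sketch below uses the standard definitions from the nested‐model forecast comparison literature; the function name and argument names are hypothetical. It returns the statistics only: their limiting distributions are nonstandard, so critical values have to be taken from published tables.

```python
def nested_test_stats(err_restricted, err_unrestricted):
    """MSE-F and ENC-NEW statistics for two nested forecast models.

    err_restricted: out-of-sample forecast errors of the benchmark
    (smaller) model; err_unrestricted: errors of the augmented model.
    Both lists cover the same P forecast periods.
    """
    p = len(err_restricted)
    mse_r = sum(e * e for e in err_restricted) / p
    mse_u = sum(e * e for e in err_unrestricted) / p

    # MSE-F: scaled relative reduction in mean squared forecast error.
    mse_f = p * (mse_r - mse_u) / mse_u

    # ENC-NEW: scaled average of u_r * (u_r - u_u), relative to the augmented model's MSE.
    cross = sum(er * (er - eu) for er, eu in zip(err_restricted, err_unrestricted)) / p
    enc_new = p * cross / mse_u

    return {"MSE-F": mse_f, "ENC-NEW": enc_new}
```

Large positive values of either statistic count as evidence against the benchmark model.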

Table 5 presents the results of the out‐of‐sample tests. In panel A, I estimate the cointegration parameters for cay using the full sample, and the macro variables are available without delay. I focus on two pairs of nested forecast models: the benchmark model of constant stock returns versus augmented cay (row 1) and the model of cay by itself versus augmented cay (row 2). Again, I use observations from the period 1952:Q4–1968:Q1 for the initial in‐sample estimation and form the out‐of‐sample forecast recursively. Columns 2 and 4 report the asymptotic 95% critical value provided by Clark and McCracken (1999). I find that, in both tests, augmented cay outperforms the model of constant returns and the model of cay by itself at all conventional significance levels. In panel B, the cointegration parameters are estimated recursively, and the macro variables are available with a one‐quarter lag. Again, I find evidence that augmented cay outperforms the two competing models at the conventional significance level with only one exception: the MSE‐F test shows that the difference between augmented cay and the benchmark model of constant returns is not statistically significant.

Table 5 One‐Quarter‐Ahead Forecasts of Excess Stock Market Returns: Nested Comparisons

III. Economic Values of Market Timing

Leitch and Tanner (1991) argue that the forecast models chosen according to statistical criteria are not necessarily the models that are profitable in timing the market. To address this issue, I investigate whether the documented predictability can be exploited to generate returns of higher mean and lower volatility than a buy‐and‐hold strategy offers. To conserve space, I report only the case of recursively estimated cointegration parameters, which is relevant to practitioners. Nevertheless, I find very similar results using the fixed cointegration parameters, which are available on request.

A. Switching Strategies

I adopt two widely used and relatively naive market timing strategies. The first strategy, which has been utilized by Breen et al. (1989) and Pesaran and Timmermann (1995), among many others, requires holding stocks if the predicted excess return is positive and holding bonds otherwise. Table 6 reports the results of four trading strategies: a benchmark of buy‐and‐hold and three strategies based on the forecast models analyzed in tables 3 and 4. I present the mean, the standard deviation (SD), the ratio of the mean to the standard deviation (mean/SD), and the adjusted Sharpe ratio for the annualized returns on these portfolios.9

Table 6 Switching Strategies with No Transaction Costs

Over the period 1968:Q2–2002:Q4, all managed portfolios have returns of higher mean and lower standard deviation than those of the buy‐and‐hold strategy. For example, the managed portfolio based on the forecast model of cay (col. 2) generates an average annual return of 13.7% with a standard deviation of 14.2%, compared with 11.3% and 18.0%, respectively, for the buy‐and‐hold strategy (col. 1). The adjusted Sharpe ratio of the managed portfolio is about 120% higher than that of the market portfolio. Therefore, even though the out‐of‐sample forecasting ability of cay is statistically negligible as shown in table 4, it is economically important. My results thus confirm Leitch and Tanner’s (1991) skepticism of using statistical criteria such as RMSE for forecast evaluation. Also, in contrast with the results of table 4, the model augmented with realized variance and rrel (col. 4) has an adjusted Sharpe ratio lower than that of the model that uses cay only. This is also true for the model augmented with realized variance (col. 3). As I show below, these results reflect the fact that information is not used efficiently in the switching strategy.

I find very similar patterns in the three subsample periods, which are reported in panels B–D of table 6. However, the performance of the managed portfolio relative to the benchmark fluctuates widely over time, which is consistent with the finding of Pesaran and Timmermann (1995). For example, for the market timing strategy based on cay only, one observes the biggest improvement in the 1970s: The managed portfolio has an adjusted Sharpe ratio of 0.48, compared with 0.08 for the market portfolio. In contrast, the managed portfolio has an adjusted Sharpe ratio of 0.67 (0.76) for the period 1980:Q1–1989:Q4 (1990:Q1–2002:Q4), compared with 0.48 (0.34) for the market portfolio. I find a similar pattern for the other forecast models.

Figure 7 provides some details of the strategy based on augmented cay (col. 3 of table 6). The upper panel plots the weight of stocks in the managed portfolio, which takes one of two values: zero (100% in bonds) or one (100% in stocks). Interestingly, investors did not have to rebalance the portfolio very often, especially during the stock market run‐up in the 1980s and 1990s. The lower panel shows that, by using our forecasting variables to time the market, investors avoid some large downward movements in the stock market, for example, around the 1973 oil shock. Finally, the middle panel plots the value of a $100 investment in a market index (dashed line) and in the managed portfolio (solid line), respectively, starting from 1968:Q2. The latter is always higher than the former. By the end of 2002:Q4, the managed portfolio is worth $5,338, compared with $2,793 for the buy‐and‐hold strategy.

Fig. 7. Switching strategies. a, Weight of stocks in managed portfolio. b, Values of managed portfolio (solid line) vs. market portfolio (dashed line). c, Returns on managed portfolio (solid line) vs. market portfolio (dashed line).

Table 7 investigates the effect of a proportional transaction cost of 25 basis points. For example, when investors switch from stocks to bonds or vice versa, they have to pay a fee of 0.25% of the value of their portfolios. It should be noted that a 25‐basis‐point fee is in the upper range of transaction costs for the market index (e.g., Balduzzi and Lynch 1999). In a comparison with the results in table 6, I find that transaction costs have a small impact on the performance of the managed portfolio. This result should not be a surprise because investors did not rebalance the managed portfolio very often, as shown in figure 7.
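The switching rule with a proportional fee can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact accounting: the fee is deducted from portfolio value at the start of any quarter in which the position changes, no fee is charged for the initial position, and returns are simple quarterly returns. The function and argument names are hypothetical.

```python
def switching_strategy(pred_excess, stock_ret, bond_ret, cost=0.0025, start_value=100.0):
    """Hold stocks when the predicted excess return is positive, bonds otherwise.

    pred_excess[t]: forecast of the quarter-t excess return made beforehand.
    stock_ret[t], bond_ret[t]: realized simple returns in quarter t.
    cost: proportional fee paid on the whole portfolio at each switch.
    Returns (stock weights, end-of-quarter portfolio values).
    """
    value = start_value
    in_stocks = None  # no position yet, so the first allocation is free (assumption)
    weights, values = [], []
    for pred, rs, rb in zip(pred_excess, stock_ret, bond_ret):
        want_stocks = pred > 0
        if in_stocks is not None and want_stocks != in_stocks:
            value *= 1.0 - cost  # pay the fee when switching
        in_stocks = want_stocks
        value *= 1.0 + (rs if in_stocks else rb)
        weights.append(1 if in_stocks else 0)
        values.append(value)
    return weights, values
```

Setting `cost=0.0` gives the no‐fee case. Because the fee is charged only at switches, its cumulative effect depends on how often the forecast changes sign.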

Table 7 Switching Strategies with Transaction Costs

B. Choosing Optimal Portfolio Weights

In the second strategy, which has been adopted by Johannes et al. (2002), among others, I allocate wealth between stocks and bonds using the static CAPM. Specifically, I invest a fraction of total wealth, $\omega_t = \hat{r}_{t+1} / (\gamma \hat{\sigma}^2_{t+1})$, in stocks and a fraction $1 - \omega_t$ in bonds, where $\gamma$ is a measure of the investor’s relative risk aversion, $\hat{r}_{t+1}$ is the predicted value from the excess return forecasting regression, and $\hat{\sigma}^2_{t+1}$ is the conditional variance measured by the fitted value from a regression of realized variance on a constant and its two lags. Compared with the first strategy, this strategy is more plausible because it incorporates information about not only the sign but also the magnitude of the predicted excess return normalized by its variance. For simplicity, I ignore the estimation uncertainty, on which Johannes et al. offer some detailed discussion. I also assume that $\omega_t$ is in the range [0, 1], that is, that investors are not allowed to short sell stocks or borrow from bond markets, because those transactions might be infeasible in practice owing to high costs. It should be noted that the profitability of timing strategies should in principle be lower under these assumptions than otherwise because they reduce the set of investment opportunities and lead to a lower mean‐variance frontier.
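The allocation rule can be sketched as a ratio clipped to the unit interval. This is a minimal illustration assuming the usual mean‐variance form, in which the unconstrained weight is the predicted excess return divided by risk aversion times the conditional variance. The function names are hypothetical, and the risk‐aversion value in the usage note is an arbitrary example rather than the value used in the paper.

```python
def stock_weight(pred_excess, cond_var, gamma):
    """Share of wealth in stocks, restricted to [0, 1].

    pred_excess: predicted excess return for the next quarter.
    cond_var: conditional variance forecast for the next quarter (positive).
    gamma: relative risk aversion (positive).
    """
    raw = pred_excess / (gamma * cond_var)
    # No short sales (weight >= 0) and no borrowing (weight <= 1).
    return min(1.0, max(0.0, raw))


def managed_return(weight, stock_ret, bond_ret):
    """One-period return on the stock/bond mix."""
    return weight * stock_ret + (1.0 - weight) * bond_ret
```

For instance, `stock_weight(0.02, 0.01, 5.0)` puts 40% of wealth in stocks, while a negative forecast or a very large one is clipped to 0 or 1, in which case the rule coincides with the switching strategy.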

Table 8 reports the statistics for returns on the managed portfolio based on various forecast models. In the calculation of the optimal weight for stocks, I assume a fixed value of γ.10 As expected, the portfolio based on augmented cay (col. 3) has substantially higher Sharpe ratios than those reported in table 6 for the switching strategy. For example, over the period 1968:Q2–2002:Q4, the Sharpe ratio is 0.59 if investors choose the portfolio weight optimally, compared with 0.45 for the switching strategy. Nevertheless, the other results are very similar to those reported in table 6. For example, market timing strategies based on models using cay as a forecasting variable generate returns of higher mean and lower volatility than the buy‐and‐hold strategy. Also, the relative performance of market timing strategies fluctuates widely over time and is the most effective in the 1970s.

Table 8. Choosing Optimal Portfolio Weights with No Transaction Costs

Figure 8 provides some details of the market timing strategy based on augmented cay (col. 3 of table 8). Again, the upper panel plots the weight of stocks in the managed portfolio, which is very similar to that of figure 7 except that the weight occasionally takes a value between zero and one. The lower panel plots the return on the managed portfolio (solid line) as well as the market return (dashed line). Compared with the first strategy plotted in figure 7, the second strategy successfully avoids additional major downward movements in the stock market. The middle panel shows that a $100 initial investment in the managed portfolio grows to $7,227 by the end of year 2002, which is over 2.5 times as much as the market portfolio. Again, table 9 shows that transaction costs have small effects on the performance of the managed portfolio.

Fig. 8. Choosing optimal portfolio weights. a, Weight of stocks in managed portfolio. b, Values of managed portfolio (solid line) vs. market portfolio (dashed line). c, Returns on managed portfolio (solid line) vs. market portfolio (dashed line).

Table 9. Choosing Optimal Portfolio Weights with Transaction Costs

C. Some Further Tests

Cumby and Modest (1987) propose a formal test of market timing ability by regressing the realized excess return, $r_{t+1}$, on a constant and an indicator variable, $I_t$, which is equal to one if $r_{t+1}$ is expected to be positive and is equal to zero otherwise, as in the following equation:

$$r_{t+1} = a + b\, I_t + \varepsilon_{t+1}. \tag{1}$$

Under the null hypothesis of no market timing ability, the coefficient of the indicator variable, b, should not be statistically different from zero. Table 10 reports the regression results. Over the period 1968:Q2–2002:Q4, the null hypothesis of no market timing ability is rejected for all the forecast models.
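Because the only regressor is a zero‐one indicator, the regression reduces to a comparison of group means. The sketch below uses that shortcut and a conventional (homoskedastic) standard error, which is an assumption: the source does not say which standard errors are used for this test. The function name is hypothetical, and both groups must be non‐empty.

```python
import math


def cumby_modest(excess_ret, pred_excess):
    """Regress realized excess returns on a constant and a timing indicator.

    The indicator is 1 when the predicted excess return is positive.
    Returns (intercept a, slope b, t-statistic of b).
    """
    ind = [1.0 if p > 0 else 0.0 for p in pred_excess]
    n = len(ind)
    n1 = sum(ind)          # quarters predicted to have positive excess returns
    n0 = n - n1            # remaining quarters
    mean1 = sum(r for r, i in zip(excess_ret, ind) if i) / n1
    mean0 = sum(r for r, i in zip(excess_ret, ind) if not i) / n0

    # With a single dummy regressor, OLS gives a = mean0 and b = mean1 - mean0.
    a, b = mean0, mean1 - mean0
    resid = [r - a - b * i for r, i in zip(excess_ret, ind)]
    s2 = sum(e * e for e in resid) / (n - 2)
    se_b = math.sqrt(s2 * (1.0 / n1 + 1.0 / n0))
    return a, b, b / se_b
```

A positive and significant slope says that realized excess returns are higher, on average, in quarters the model flags as favorable.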

Table 10 Cumby and Modest (1987) Market Timing Ability Test: 1968:Q2–2002:Q4

I also investigate whether the CAPM and the Fama‐French model can explain returns on the managed portfolio. For the CAPM, I run regressions of excess returns on the managed portfolio, $r_{p,t+1}$, on a constant and a single factor of excess stock market returns, $r_{m,t+1}$, as in equation (2):

$$r_{p,t+1} = \alpha + \beta\, r_{m,t+1} + \varepsilon_{t+1}. \tag{2}$$

For the Fama‐French model, I include two additional factors: the return on a portfolio that is long in small stocks and short in large stocks (SMB) and the return on a portfolio that is long in high book‐to‐market stocks and short in low book‐to‐market stocks (HML).11 The regression is

$$r_{p,t+1} = \alpha + \beta\, r_{m,t+1} + s\, SMB_{t+1} + h\, HML_{t+1} + \varepsilon_{t+1}. \tag{3}$$

Under the joint null hypothesis that (1) the CAPM or the Fama‐French model is the correct model and (2) the managed portfolio is rationally priced, the constant term, α, should not be statistically different from zero. I report the regression results in table 11. Panels A and B are the cases of no transaction costs. For both strategies, the CAPM cannot explain returns on the managed portfolio over the period 1968:Q2–2002:Q4. The Fama‐French model explains the returns somewhat better; however, α is still significant for cay augmented by realized variance and rrel (col. 3), is marginally significant for cay augmented by realized variance (col. 2) in panel B, and is marginally significant for cay by itself (col. 1) in panel A. Again, I find essentially the same results if I incorporate a proportional transaction cost of 25 basis points in panels C and D.

Table 11 Jensen’s α Tests: 1968:Q2–2002:Q4

Finally, I calculate the certainty equivalence gain of holding the managed portfolio, as in Fleming, Kirby, and Ostdiek (2001). I assume a utility function defined over initial wealth and the return on the agent’s portfolio. The certainty equivalence gain, Δ, is defined in equation (4) as the fee that an investor would pay in exchange for holding the managed portfolio and earning its rate of return; otherwise, he holds the market portfolio and earns the market return.
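The calculation can be sketched as follows, under an assumption that goes beyond the surviving text: the utility function is taken to be the quadratic form of Fleming, Kirby, and Ostdiek (2001), so that realized utility per unit of initial wealth is the sum over time of R − c·R², with c = γ / (2(1 + γ)) and R a gross return. The fee Δ then equates the realized utility of the managed portfolio net of the fee with that of the market portfolio. The function name and the choice of root are hypothetical details of this sketch.

```python
import math


def certainty_equivalent_fee(managed, market, gamma):
    """Per-period fee that equates quadratic utility across two portfolios.

    managed, market: lists of gross returns (for example 1.03).
    gamma: relative risk aversion (positive).
    Solves sum(u(Rp - fee)) = sum(u(Rm)) with u(R) = R - c * R**2.
    """
    c = gamma / (2.0 * (1.0 + gamma))
    n = len(managed)
    s1 = sum(managed)
    s2 = sum(r * r for r in managed)
    target = sum(r - c * r * r for r in market)

    # Expanding the left-hand side gives a quadratic in the fee:
    # c*n*fee^2 + (n - 2*c*s1)*fee - (s1 - c*s2 - target) = 0
    qa = c * n
    qb = n - 2.0 * c * s1
    qc = -(s1 - c * s2 - target)
    disc = qb * qb - 4.0 * qa * qc  # negative if no fee can equate the utilities
    # The larger root lies on the branch where utility falls as the fee rises.
    return (-qb + math.sqrt(disc)) / (2.0 * qa)
```

With quarterly returns the result is a quarterly fee, which would need to be annualized before being compared with an annual figure.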

Table 12 shows that the certainty equivalent gain of holding the managed portfolio is quite substantial, usually ranging from 2% to 3%. Moreover, transaction costs have a small effect on the results.

Table 12 Certainty Equivalence Gains from Holding Managed Portfolio: 1968:Q2–2002:Q4

IV. Conclusion

In this paper, I show that the out‐of‐sample predictability of stock market returns is both statistically and economically significant. More important, in sharp contrast to early empirical work, I find that, in conjunction with the consumption‐wealth ratio, stock market volatility has strong forecasting power for stock market returns—a key implication of the CAPM. My results thus suggest that stock return predictability is not inconsistent with rational pricing, a point that has been emphasized by Campbell and Cochrane (1999) and Guo (2004), among others.

I also want to stress that the forecasting ability of the consumption‐wealth ratio is well motivated: It reflects a liquidity premium due to limited stock market participation, as in Guo (2004). In particular, it helps explain why the early authors failed to find significant forecasting power of volatility for stock returns: The risk and liquidity premiums are negatively related in the post–World War II sample. It also sheds light on the puzzling negative risk‐return relation documented in the early literature: Guo (2002b) shows that market risk is indeed positively priced if one controls for the liquidity premium.

It is important to notice that evidence that the CAPM and the Fama‐French model cannot explain the return on the managed portfolio does not necessarily pose a challenge to rational asset pricing theories. The reason is that, as shown by Merton (1973) and Campbell (1993), among others, a hedge for investment opportunity changes is also an important determinant of expected asset return, in addition to market risk. Using the same forecasting variables as in this paper, Guo (2002a) shows that Campbell’s (1993) intertemporal CAPM is quite successful in explaining the cross section of stock returns, including the momentum profit, which also challenges the CAPM and the Fama‐French model.12

Overall, stock return predictability documented in this paper has important implications for asset pricing and portfolio management and warrants attention in future research.

Appendix
cay versus tay

 

Brennan and Xia (2002) suggest that the predictive power of cay is spurious because, if calendar time is used in place of consumption, the resulting cointegration error, tay—an inanimate variable—performs as well as or better than cay. In their reply, Lettau and Ludvigson (2002) argue that, given that 99% of variations of consumption are explained by a time trend, tay, a seemingly inanimate variable, has more economic content than it appears. However, I still need to show that cay performs at least as well as tay, which I discuss in this appendix.

Table A1 presents the in‐sample regression results using the full sample from 1952:Q3 to 2002:Q4. Consistent with Brennan and Xia, row 1 shows that cay becomes statistically insignificant at the 5% level if tay is also included in the forecasting equation. However, this result is dramatically reversed if one adds past stock market variance to the forecasting equation: cay drives out tay in row 2. I find the same results if I also add rrel to the forecasting equation (row 3) or use two‐period‐lagged cay and tay (row 4).

Table A1. Forecasting One‐Quarter‐Ahead Excess Stock Market Returns: cay versus tay

I also repeat the exercises of Sections II and III using tay in place of cay. Consistent with the in‐sample regression results, I find that cay always outperforms tay in the out‐of‐sample tests if augmented with stock market variance. To conserve space, these results are not reported here but are available on request. Therefore, although the results by Brennan and Xia are interesting because they reflect an unstable relation between cay and excess stock market returns as a result of the omitted‐variable problem documented in this paper, they do not pose a challenge to the forecasting power of cay.
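The omitted‐variable mechanism referred to above can be illustrated with a small simulation. It is a stylized sketch, not a replication: `x1` and `x2` are hypothetical stand‐ins for cay and stock market variance, and the correlation of −0.6 and the unit slope coefficients are arbitrary assumptions. Two predictors that are negatively related to each other both forecast returns positively, so a regression that omits one of them understates the slope on the other.

```python
import numpy as np

def ols_slopes(y, *regressors):
    """OLS slope coefficients of y on a constant and the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

# Synthetic data, for illustration only (not the paper's data).
rng = np.random.default_rng(42)
n = 5000
rho = -0.6                               # assumed correlation between predictors
x1 = rng.standard_normal(n)              # stand-in for cay
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)  # stand-in for variance
ret = 1.0 * x1 + 1.0 * x2 + rng.standard_normal(n)            # next-period return

b_alone = ols_slopes(ret, x1)[0]         # x2 omitted: slope biased toward zero
b_joint = ols_slopes(ret, x1, x2)        # both included: slopes near their true values
print(f"slope on x1 alone: {b_alone:.2f}")
print(f"slopes on x1 and x2 jointly: {b_joint[0]:.2f}, {b_joint[1]:.2f}")
```

In population, the slope on `x1` alone equals its true value plus the correlation times the omitted slope, so it shrinks toward zero here. This is the sense in which a predictor can look weak or unstable by itself and regain forecasting power once the omitted variable is added.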

References

 
  • Ang, A., and G. Bekaert. 2001. Stock return predictability: Is it there? Unpublished working paper, Columbia University, Graduate School of Business.
  • Balduzzi, P., and A. Lynch. 1999. Transaction costs and predictability: Some utility cost calculations. Journal of Financial Economics 52:47–78.
  • Bernanke, B., and M. Gertler. 1989. Agency costs, net worth, and business fluctuations. American Economic Review 79:14–31.
  • Bossaerts, P., and P. Hillion. 1999. Implementing statistical criteria to select return forecasting models: What do we learn? Review of Financial Studies 12:405–28.
  • Breen, W., L. Glosten, and R. Jagannathan. 1989. Economic significance of predictable variations in stock index returns. Journal of Finance 44:1177–89.
  • Brennan, M., and Y. Xia. 2002. tay’s as good as cay. Unpublished working paper, University of California, Los Angeles, Anderson School of Business.
  • Campbell, J. 1987. Stock returns and the term structure. Journal of Financial Economics 18:373–99.
  • ———. 1993. Intertemporal asset pricing without consumption data. American Economic Review 83:487–512.
  • Campbell, J., and J. Cochrane. 1999. By force of habit: A consumption‐based explanation of aggregate stock market behavior. Journal of Political Economy 107:205–51.
  • Campbell, J., M. Lettau, B. Malkiel, and Y. Xu. 2001. Have individual stocks become more volatile? An empirical exploration of idiosyncratic risk. Journal of Finance 56:1–43.
  • Campbell, J., A. Lo, and C. MacKinlay. 1997. The econometrics of financial markets. Princeton, NJ: Princeton University Press.
  • Clark, T., and M. McCracken. 1999. Tests of equal forecast accuracy and encompassing for nested models. Unpublished working paper, Federal Reserve Bank of Kansas City.
  • Cumby, E., and D. Modest. 1987. Testing for market timing ability: A framework for evaluation. Journal of Financial Economics 19:169–89.
  • Fama, E., and K. French. 1989. Business conditions and expected returns on stocks and bonds. Journal of Financial Economics 25:23–49.
  • ———. 1993. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics 33:3–56.
  • Fleming, J., C. Kirby, and B. Ostdiek. 2001. The economic value of volatility timing. Journal of Finance 56:329–52.
  • Goyal, A., and I. Welch. 2003. Predicting the equity premium with dividend ratios. Management Science 49:639–54.
  • Graham, J., and C. Harvey. 1997. Grading the performance of market‐timing newsletters. Financial Analysts Journal 53:54–66.
  • Guo, H. 2002a. Time‐varying risk premia and the cross section of stock returns. Working Paper no. 2002‐013A, Federal Reserve Bank of St. Louis.
  • ———. 2002b. Understanding the risk‐return tradeoff in the stock market. Working Paper no. 2002‐001A, Federal Reserve Bank of St. Louis.
  • ———. 2004. Limited stock market participation and asset prices in a dynamic economy. Journal of Financial and Quantitative Analysis 39:495–516.
  • Inoue, A., and L. Kilian. 2002. In‐sample or out‐of‐sample tests of predictability: Which one should we use? Unpublished working paper, University of Michigan, Department of Economics.
  • Johannes, M., N. Polson, and J. Stroud. 2002. Sequential optimal portfolio performance: Market and volatility timing. Unpublished working paper, University of Chicago, Graduate School of Business.
  • Leitch, G., and E. Tanner. 1991. Economic forecast evaluation: Profits versus conventional error measures. American Economic Review 81:580–90.
  • Lettau, M., and S. Ludvigson. 2001. Consumption, aggregate wealth, and expected stock returns. Journal of Finance 56, no. 3:815–49.
  • ———. 2002. tay’s as good as cay: Reply. Unpublished working paper, New York University, Department of Economics.
  • McCracken, M. 1999. Asymptotics for out‐of‐sample tests of causality. Unpublished working paper, Louisiana State University.
  • Merton, R. 1973. An intertemporal capital asset pricing model. Econometrica 41:867–87.
  • ———. 1980. On estimating the expected return on the market: An exploratory investigation. Journal of Financial Economics 8:323–61.
  • Patelis, A. 1997. Stock return predictability and the role of monetary policy. Journal of Finance 52:1951–72.
  • Pesaran, M., and A. Timmermann. 1995. Predictability of stock returns: Robustness and economic significance. Journal of Finance 50:1201–28.
  • Schwert, G. 1990. Indexes of stock prices from 1802 to 1987. Journal of Business 63:399–426.
  • * I want to thank Martin Lettau, Sydney Ludvigson, Mike Pakko, Albert Madansky (the editor), and an anonymous referee for very helpful suggestions. George Fortier provided excellent editorial support. The views expressed in this paper are those of the author and do not necessarily reflect the official positions of the Federal Reserve Bank of St. Louis or the Federal Reserve System.

  • 1. Brennan and Xia (2002) argue that the forecasting power of cay is spurious because if calendar time is used in place of consumption, the resulting cointegration error, tay, performs as well as or better than cay in predicting stock returns. In the Appendix, I show that cay always drives out tay if one adds past stock market variance and the stochastically detrended risk‐free rate to the forecasting equation. Therefore, although the results by Brennan and Xia are interesting because they reflect an unstable relation between cay and excess stock market returns due to the omitted‐variable problem documented in this paper, they do not pose a challenge to the forecasting power of cay.

  • 2. The short‐term interest rate and stock prices fell dramatically in 2001–2. This episode has a large impact on the forecasting power of rrel: It is significant if these two years are excluded from the post‐1980 sample.

  • 3. Patelis (1997) suggests that variables such as rrel reflect the stance of monetary policies, which have state‐dependent effects on real economic activities through a credit channel (e.g., Bernanke and Gertler 1989).

  • 4. This result is sensitive to the observations of the last few years in the sample, during which rose steeply, as shown in fig. 2: It becomes statistically significant if we use only the data up to 2000.

  • 5. Adding other commonly used forecasting variables, e.g., the dividend yield, the default premium, and the term premium, does not improve the forecasting power. These results are available on request.

  • 6. The Bureau of Economic Analysis (BEA) releases consumption and labor income data with about a one‐month delay. Given that the BEA only processes but does not create data, it is possible, although unlikely, that practitioners in financial markets may obtain these data without delay. More important, cay is a proxy for the conditional stock market return, and practitioners may obtain similar information from alternative sources. That said, I find similar results using two‐period‐lagged cay.

  • 7. Because consumption, net worth, and labor income data are subject to revisions, my results, which utilize the current vintage data, are potentially different from those obtained using the real‐time data. While it is not clear whether the current vintage data are biased toward finding predictability, the real‐time issue is beyond the scope of this paper, and I leave it for future research.

  • 8. This result is in contrast with that in table 3, in which rrel provides negligible information beyond augmented cay. One possible explanation is that, given that recursively estimated cay is likely to have large measurement errors in the early period, rrel provides additional information in table 4 because it is closely related to “true” cay estimated using the full sample (as shown in table 1).

  • 9. As in Graham and Harvey (1997) and Johannes et al. (2002), I scale the return on the managed portfolio, e.g., through leverage, so that it has the same standard deviation as the stock market return. The scaled return is then used to calculate the Sharpe ratio in the usual way.

  • 10. The results are not sensitive to reasonable variations in γ.

  • 11. SMB and HML are obtained from Kenneth French at Dartmouth College.

  • 12. Although the Fama‐French model is intended to capture the hedge for investment opportunity changes, its choice of additional risk factors is admittedly ad hoc.

© 2006 by The University of Chicago. All rights reserved.