US mutual fund performance studies

US mutual fund performance studies compare various methodologies for measuring stock mutual fund performance. Additional studies examine bond fund performance.

Stock funds
In a 2011 survey paper, "Mutual Funds", Elton, Gruber and Blake review the modern literature on the characteristics and performance of mutual funds. The following tables are derived from the study. Studies include those which examine the performance of stock mutual funds (after expenses) and those which study the pre-expense performance of stock funds. Other studies examine stock fund managers' timing ability; additional studies explore the persistence of stock fund performance.

Performance
Early studies on stock fund performance (included in the "Mutual Fund Performance (post expenses)" table below) use benchmarks derived from single-factor measures such as the Sharpe ratio, the Treynor ratio, or Jensen's alpha, all based on the CAPM (Capital Asset Pricing Model). As many stock funds hold portfolios that differ from the overall market (for example, concentrating on mid-cap or small-cap stocks), later studies draw upon the development of multi-factor models, such as the Fama-French three-factor model (market, size, and value) or a four-factor model, created by adding momentum to the three Fama-French factors. A fifth factor, involving bond returns, is also sometimes used.
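The single-index measures named above can be illustrated with a minimal Python sketch. It assumes the inputs are per-period returns already in excess of the risk-free rate, takes the Treynor measure to be mean excess return per unit of beta, and uses made-up numbers; the function name `single_index_measures` is hypothetical and is not taken from any of the cited studies.

```python
from statistics import mean, stdev

def single_index_measures(fund_excess, market_excess):
    """Return (sharpe, treynor, jensen_alpha, beta) from per-period excess returns."""
    mf, mm = mean(fund_excess), mean(market_excess)
    # Beta is the slope of the fund's excess return on the market's excess return.
    cov = sum((f - mf) * (m - mm) for f, m in zip(fund_excess, market_excess))
    var = sum((m - mm) ** 2 for m in market_excess)
    beta = cov / var
    sharpe = mf / stdev(fund_excess)   # reward per unit of total risk
    treynor = mf / beta                # reward per unit of market risk
    jensen_alpha = mf - beta * mm      # CAPM intercept
    return sharpe, treynor, jensen_alpha, beta

# Made-up monthly excess returns, for illustration only.
market = [0.02, -0.01, 0.03, -0.02, 0.01, 0.00]
fund = [0.001 + 1.2 * m for m in market]  # built with alpha 0.001 and beta 1.2

sharpe, treynor, alpha, beta = single_index_measures(fund, market)
print(f"Sharpe={sharpe:.3f} Treynor={treynor:.4f} alpha={alpha:.4f} beta={beta:.2f}")
```

A multi-factor benchmark follows the same idea: the single market regressor is replaced by several factor return series, and the intercept of that regression is read as the fund's alpha.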

Studies using mutual fund holdings data (included in the table "Using Holdings Data (Pre-Expenses)" below) attempt to determine manager skill.

Conclusion: Both sets of studies indicate that, before costs, managers, on average, possess stock selection skill, but actual performance is insufficient to overcome the costs of management.

Timing
Researchers have used two different ways to measure mutual fund managers' timing ability. One method uses return data and a second uses holdings data. In addition, two computational models, and their variants, commonly serve as the basis for calculations used in timing studies: the Treynor and Mazuy (1966) method, and the Henriksson (1984) and Henriksson and Merton (1981) method.
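The two return-based timing regressions can be sketched as follows. This is an illustrative sketch only, assuming one common textbook form of each model: Treynor and Mazuy add a squared market term, and Henriksson and Merton add the term max(0, -market); in both, a positive coefficient on the added term is read as evidence of timing. The helper names and the sample returns are hypothetical, and no standard errors or significance tests are computed.

```python
def ols(y, columns):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * z for x, z in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def treynor_mazuy_gamma(fund, market):
    """Coefficient on the squared market excess return."""
    return ols(fund, [market, [m * m for m in market]])[2]

def henriksson_merton_gamma(fund, market):
    """Coefficient on max(0, -market excess return)."""
    return ols(fund, [market, [max(0.0, -m) for m in market]])[2]

# Made-up excess returns, for illustration only.
market = [0.04, -0.03, 0.02, -0.05, 0.01, 0.03, -0.02, 0.05]
timer = [1.2 * m if m > 0 else 0.6 * m for m in market]  # higher beta in up markets
non_timer = [0.9 * m for m in market]                    # constant beta

tm_timer = treynor_mazuy_gamma(timer, market)
hm_timer = henriksson_merton_gamma(timer, market)
tm_flat = treynor_mazuy_gamma(non_timer, market)
hm_flat = henriksson_merton_gamma(non_timer, market)
print(tm_timer, hm_timer, tm_flat, hm_flat)
```

The hypothetical timing fund is built so that its beta is 1.2 in up markets and 0.6 in down markets, which the Henriksson-Merton form represents exactly; the constant-beta fund should show timing coefficients of essentially zero under both models.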

Early studies used returns data and found no evidence of successful timing. However, subsequent studies have used fund holdings data to measure timing. Using holdings data, these studies find evidence of positive timing: "Bollen and Busse (2001) found significant positive timing using daily data and a time series regression, and Kaplan and Sensoy (2005) and Jiang, Yao and Yu (2007) find positive timing using holdings data. All of these studies measure timing by looking at changes in the sensitivity to a single index."

Other researchers, while in agreement with using fund holdings data, argue that funds need to be benchmarked using multi-factor models because changes in the sensitivity to the market often come about because of changes in sensitivity to other factors (for example, by adding smaller cap stocks or higher beta stocks to the fund portfolio). This approach is taken by Elton, Gruber and Blake (2011), using one-index, two-index (stocks/bonds), Fama-French, and sector rotation models, and by Ferson and Qian (2006). Both of these studies find evidence of successful timing when measuring sensitivity to a single index, but find no evidence of positive timing when using multiple factors as benchmarks.

Conclusion: Studies on timing using returns data show no evidence of positive timing. Using holdings data, funds show positive timing when measured by their sensitivity to a single index. Positive timing is not found when the funds are measured by their sensitivity to multiple factors.

Persistence of fund manager performance
Academics judge outperformance by positive alpha from an appropriate multi-index model.

Almost all studies on persistence in mutual fund performance find that poor performance in one period predicts poor performance in subsequent periods. A common characteristic of these poorly performing funds is high expenses.

Many researchers have found a positive alpha over subsequent periods when ranking is done by alpha or alpha over residual risk. These studies include: "Carhart (1997), who found when funds were ranked by alpha the top ranked group had positive alphas over the next five years; Busse and Irvine (2006), who found persistence and positive alphas using Bayesian estimates; Gruber (1996); Elton, Gruber and Blake (1996), Elton, Gruber and Blake (2011), and Cohen, Coval and Pastor (2005), all of which find persistence for the top-ranked group and that the top group has a positive alpha."
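The basic shape of such a persistence test can be sketched in Python: rank funds by alpha in a ranking period, group them, and compare each group's average alpha in the following period. This is a simplified illustration rather than the procedure of any cited study; the alphas are made-up figures (percent per year), the function name is hypothetical, and real studies estimate alpha from a multi-index model and typically use deciles rather than halves.

```python
from statistics import mean

def subsequent_alpha_by_rank(alphas, n_groups=2):
    """Average subsequent-period alpha for groups ranked on ranking-period alpha.

    alphas maps fund name -> (ranking-period alpha, subsequent-period alpha).
    Groups are returned from top-ranked to bottom-ranked.
    """
    ranked = sorted(alphas, key=lambda f: alphas[f][0], reverse=True)
    size = len(ranked) // n_groups
    groups = [ranked[i * size:(i + 1) * size] for i in range(n_groups)]
    groups[-1].extend(ranked[n_groups * size:])  # leftover funds join the bottom group
    return [mean(alphas[f][1] for f in group) for group in groups]

# Hypothetical alphas in percent per year, for illustration only.
sample = {
    "Fund A": (0.8, 0.3),
    "Fund B": (0.5, 0.1),
    "Fund C": (-0.2, -0.4),
    "Fund D": (-0.9, -1.1),
}

top, bottom = subsequent_alpha_by_rank(sample)
print(f"top group: {top:.2f}, bottom group: {bottom:.2f}")
```

Persistence in the sense described above would show up as the top group keeping a positive average alpha and the bottom group keeping a negative one in the subsequent period.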

Studies questioning the effectiveness of using multi-index models for judging persistence include Chan, Dimmock and Lakonishok (2010), Cremers, Petajisto and Zitzewitz (2010), and Elton, Gruber and Blake (1999).

Chan, Dimmock and Lakonishok (2010) find that results can vary depending on how the Fama-French factors are defined. For example, they find that the Russell 1000 growth index can have an alpha ranging from -1.66% to +1.08% depending on how the factors are defined. Cremers, Petajisto and Zitzewitz (2010) find the S&P index has a positive alpha using the Fama-French three-factor model. Elton, Gruber and Blake (1999) describe how growth is a complex variable that perhaps is not being modeled properly.

Fama and French (2010) note that fund performance showed stronger evidence of positive performance persistence during the 1975 to 2002 period, and that such performance has waned over the 1984 to 2006 period. The study concludes that a prediction that most fund managers have sufficient skill to cover their costs is not supported by the data. They find that poor performers do not appear to be so by chance, while those funds that have done extremely well may have obtained these results by chance.

Bond funds
The four bond fund studies each develop and use a differing array of benchmarks to evaluate bond mutual funds.

Blake, Elton & Gruber (1994)
This study evaluates bond funds using three sets of benchmarks:
 * 1) A one-index model (either a general bond index or the submarket index that Morningstar identified as most like the bond fund).
 * 2) Two three-index models.
 * 3) A six-index model. The six indexes were based on the major types of securities held by the fund and included an intermediate government bond index, a long-term government bond index, an intermediate corporate bond index, a long-term corporate bond index, a high-yield bond index, and a mortgage bond index.

Elton, Gruber and Blake (1995)
This study employs both time-series and cross-sectional tests on bond pricing and develops a new six-index model of bond pricing. The six variables include:
 * 1) An aggregate index of stock returns;
 * 2) An aggregate index of bond returns;
 * 3) A measure of risk premium in the bond market (return on high yield bonds minus a government bond index);
 * 4) A series to represent option valuation (the return on mortgage bonds);
 * 5) Two variables to measure unanticipated changes in inflation and unanticipated changes in GNP.

Comer and Rodriguez (2006)
Comer and Rodriguez use a single-index model and test a six-index model:
 * 1) A single-index model.
 * 2) A six-index model. The six indexes they employ include three corporate government maturity return indexes (1 to 5 years, 5 to 10 years, and beyond 10 years), the return on high-yield bonds, the return on mortgages, and the return on Treasury bills.

Chen, Ferson and Peters (2010)
Chen, Ferson and Peters chose indexes based on:
 * 1) The term structure of interest rates;
 * 2) Credit spreads;
 * 3) Liquidity spreads;
 * 4) Mortgage spreads;
 * 5) Exchange rates;
 * 6) A measure of dividend yield and equity volatility.

Conclusion: All models show consistent bond fund underperformance. These studies suggest that models of bond fund performance should include at least three indexes: a general bond index, a risk index (capturing high yield bond performance), and an index to measure option-like qualities (capturing mortgage-backed security performance).

Mutual fund investor performance
The following papers examine individual investor mutual fund performance in the US contributory pension program. These papers document numerous behavioral pitfalls identified by behavioral economics.