Simba's backtesting spreadsheet

 describes a spreadsheet originally developed by forum member Simba for the purpose of acting as a reference for historical returns, and analyzing a portfolio based on such historical data. The spreadsheet is no longer maintained by Simba, but other forum members continue to support it and to expand functionality, as a Bogleheads community project.

Backtesting
Backtesting is a term used in oceanography, meteorology and the financial industry to refer to testing a predictive model using existing historic data. It is also often used to analyze the past for research purposes. It is notably useful to quantitatively assess the impact asset allocation had (in known history) on possible investment strategies.

Backtesting seeks to estimate the performance of a strategy if it had been employed during a past period. This requires simulating past conditions with sufficient detail, making one limitation of backtesting the need for detailed historical data. A second limitation is the inability to model strategies that would affect historic prices.

Finally, backtesting, like other modeling, is limited by potential overfitting. That is, it is often possible to find a strategy that would have worked well in the past, but will not work well in the future.

Despite these limitations, backtesting provides valuable information not available when models and strategies are tested on synthetic data.

Spreadsheet overview and download instructions
The spreadsheet is discussed in this.

The latest version and download instructions are in [ this post].

Detailed instructions, glossary and revision history can be found in the README tab. Here is a brief overview of the individual worksheets:


 * Analyze_Portfolio provides a simple way to customize the asset allocation of a single portfolio and study its historical performance over specific time periods. This worksheet compares a custom portfolio to a benchmark portfolio (e.g. 60/40) and provides charts and statistics to illustrate respective growth, drawdowns and more.
 * Compare_Portfolios allows to compare two sets of 5 different portfolios, with the ability to change the portfolios' composition, the time period being examined, and to check the impact on various metrics and charts.
 * Lazy_Portfolios compares 25 predefined portfolios and provides risk/return statistics and charts for all portfolios.
 * Portfolio_Math calculates portfolio returns for all combinations of interest and performs most of the 'under the hood' calculations.
 * Analyze_Series investigates annual returns for all selected data series. Various statistics are provided, as well as a correlation matrix and rolling returns over various time periods.
 * Data_Series provides a selection mechanism to choose the data series to be analyzed and combined in portfolios in the rest of the spreadsheet.
 * Raw_Data provides a dynamic splicing mechanism to combine raw data (e.g. funds, indices and synthetic models) into full data series, while applying expense ratio adjustments.
 * Data_Sources identifies the various data sources used in this spreadsheet. More details on a per data series basis can be found in Raw_Data.

Historical Returns
The spreadsheet provides an extensive set of historical returns for various types of passive index funds, plus a few active funds. Most funds are from Vanguard and all corresponding historical returns have been validated with Vanguard. All returns are expressed as total returns, i.e. including dividends. The perspective of a U.S. investor is assumed, with returns expressed in USD.

When actual fund returns are not available (e.g. early years), attempts were made to provide credible numbers to extend the fund's history, using returns from corresponding indices, and in some cases, using a synthetic model.

In general, fairly good quality data is available starting in 1927 for U.S. stocks and bonds. Most International returns are available starting in 1970. Most sector returns aren't available before 1985.

The Raw_Data worksheet is the main repository for such historical data. The Data_Series worksheet assembles a synthetic list of data series returns and can be easily copied in another spreadsheet, allowing to perform other types of backtesting analysis than the tools provided by the Simba spreadsheet.

Telltale chart
John Bogle stated: "'A telltale chart is devised simply by dividing the cumulative returns of one data series into another'' (the benchmark)." In the Simba spreadsheet, such a chart compares the trajectory of historical returns for the various portfolios of interest to a Telltale benchmark (itself a portfolio), in a relative manner.

Using Telltale charts can be very informative, truly 'telling the tale' of what happened over time to portfolio trajectories, illustrating return to the mean properties, or lack thereof.

It is interesting to observe that growth charts are actually a special form of telltale charts, and can be generated by the same tool.

A telltale/growth chart is provided in the Compare_Portfolios worksheet and can be used in multiple ways:


 * If the benchmark is defined as 100% cash, the telltale chart becomes a simple growth chart (i.e. showing the growth of an initial investment over time, in nominal terms).
 * If the benchmark is defined as 100% inflation, the telltale chart becomes an inflation-adjusted growth chart (i.e. showing the growth of an initial investment over time, in real terms).
 * If a more concrete benchmark is used (e.g. US Total Market, aka TSM), the telltale chart shows the relative growth over time of the portfolios of interest compared to the benchmark.

Typical examples of Telltale charts generated with the Simba spreadsheet can be found here: Telling Tales – 2017 update.

Compatibility
The spreadsheet is maintained using Microsoft Excel 2011, and provided in XLSX format.

It should work fairly well with recent versions of LibreOffice Calc, with some minor limitations (e.g. dynamic chart titles do not automatically update). Once the file has been downloaded from Google Drive, open the file and save it in the suggested "ODF" default format. Otherwise, the charts will not be preserved in the correct format.

There is limited compatibility with Google Sheets. The calculations and statistics appear to work well, while the charts do not work properly.

Inner Workings
As any complex spreadsheet, the inner workings of the Simba spreadsheet may not be quite obvious by just looking at formulas. The following series of blog articles document various aspects of how it actually works.


 * Simba backtesting spreadsheet: a layered structure, elaborating on how the spreadsheet is constructed, and providing an overview of its layered structure.
 * Simba backtesting spreadsheet: risk metrics, risk ratios: elaborating on how risk metrics (e.g. volatility, drawdowns, etc) and risk ratios (e.g. Sharpe, Sortino, etc) are computed.
 * Simba backtesting spreadsheet: advanced topics: elaborating on more advanced topics (e.g. unbalancing, safe withdrawal rate).
 * Simba backtesting spreadsheet: miscellaneous topics: addressing a few miscellaneous topics (e.g. end label on charts, compatibility issues, spreadsheet analytics).