How much does it matter, considering scatter? Part 1 of many: Introduction

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Tue Jul 31, 2018 3:58 pm

Image

In the forum, it is often suggested that the Vanguard Total Stock Market Index Fund, VTSMX, is a better choice than VFINX, the Vanguard 500 Index Fund, because it includes small-caps, and more mid-caps than VFINX. The image above is one way to look at the comparison.

When comparing funds, portfolios, or strategies, there are three questions that could be asked. 1) Which had higher performance? 2) Do we expect that outperformance to continue, and, if so, why? The third, which doesn't get asked enough, is 3) How big is that difference, compared to the uncertainty of future fluctuations in portfolio value?

Monte Carlo simulations are often used to show the likely variability of our future retirement savings, in order to measure it against our future needs. I decided to use them for comparisons between two funds, or two portfolio strategies, in order to see how the past difference in performance measures up against the likely future uncertainty of the alternatives.

This kind of diagram still leaves us with the question of how much conviction we have that the differences we've seen in the past will persist in the future. In the chart above, the difference between 500 index and Total Stock is small compared to the spread of outcomes from either one. However, someone might say that even a small statistical edge is worth having. So this won't settle any debates. It doesn't tell us which way the needle will move. But it does suggest how much changing a portfolio moves the needle, relative to the natural spread of outcomes.

Here's what the chart is showing.
  • The green and red dots are the final outcomes of 500 Monte Carlo simulations of the possible performance of the two funds.
  • The performance is not based on the growth of a single investment in the fund. It is, instead, based on the outcome of making periodic fund purchases of $100/month over the time period shown. It thus is closer to the way most of us actually invest.
  • The green and red crosses mark the outcome based on the actual historical performance of the funds.
  • The red and green range lines show the 10th and 90th percentiles of those results. In this case, 80% of the simulations showed final values for VFINX between $72,000 and $182,000; 10% were below $72,000, 10% were above $182,000. The middle mark, $113K, is the median.
  • The basis for all of the data that went into the simulation is the set of "monthly values" tabulated by the PortfolioVisualizer "Backtest Portfolio" tool.
  • The range of time shown is that for all available data in PortfolioVisualizer.
  • The vertical scale represents the final value in dollars.
  • The horizontal scale is the annualized standard deviation of the monthly returns of the plotted assets. It is a measure of how much fluctuation you would see in portfolio value.
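The two axes described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the chart: the function names are hypothetical, returns are decimal fractions (0.01 for 1%), each $100 is assumed to be invested at the start of the month, and "annualized" is assumed to mean the common convention of multiplying the monthly standard deviation by the square root of 12.

```python
import math
import statistics


def final_value(monthly_returns, contribution=100.0):
    """Ending balance after investing `contribution` every month.

    Assumption: the contribution goes in at the start of each month
    and then earns that month's return.
    """
    balance = 0.0
    for r in monthly_returns:
        balance = (balance + contribution) * (1.0 + r)
    return balance


def annualized_stdev(monthly_returns):
    """Sample standard deviation of monthly returns, scaled by sqrt(12).

    Assumption: sqrt(12) scaling is the annualization convention in use.
    """
    return statistics.stdev(monthly_returns) * math.sqrt(12)
```

Each dot on the chart would then be one pair `(annualized_stdev(series), final_value(series))` for one simulated return series, and each cross would be the same pair for the actual historical series.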
A future posting will give more details about the method I've used. But I will just say here that the method I used is conservative and almost certainly underestimates the range of variation. For comparison, here are the 10th- and 90th-percentile outcomes for the S&P 500 and its predecessors, obtained by three methods.

1) My simulation: $72,000 - $182,000. Top = 2.5X bottom.
2) PortfolioVisualizer (extrapolated to 26 years): $59,501 - $311,076. Top = 5.2X bottom.
3) Actual historic, 26-year periods starting 1870: $77,191 - $286,582. Top = 3.71X bottom.

Other notes.

1) I simply use the time period for which data is available for the investment being explored. This means that results for, say, a 26-year time period are not directly comparable in dollars to those for a 5-year time period. I did this for two reasons. The first is that compounding affects everything equally, both the historical values and the simulation, while the semilog plot means that the effect of compounding is the same as the effect of a vertical zoom. A longer time period magnifies both the difference and the spread. Therefore, it doesn't interfere with the point of the exercise, which is to judge difference relative to spread. Second, I just couldn't come up with a way to "normalize" everything to the same time period that was easy to explain or to justify.

2) For the "baseline" comparison, I try to use Bogleheadish index funds and portfolios. I choose the baseline funds depending on how much data I have for the comparison fund; for example, Total Stock if it goes back as far as the comparison fund, but 500 Index if I need to go back farther.

3) For the most part, I try to compare funds or strategies in the context of a portfolio with both stocks and bonds, not in isolation.
Annual income twenty pounds, annual expenditure nineteen nineteen and six, result happiness; Annual income twenty pounds, annual expenditure twenty pounds ought and six, result misery.

vineviz
Posts: 5391
Joined: Tue May 15, 2018 1:55 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by vineviz » Tue Jul 31, 2018 4:10 pm

nisiprius wrote:
Tue Jul 31, 2018 3:58 pm
When comparing funds, portfolios, or strategies, there are three questions that could be asked. 1) Which had higher performance? 2) Do we expect that outperformance to continue, and, if so, why? The third, which doesn't get asked enough, is 3) How big is that difference, compared to the uncertainty of future fluctuations in portfolio value?
Thanks for doing this and for sharing the results.

I often fall into the trap of fixating on small differences in inputs EVEN THOUGH I know the variance in the outputs (i.e. realized portfolio returns) is huge, and the variance due to random chance dwarfs by orders of magnitude the variance we can control through asset allocation.

Some people will quibble over the validity of descriptive statistics, but I'm pretty sure I've run comparisons (I don't have the notes handy) in which I tested whether the difference in mean return between a 100% stock portfolio and a 50% stock/50% bond portfolio was statistically significant, only to find that it was NOT. Even using monthly returns back to 1930.
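The post does not say which test was used, so the following is only one plausible way to run such a comparison: a Welch t-statistic on two series of monthly returns, with a normal approximation for the p-value (reasonable only for large samples such as many decades of monthly data). The function names are hypothetical. Because both portfolios live through the same months, a paired test on the month-by-month differences would be another defensible choice.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch t-statistic for the difference in means of two samples."""
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


def two_sided_p_normal(t):
    """Two-sided p-value from the standard normal distribution.

    Assumption: the samples are large enough that the t distribution
    is well approximated by the normal.
    """
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))
```

A large p-value from `two_sided_p_normal(welch_t(stock_returns, balanced_returns))` would correspond to the "not statistically significant" finding described above; `stock_returns` and `balanced_returns` stand for whatever monthly series are being compared.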

As investors I think we tend to fixate on the precision with which we can measure financial data to the point that we forget that our accuracy in hitting future return targets is like trying to consistently hit a target at 200 yards with a Brown Bess musket.
Last edited by vineviz on Tue Jul 31, 2018 5:15 pm, edited 1 time in total.
"Far more money has been lost by investors preparing for corrections than has been lost in corrections themselves." ~~ Peter Lynch

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Tue Jul 31, 2018 4:48 pm

Thank you for the kind words.
vineviz wrote:
Tue Jul 31, 2018 4:10 pm
...As investors I think we tend to fixate on the precision with which we can measure financial data to the point that we forget that our accuracy in hitting future return targets is like trying to consistently hit a target at 200 yards with a Brown Bess musket...
Indeed. Of course, much of this is the result of the investment industry's need for "unique selling propositions."

GAAP
Posts: 919
Joined: Fri Apr 08, 2016 12:41 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by GAAP » Tue Jul 31, 2018 5:01 pm

Wow! Nice Work!

These probably belong in the Wiki.

onthecusp
Posts: 716
Joined: Mon Aug 29, 2016 3:25 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by onthecusp » Tue Jul 31, 2018 5:56 pm

I really like the visualization!

It fits my view that a lot of what I do does not matter much as long as I get the big questions right. It also reinforces my decision to tilt small however little it may help.

David Jay
Posts: 7163
Joined: Mon Mar 30, 2015 5:54 am
Location: Michigan

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by David Jay » Tue Jul 31, 2018 6:02 pm

I’ve glanced through Parts 1-4. Fun! Scatter plots are an interesting presentation method.
Prediction is very difficult, especially about the future - Niels Bohr | To get the "risk premium", you really do have to take the risk - nisiprius

MJW
Posts: 724
Joined: Sun Jul 03, 2016 7:40 pm
Location: Pacific Northwest

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by MJW » Tue Jul 31, 2018 6:19 pm

Nisiprius - Thank you for your generosity in the time and effort that went into this series of posts. I was interested to see the results.

gwrvmd
Posts: 817
Joined: Wed Dec 02, 2009 8:34 pm
Location: Calabash NC

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by gwrvmd » Tue Jul 31, 2018 10:33 pm

Thank You Nisiprius.....You have done some great graph presentations on here but this is the BEST.....Gordon
Disciple of John Neff

Noobvestor
Posts: 4767
Joined: Mon Aug 23, 2010 1:09 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by Noobvestor » Tue Jul 31, 2018 10:59 pm

Have you done one yet comparing various (inter)national stock markets? Could be interesting.
"In the absence of clarity, diversification is the only logical strategy" -= Larry Swedroe

marcopolo
Posts: 2399
Joined: Sat Dec 03, 2016 10:22 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by marcopolo » Tue Jul 31, 2018 11:00 pm

Interesting analysis. Perhaps the Mahalanobis Distance would be a useful metric to evaluate whether two securities/portfolios provide meaningfully different performance? I think it might sum up, in a single number, what is shown quite nicely in the plots you generated.
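For readers unfamiliar with the metric, here is a minimal sketch of a two-dimensional Mahalanobis distance using only the standard library. How it would be applied to these charts is an assumption: one option is to treat one fund's simulated dots as the `cloud` and the other fund's historical cross as the `point`. The function name is hypothetical.

```python
import math


def mahalanobis_2d(point, cloud):
    """Mahalanobis distance from `point` to the mean of a 2-D `cloud`.

    `cloud` is a sequence of (x, y) pairs; `point` is one (x, y) pair.
    """
    n = len(cloud)
    mean_x = sum(x for x, _ in cloud) / n
    mean_y = sum(y for _, y in cloud) / n

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in cloud) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in cloud) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in cloud) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    dx = point[0] - mean_x
    dy = point[1] - mean_y
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the inverse written out.
    d_squared = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)
```

The result is a distance measured in "standard deviations of the cloud," so a value well below 1 would say the point sits comfortably inside the scatter. On a chart with a logarithmic dollar axis, it may be more sensible to pass the logarithm of the final value as the y coordinate.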
Once in a while you get shown the light, in the strangest of places if you look at it right.

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Wed Aug 01, 2018 6:35 am

nisiprius wrote:
Tue Jul 31, 2018 3:58 pm
Here's what the chart is showing.
  • The green and red dots are the final outcomes of 500 Monte Carlo simulations of the possible performance of the two funds.
  • The performance is not based on the growth of a single investment in the fund. It is, instead, based on the outcome of making periodic fund purchases of $100/month over the time period shown. It thus is closer to the way most of us actually invest.
  • The green and red crosses mark the outcome based on the actual historical performance of the funds.
  • The red and green range lines show the 10th and 90th percentiles of those results. In this case, 80% of the simulations showed final values for VFINX between $72,000 and $182,000; 10% were below $72,000, 10% were above $182,000. The middle mark, $113K, is the median.
  • The basis for all of the data that went into the simulation is the set of "monthly values" tabulated by the PortfolioVisualizer "Backtest Portfolio" tool.
  • The range of time shown is that for all available data in PortfolioVisualizer.
  • The vertical scale represents the final value in dollars.
  • The horizontal scale is the annualized standard deviation of the monthly returns of the plotted assets. It is a measure of how much fluctuation you would see in portfolio value.
Hm. I don't get it. Could you please elaborate on the meaning of the various green dots. What changes from one green dot to another? It seems that the time period is always the same (correct?). Then what is the varying factor?

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Wed Aug 01, 2018 7:06 am

marcopolo wrote:
Tue Jul 31, 2018 11:00 pm
interesting analysis. Perhaps the Mahalanobis Distance would be a useful metric to evaluate whether two securities/portfolios provide meaningfully different performance? I think it might sum up, in a single number, what is shown quite nicely in the plots you generated.
I learned something today: Mahalanobis Distance on Wikipedia.

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Wed Aug 01, 2018 8:33 am

siamond wrote:
Wed Aug 01, 2018 6:35 am
... Then what is the varying factor?...
Different Monte Carlo simulation runs covering the same time period.

I expect spirited discussion once I post more details of the methodology, but I am convinced that the method I am using is very conservative and understates the range of variation.

In the example shown, we have data for 26 years = 312 months. We want Monte Carlo simulations of other possible results over the same time period. Divide the time into 156 pairs of months. For each simulation run, simulate each monthly return by randomly choosing one or the other of the two months within the same month-pair.

That is, choose month 1 randomly from either month 1 or 2 of the original record; choose month 2 randomly from either month 1 or month 2; choose month 3 randomly from month 3 or 4 of the original; choose month 4 randomly from either month 3 or 4; and so on.

Notice, first, that there's a 50% chance of any simulated month being exactly the same as its historic equivalent. On the average, only half the months will be different. Second, notice that every simulated month is no more than one month different in time from the historical reality. Thus, we are not using statistics that include any 1930s data, for example. The simulation runs all crash in 2008-2009, etc. I don't believe this actually matters much; we could have gotten essentially the same pictures just by adding in a reasonable random variation or whatever other simulations do, but I wanted to be sure I wasn't exaggerating the variability. I also chose to simulate "equal monthly $100 contributions," not growth of a single lump sum, because as long as I was doing tons of math anyway it was just as easy... and more realistic... and less frequently displayed.
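The month-pair procedure described above can be sketched as follows. This is a reconstruction from the description, not the original tool: the function names are hypothetical, contributions are assumed to go in at the start of each month, an unpaired trailing month (if the series has odd length) is assumed to be kept as-is, and the sketch handles a single return series (the post does not say whether two funds share the same random picks).

```python
import random
import statistics


def resample_pairs(monthly_returns, rng):
    """One simulated history: each month is drawn from its own month-pair."""
    simulated = []
    for i in range(0, len(monthly_returns) - 1, 2):
        pair = (monthly_returns[i], monthly_returns[i + 1])
        simulated.append(rng.choice(pair))  # simulated month i
        simulated.append(rng.choice(pair))  # simulated month i + 1
    if len(monthly_returns) % 2:
        # Assumption: a leftover month with no partner is left unchanged.
        simulated.append(monthly_returns[-1])
    return simulated


def final_value(monthly_returns, contribution=100.0):
    """Ending balance from investing `contribution` at the start of each month."""
    balance = 0.0
    for r in monthly_returns:
        balance = (balance + contribution) * (1.0 + r)
    return balance


def simulate(monthly_returns, runs=500, seed=0):
    """Return (10th percentile, median, 90th percentile) of simulated end values."""
    rng = random.Random(seed)
    outcomes = [final_value(resample_pairs(monthly_returns, rng))
                for _ in range(runs)]
    deciles = statistics.quantiles(outcomes, n=10)  # nine cut points
    return deciles[0], statistics.median(outcomes), deciles[-1]
```

With 312 months of history, `resample_pairs` produces 156 pairs, and on average half of the simulated months match their historical counterparts, as noted above.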

Image

vineviz
Posts: 5391
Joined: Tue May 15, 2018 1:55 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by vineviz » Wed Aug 01, 2018 8:53 am

nisiprius wrote:
Wed Aug 01, 2018 8:33 am
That is, choose month 1 randomly from either month 1 or 2 of the original record; choose month 2 randomly from either month 1 or month 2; choose month 3 randomly from month 3 or 4 of the original; choose month 4 randomly from either month 3 or 4; and so on.
This seems like a very reasonable solution to the challenge of choosing your input assumptions for the simulations.
nisiprius wrote:
Wed Aug 01, 2018 8:33 am
also chose to simulate "equal monthly $100 contributions," not growth of a single lump sum, because as long as I was doing tons of math anyway it was just as easy... and more realistic... and less frequently displayed.
The differences between allocations in accumulation phase, which you've illustrated nicely, are (IIRC) almost always going to be smaller than the differences between allocations with an initial lump-sum investment.

I think it might be helpful (at the very least, interesting), if you're up for it, to illustrate a few cases in which you plot both a lump sum and steady contributions for the same allocation pairs. I'm not sure how you'd scale the plots, though (maybe time-weighted CAGR instead of final $$ balance).

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Wed Aug 01, 2018 5:45 pm

nisiprius wrote:
Wed Aug 01, 2018 8:33 am
siamond wrote:
Wed Aug 01, 2018 6:35 am
... Then what is the varying factor?...
Different Monte Carlo simulation runs covering the same time period.

I expect spirited discussion once I post more details of the methodology, but I am convinced that the method I am using is very conservative and understates the range of variation.

In the example shown, we have data for 26 years = 312 months. We want Monte Carlo simulations of other possible results over the same time period. Divide the time into 156 pairs of months. For each simulation run, simulate each monthly return by randomly choosing one or the other of the two months within the same month-pair.
Ok, got it, thanks for the explanation. Like any Monte-Carlo simulation, this is highly suspect, of course, as it doesn't show reality; it just shows an artificial model that may very well lose some hidden dynamics of the real-life assets being looked at (notably sequential effects in consecutive months, e.g. momentum and the like; also the timing of dividend distributions; etc). Yeah, I know, I am not the first one to express skepticism at Monte-Carlo modeling... I agree you chose a model which appears to stay much closer to reality than many other MC models that have been used in the past. Still, I always shake my head at such 'let's reshuffle the returns' modeling attempts.

Another concern, and something that surprises me coming from the illustrious Nisiprius, is the fact that ALL simulations start and end on the exact same date. Now THAT is guaranteed to totally distort the outcomes, as you've been pointing out on this forum for years.

Why not try to analyze periods of 10, or 15, or 20 years, and vary the starting/end date? Actually, using monthly returns, you could even do away with the Monte-Carlo stuff and assemble a reasonably meaningful scatter graph.

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Wed Aug 01, 2018 8:03 pm

They don't all start on the same date.

The principle I've tried to follow is that if I want to look at, say, small caps in my "comparison," I look for a mutual fund that a) is a reasonable exemplar of "small caps," that b) has the earliest possible date of inception, and then I use "all available data." (In some cases, "all available data" is limited by PortfolioVisualizer, which only goes back to 1985).

For the "baseline" portfolio components, I try to use plain vanilla, representative, Bogleheadish funds. For stocks, Total Stock if the "comparison" fund has inception after 1992, Vanguard 500 Index if it is older than that. For bonds, Total Bond if the comparison fund has inception after 1986, otherwise... I wing it.

Because my Monte Carlo simulation stays fairly close in time to the historic data, any unusual characteristics of the time period should show up in all of the things being compared.

And I try to make it clear that it's all a blunt instrument.

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Wed Aug 01, 2018 10:19 pm

nisiprius wrote:
Wed Aug 01, 2018 8:03 pm
They don't all start on the same date.
Well, for a given comparison (a given scatter graph), it seems that all simulation runs (all points on the scatter graph) have the same start and end date, and the only varying factor is the reshuffling between month N and month N+1. Or did I misunderstand?

If that is the case, then we do have a troublesome case of sensitivity to the start/end date.
nisiprius wrote:
Wed Aug 01, 2018 8:03 pm
And I try to make it clear that it's all a blunt instrument.
Yes, yes, I understand and appreciate your efforts. You expected a spirited discussion, I am giving you some... :wink:

UpsetRaptor
Posts: 393
Joined: Tue Jan 19, 2016 5:15 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by UpsetRaptor » Wed Aug 01, 2018 11:45 pm

Nisi, I didn't see a tl;dr summary in OP, so I'll just ask it: Which stock are you saying I should short right now?

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Thu Aug 02, 2018 4:58 am

siamond wrote:
Wed Aug 01, 2018 10:19 pm
nisiprius wrote:
Wed Aug 01, 2018 8:03 pm
They don't all start on the same date.
Well, for a given comparison (a given scatter graph), it seems that all simulation runs (all points on the scatter graph) have the same start and end date, and the only varying factor is the reshuffling between month N and month N+1. Or did I misunderstand?

If that is the case, then we do have a troublesome case of sensitivity to the start/end date.
When I was developing this, I actually did look at endpoint sensitivity. Unfortunately, the whole tool is a Rube Goldberg kludge involving many manual steps, so at the moment I can't just press a button to rerun an analysis with different end dates... and the charts I made were with a different version of the simulation and were just straight growth, etc. The short story is that specific numbers, and details like whether the comparison portfolio had a higher or lower median return than the baseline, are endpoint sensitive as always. However, the general features--the surprisingly (?) wide spread of outcomes, and the general relationship as to which things matter a lot and which things matter less--are pretty stable. Anyway, your question is on my to-do list, but it's going to have to be something limited--like a one-shot in which I take one of the longer comparisons, and perform it twice, once for the first half of the available data and once for the second half.

But, sure, I haven't done the analysis but I'm pretty sure that the addition of international stocks will look like it "matters" more if we look at either 2000-2009 (outperformance) or 2010-present (underperformance) than at a longer period of time including both.

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Thu Aug 02, 2018 1:46 pm

nisiprius wrote:
Thu Aug 02, 2018 4:58 am
The short story is that specific numbers, and details like whether the comparison portfolio had a higher or lower median return than the baseline, are endpoint sensitive as always. However, the general features--the surprisingly (?) wide spread of outcomes, and the general relationship as to which things matter a lot and which things matter less, are pretty stable. Anyway, your question is on my to-do list, but it's going to have to be something limited--like a one-shot in which I take one of the longer comparisons, and perform it twice, once for the first half of the available data and once for the second half.
I never played much with scatter graphs, but I suspect that, if you start a new worksheet from scratch, it will turn out to be pretty straightforward to do what I suggested (a scatter graph of all possible rolling periods of X years that can fit in the overall time period under study, varying the starting month). No Monte-Carlo stuff, just historical reality. And if you do the math in real terms, it should be straightforward to include a regular investment pattern (accumulation) or withdrawals (distribution) for each rolling period.

And then maybe you'll find general features similar to your Monte-Carlo simulation, or maybe not. And maybe the similarity (or lack thereof) will depend on the asset type. Seems worth finding out.
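The rolling-period alternative suggested above (every X-year window that fits in the data, varying the starting month, no Monte Carlo) could be sketched like this. The function name is hypothetical, the input is assumed to be a list of monthly returns as decimal fractions (inflation-adjusted if the math is to be "in real terms"), and contributions are assumed to go in at the start of each month.

```python
def rolling_final_values(monthly_returns, years, contribution=100.0):
    """End values of regular investing over every rolling window of `years`.

    Returns one end value per possible starting month, in order.
    An empty list means the data is shorter than one window.
    """
    window = years * 12
    results = []
    for start in range(len(monthly_returns) - window + 1):
        balance = 0.0
        for r in monthly_returns[start:start + window]:
            balance = (balance + contribution) * (1.0 + r)
        results.append(balance)
    return results
```

With 312 months of data and 10-year windows this yields 193 historical end values per fund, which could be plotted as a scatter in place of the simulated dots. Note that adjacent windows overlap heavily, so the points are far from independent.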

tfb
Posts: 8142
Joined: Mon Feb 19, 2007 5:46 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by tfb » Thu Aug 02, 2018 7:14 pm

nisiprius wrote:
Tue Jul 31, 2018 3:58 pm
The green and red dots are the final outcomes of 500 Monte Carlo simulations of the possible performance of the two funds.
Possibly I still don't quite understand the methodology. We have a cloud of green dots and a cloud of red dots representing a range of outcomes. We show they pretty much overlap. Whether we choose S&P 500 or TSM, it doesn't affect our range of outcomes very much. We also know S&P 500 and TSM highly correlate with each other. If we run into a poor market environment, our performance will be poor regardless. However, because we have no control over whether we will run into a poor market environment, shouldn't we limit the comparison to the difference between two funds or portfolios in the same period and run simulation over the difference? In other words, if we choose the performance in month N for portfolio A we also choose the performance in month N for portfolio B, repeat and compare the difference. If over 500 simulations we show portfolio A outperforms portfolio B 80% of the time with a range of -1% to +5% of the end value we say to ourselves portfolio A is probably a better choice, at least based on our simulation.
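The paired comparison proposed above can be sketched as follows: each simulation run uses the same month picks for both portfolios, and only the relative difference in end value is kept. The post does not specify the resampling scheme, so this sketch assumes the month-pair scheme described earlier in the thread; the function names are hypothetical, and both return series are assumed to cover the same months.

```python
import random


def final_value(monthly_returns, contribution=100.0):
    """Ending balance from investing `contribution` at the start of each month."""
    balance = 0.0
    for r in monthly_returns:
        balance = (balance + contribution) * (1.0 + r)
    return balance


def paired_differences(returns_a, returns_b, runs=500, seed=0):
    """Relative end-value difference of A over B, one value per run.

    Both portfolios use identical month picks within each run, so market-wide
    good or bad luck cancels and only the A-versus-B gap remains.
    """
    if len(returns_a) != len(returns_b):
        raise ValueError("return series must cover the same months")
    n = len(returns_a)
    rng = random.Random(seed)
    diffs = []
    for _ in range(runs):
        picks = []
        for i in range(0, n - 1, 2):
            picks.append(rng.choice((i, i + 1)))
            picks.append(rng.choice((i, i + 1)))
        if n % 2:
            picks.append(n - 1)  # unpaired last month kept as-is
        end_a = final_value([returns_a[k] for k in picks])
        end_b = final_value([returns_b[k] for k in picks])
        diffs.append(end_a / end_b - 1.0)
    return diffs


def win_rate(diffs):
    """Fraction of runs in which A finished ahead of B."""
    return sum(d > 0 for d in diffs) / len(diffs)
```

The "outperforms 80% of the time with a range of -1% to +5%" style of statement would then come from `win_rate(diffs)` together with `min(diffs)` and `max(diffs)`, or with chosen percentiles of `diffs`.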
Harry Sit, taking a break from the forums.

heyyou
Posts: 3566
Joined: Tue Feb 20, 2007 4:58 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by heyyou » Thu Aug 02, 2018 7:27 pm

Consider using the differences to calculate how much extra savings per time unit would be needed by the person who chose the lower performing fund for the entire period. $100 per month would matter, but not $1-10 per month. My point being that both funds are very good for their intended purpose, it is the savings amount that matters far more than the fund selection that so many worry about so often.
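The "extra savings" framing above has a simple form when both investors contribute on the same schedule and earn a fixed sequence of returns: the end value scales in direct proportion to the monthly contribution, so the lower-performing fund needs its contribution scaled up by the ratio of the two end values. A minimal sketch, with a hypothetical function name and that proportionality as the stated assumption:

```python
def extra_monthly_savings(end_value_lower, end_value_higher, contribution=100.0):
    """Extra amount per month needed in the lower-performing fund to match
    the higher-performing fund's end value.

    Assumption: end value is proportional to the monthly contribution,
    which holds for equal periodic contributions over a fixed return history.
    """
    return contribution * (end_value_higher / end_value_lower - 1.0)
```

For example, if $100/month grew to 5% more in one fund than in the other, the investor in the weaker fund would have needed about $5 more per month to end up in the same place.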

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Thu Aug 02, 2018 7:50 pm

tfb wrote:
Thu Aug 02, 2018 7:14 pm
nisiprius wrote:
Tue Jul 31, 2018 3:58 pm
The green and red dots are the final outcomes of 500 Monte Carlo simulations of the possible performance of the two funds.
Possibly I still don't quite understand the methodology. We have a cloud of green dots and a cloud of red dots representing a range of outcomes. We show they pretty much overlap. Whether we choose S&P 500 or TSM, it doesn't affect our range of outcomes very much. We also know S&P 500 and TSM highly correlate with each other. If we run into a poor market environment, our performance will be poor regardless. However, because we have no control over whether we will run into a poor market environment, shouldn't we limit the comparison to the difference between two funds or portfolios in the same period and run simulation over the difference? In other words, if we choose the performance in month N for portfolio A we also choose the performance in month N for portfolio B, repeat and compare the difference. If over 500 simulations we show portfolio A outperforms portfolio B 80% of the time with a range of -1% to +5% of the end value we say to ourselves portfolio A is probably a better choice, at least based on our simulation.
I think your understanding of the methodology is correct. What we "should" analyze isn't an objective question, and, like risk tolerance, depends on personal values and attitudes.

For example, back when it took a $100,000 minimum to get Admiral shares, I realized that although I didn't have enough to get Admiral shares of either Total Stock or Total Bond, I did have enough to exchange $60,000 of Total Stock and $40,000 of Total Bond for $100,000 of Vanguard Balanced Index (which at that time was literally a 60/40 fund-of-two-funds). By doing this, I could cut my expense ratio by 0.10%, and thus increase my earnings by about $100/year initially. A little while after I did that, the stock market crashed. As a result of my being in VBIAX instead of the equivalent of VBINX, during the crash--multiply the Morningstar dollar numbers by ten--

Source
Image

--I only lost $34,471.00 instead of $34,531.80.

Now, there's just no way around the personal factor. Some people would say that $60.80 is $60.80, whether it's in the context of an extra gain or a reduced loss. For my part, I don't think that $60 either way matters very much to me in the face of a $30,000 variation.

So that's what I chose to look at: comparing the size of the differences resulting from portfolio choices to the natural range of variation that you see, both historically and in simulations, as the logical consequence of a daily standard deviation of 1%.
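The arithmetic in the Admiral-shares example can be checked in a few lines. All dollar figures below are the ones quoted in the post; the variable names are just labels for this illustration.

```python
balance = 100_000.00            # amount exchanged into the balanced fund
expense_ratio_saving = 0.0010   # 0.10% lower expense ratio

# Initial yearly saving from the lower expense ratio: about $100.
annual_saving = balance * expense_ratio_saving

loss_admiral = 34_471.00        # loss during the crash in the cheaper share class
loss_investor = 34_531.80       # loss in the more expensive share class

# Reduction in the loss thanks to the cheaper share class: about $60.80.
difference = loss_investor - loss_admiral

# The same difference as a fraction of the loss itself.
share_of_loss = difference / loss_investor
```

The last line puts the two numbers on the same scale: the share-class choice changed the outcome by well under 1% of what the market movement did.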

tfb
Posts: 8142
Joined: Mon Feb 19, 2007 5:46 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by tfb » Thu Aug 02, 2018 8:28 pm

nisiprius wrote:
Thu Aug 02, 2018 7:50 pm
Now, there's just no way around the personal factor. Some people would say that $60.80 is $60.80, whether it's in the context of an extra gain or a reduced loss. To me, I will say that I don't think that $60 either way matters very much to me, in the face of a $30,000 variation.
I would also say a difference of $60 doesn't matter to me, not because the market just made me lose $30,000 but because it represents less than 0.1% of the $65k end value. Even if over 500 simulations I see a 100% win-rate with a difference in end value between +0.06% and +0.12% I would still put it in the "it doesn't matter" pile. On the other hand, if another choice shows an 80% win-rate with a range of -2% to +20%, I may be interested. Some portfolio pairs in your series may be of this latter type but the difference is hidden in the large cloud of uncertain outcomes. I think wanting some fund choices to influence the range of outcomes in the face of market uncertainty is just asking too much. Do people really think they can do that?

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Fri Aug 03, 2018 9:32 pm

Nisi, I checked your other posts (Part 2 to Part 8), and I continue to have severe doubts about the methodology. Sorry! :|

It seems to me that the bottom line is really simple. If the green cross and the orange cross are close to each other, then the corresponding scatters are closely meshed together, and we tend to infer that it doesn't matter. If the green cross and the orange cross are NOT close to each other, then the corresponding scatters are NOT closely meshed together, and we tend to infer that it matters.

In other words, I fear that those scatter graphs, although they look really cool and all, do not provide much more information than the two single primary data points. Which are, as previously discussed, highly dependent on the start/stop dates. I am ready to bet that just by playing a bit on the start/stop dates, we could very easily end up with "it matters" or "it doesn't matter" inferences, using the same data set.

I continue to think that your original "scatter vs. matter" goal would be much better served by varying the start/end date and solely using historical data, instead of keeping the start/end date constant and using this strange Monte-Carlo play on month N vs. month N+1.

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Sat Aug 04, 2018 7:36 am

Siamond, I perceive that you are raising three different issues here. I intend to write about them in some way in a while. No time for details right now. And note that I say "write about them," not "answer them."

1) The chart doesn't show the right thing from the point of view of "competitive investing." In a sense this is both the most serious and the least serious of the three. Consider this chart from part 6 of my series, comparing S&P 500 index funds with low and high expense ratios.
Image
The visual impression is that the difference is small compared with the range of scatter. This is correct and accurate. However, it is also true that the difference is as close to absolute certainty as anything is in investing. In the simulations, assuming we pair identical random choices for the two funds, the low-cost fund will beat the high-cost fund 100% of the time. It has a 100% win percentage. An investor who viewed the chart believing that the two scatter plots were independent of each other might believe that the low cost fund only won, oh, I don't know--don't have the number--52% of the time. So a viewer could mistakenly infer something misleading, and that's why this is serious. But it's also the easiest to explain and justify, because I don't care about competitive investing.

2) You are concerned with endpoint sensitivity. That is, the relative position of the two swarms, and their centers, on the Y axis, is simply the return of the two portfolios over one specific time period. It happens to be "all available data in PortfolioVisualizer," and furthermore "all available data choosing mutual funds covering the longest possible period," but still one time period. Therefore, the concern as I understand it is that two portfolios that "really" have a difference that matters might falsely look almost identical because by chance the time period happened to be a time period over which the returns of the portfolios were tied. I'll be discussing that in a later post. The gist is that I'm not set up to do that in any big or comprehensive way, but a couple of "reality checks" suggest that the general pictures are reasonably stable if I take a plot, break it into two equal periods of time, and chart the two half-periods separately. You can see a preview of this, in effect, in part 7. An earlier posting gave the impression that a choice of bond funds in a 60/40 doesn't matter very much. In part 7, instead of using all available data, I compared two bond funds--Treasuries and corporates--over a "cherry-picked" very short period of time, namely 2007-2009 inclusive, that is to say the time period when you would expect the differences to be biggest (yes) and matter the most.

3) You suggest that it would be better and possible to replace my artificial Monte Carlo methodology with actual display of historical data over a set of randomly chosen endpoints (or something like that). My reply is that this gets us into familiar problem territory. In the post I'm working on explaining my methodology, I show a plot based on 26-year periods of time since 1871, $100/month contributions, of the S&P 500 and predecessors (Shiller's data) to show that the amount of variation in endpoints over 26-year periods is similar to, and if anything greater than, that for VFINX over a 26-year period with my methodology. But the problem with any history-based analysis is that there isn't enough history and there are real concerns about fundamental changes over the time period. There are, for example, fewer than six non-overlapping 26-year periods since 1871. All data derived from rolling periods is pretty suspect. The situation is bad enough for "stocks" (Cowles data back to 1871) or "CRSP data" (back to 1926 but mostly not available at low cost). The bigger problem is that you just don't have real-world, running-real-money data--i.e. mutual funds--going very far back at all for all of the questions most of us are interested in.
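The arithmetic behind "fewer than six" in point 3, and the reason rolling periods add so little independent information, can be checked in a few lines. The end year of 2018 and the one-month step between rolling windows are assumptions made for the illustration.

```python
# Assumed span: 1871 through the date of this thread (2018).
first_year, last_year, window_years = 1871, 2018, 26

span_years = last_year - first_year                    # years of history available
windows_available = span_years / window_years          # fractional count of windows
full_windows = span_years // window_years              # complete non-overlapping windows

# Two consecutive rolling windows that start one month apart
# share all but one of their months.
window_months = window_years * 12
shared_fraction = (window_months - 1) / window_months

print(f"{span_years} years of data, {windows_available:.2f} windows of {window_years} years")
print(f"{full_windows} complete non-overlapping windows")
print(f"adjacent rolling windows share {shared_fraction:.1%} of their months")
```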

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Sat Aug 04, 2018 8:49 am

nisiprius wrote:
Sat Aug 04, 2018 7:36 am
Siamond, I perceive that you are raising three different issues here. I intend to write about them in some way in a while. No time for details right now.
Thank you for your response. Just let me reply briefly, and hopefully this will provide some inputs for your full write-up.

Issue#1: I actually didn't raise it, but it is very valid. Overlapping scatter clouds can hide (if not downright obfuscate) a truly meaningful difference.

Issue#2: I am a little stunned that this doesn't bother you more than that. The example of domestic vs. international is full of misguided thinking where people justify themselves based on cherry-picked data. Scatter or not scatter, the issue remains in full force. And I suspect it is compounded by issue #4 (see below).

Issue #3: Yes, of course, I am well aware of the difficulties of using limited historical data. Using monthly returns as a (varying) starting point can provide more diversity though, even if the data is strongly overlapping (which is the case in your MC simulation too - no choice). In addition, using index returns could help stay reasonably real-life while improving time spans (and also consistency). Still, this is very imperfect for sure. I'll probably give it a try myself, it really shouldn't be hard to do, and you got me curious about the outcome! :wink:

But... my last post was raising a somewhat different issue. I have a strong suspicion that the shape of the scatter clouds is more a function of the modeling choice itself (month N/N+1) than something intrinsic to the dynamics of the asset classes being studied. Case in point, the cross is always at the center of the cloud, isn't it? You also seem to believe that the general shape of the clouds isn't that dependent on the time period, which makes little sense to me in the real world, making me question the root cause of such observation. Maybe I am biased (admittedly, the sheer MC concept goes under the skin of my brain!), but it seems to me that your posts #2 to #8 appear to corroborate my doubt, the scatter clouds really don't seem to provide true value-added per se. Hence issue #4: the scatter representation may not add much more *new* information to the simple information conveyed by the primary data point they are centered on.

And then the sophistication of the representation may become its own downfall (casual readers, impressed by a fancy graph, will probably make inferences which are not based on much reality).

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Sat Aug 04, 2018 6:57 pm

siamond wrote:
Sat Aug 04, 2018 8:49 am
...And then the sophistication of the representation may become its own downfall (casual readers, impressed by a fancy graph, will probably make inferences which are not based on much reality)...
Yes, I worry about that.

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Sun Aug 05, 2018 12:40 am

nisiprius wrote:
Tue Jul 31, 2018 3:58 pm
[*]The performance is not based on the growth of a single investment in the fund. It is, instead, based on the outcome of making periodic fund purchases of $100/month over the time period shown. It thus is closer to the way most of us actually invest.
Are the $100 being added a constant (nominal) quantity, or adjusted for inflation? I assume the latter?

PS. I am pursuing the little project I mentioned, building scatter charts based on historical returns, while varying the start date. I think I have it working, but I need to run a few sanity checks first...

MathWizard
Posts: 3573
Joined: Tue Jul 26, 2011 1:35 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by MathWizard » Sun Aug 05, 2018 12:48 am

I like the TSM better than just the SP 500 because it captures the whole market. With similar ER, it is more diversified.

It's not that it gives a better return, it is that it reduces risk.

warner25
Posts: 391
Joined: Wed Oct 29, 2014 4:38 pm

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by warner25 » Sun Aug 05, 2018 1:37 pm

nisiprius, I just read the whole series and enjoyed it. I always enjoy the intellectual curiosity and honesty that you bring. I know you're trying to avoid concrete conclusions, but my takeaway was simply a reminder that there's a clear big-picture hierarchy of what really "matters." Would you agree with that?

#1 Having the means and discipline to actually save and invest
#2 An order-of-magnitude less important than #1 is choosing an appropriate/comfortable exposure to major stock indices
#3 Another order-of-magnitude less important than #2 is choosing the lowest-cost funds
#4 Another order-of-magnitude less important than #3 is everything else

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Sun Aug 05, 2018 6:13 pm

Ok, I assembled my own 'matter/scatter' spreadsheet, based on monthly total returns of well-known indices. I used inflation-adjusted returns, and $100 incremental investments added every month. The X and Y axes are defined like the OP did (I think!). The varying factor is different though: this is a play on the starting date of a 20-year period (each point being a different starting month).
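For readers who prefer code to spreadsheets, the rolling-start calculation can be sketched as below. This is not the spreadsheet itself: the function names are made up, the returns are assumed to be already inflation-adjusted, and contributing at the start of each month is an assumption.

```python
def end_value(monthly_real_returns, contribution=100.0):
    """Final value of a fixed monthly contribution.

    Assumes each contribution is made at the start of the month and
    then earns that month's (real) return.
    """
    value = 0.0
    for r in monthly_real_returns:
        value = (value + contribution) * (1.0 + r)
    return value

def rolling_end_values(monthly_real_returns, window=240, contribution=100.0):
    """One end value per possible starting month of a `window`-month period."""
    last_start = len(monthly_real_returns) - window
    return [
        end_value(monthly_real_returns[start:start + window], contribution)
        for start in range(last_start + 1)
    ]

if __name__ == "__main__":
    # Placeholder data: a flat 0.4% real return per month for 30 years.
    returns = [0.004] * 360
    values = rolling_end_values(returns)
    print(len(values), round(values[0], 2))
```

Each element of the result corresponds to one point on the Y axis of a chart like the ones that follow, one per starting month.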

Here is the scatter chart, where I purposefully ignored the oil crisis. I think this chart essentially conveys a "doesn't matter" message. Click to see a bigger version of the chart.

Image

Here is a similar scatter chart where I included all the historical data I have, starting in 1971. If you pay close attention, the previous chart is a subset of this one (for obvious reasons!). And yet, when looking at it, it conveys a slightly different message, doesn't it?

Image

And to finish, I purposefully ignored the recent financial crisis. Not quite the same shape as the first one, right?

Image

I don't know what to conclude here, besides the obvious Nisiprius-ism that the start/end dates really really matter. Besides that, those are pretty graphs, but this doesn't enlighten me very much, I have to say. I'll keep playing with a few other data series, just out of curiosity...

PS. Comparing those three charts made me realize that the points corresponding to the financial crisis looked like a tornado... Quite true!

PPS. To keep my spreadsheet reasonably compact, efficient and general-purpose, I used the 'gummy magic' formulas from this post, combined with the use of a harmonic mean to compute gMS. For the math/spreadsheet-inclined, this is VERY cool.

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Sun Aug 05, 2018 8:24 pm

At the risk of side-tracking the thread... I made another representation of the very same data. I find it much more informative than a scatter chart, to be honest. I think I'm going to add something like that to the Simba backtesting spreadsheet, allowing one to parameterize the initial portfolio value and periodic fixed additions to (or withdrawals from) the portfolio.

Image

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Sun Aug 05, 2018 9:15 pm

Those are neat, Siamond. I have a chart that is part of my "methodology" discussion that I'm working up to posting that has growth-of-$100/month-contributions over overlapping 26-year periods from 1871 to 2017 for one asset alone--stocks as represented by Shiller's data. The problem of course is that you can't find data for pairs of interest that go back that far.

It raises another issue, which you can see in your chart (and mine when I post it), which is the "filamentous" appearance of the plot due to the use of overlapping time periods. In effect you only have a tiny number of "different" points which are being gracefully connected by curvy dotted lines (produced by each time period duplicating most of the data in the previous overlapping period).

If you use historic data alone, then you encounter problems of a) not enough, b) might not be comparable over the whole time period, and c) either has a tiny number of independent points or else the points get connected by filaments which are not much more than graceful interpolation between those points.

If you use something like my Monte Carlo data, then the problem you have is that it's fictitious data... and no assurance that what is really happening in real life is a cumulative random walk around the historic line.

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Sun Aug 05, 2018 11:00 pm

nisiprius wrote:
Sun Aug 05, 2018 9:15 pm
It raises another issue, which you can see in your chart (and mine when I post it), which is the "filamentous" appearance of the plot due to the use of overlapping time periods. In effect you only have a tiny number of "different" points which are being gracefully connected by curvy dotted lines (produced by each time period duplicating most of the data in the previous overlapping period).
I would largely agree with your overall assessment, but... I always found the 'tiny number' argument greatly exaggerated. Some people would assert that in my previous charts, there are fewer than 3 data points (i.e. 3 fully independent 20-year periods). This makes very little practical sense. Check my last graph. What a difference just a few years between starting points can make on the outcome. What a difference just one year between starting points can make on the outcome. If you squint hard enough, you'll see that in a few cases, a difference of one month made a rather significant difference, which is rather stunning. So yeah, the data is mostly overlapping and this has to be acknowledged, but the fact is there are way more than 3 squarely distinct scenarios in there, and a lot to learn from.

This being said, yes, for sure, there isn't enough historical data available, and even less publicly available (which is an absolute shame in itself). And yes, it is somewhat debatable to compare an ever-changing present with a distant past (or with another sample, e.g. from another country). Nevertheless, I continue to not see much value in trying to patch up the situation with artificial models à la Monte Carlo. It seems to me that they only compound the issues we have with historical data (creating more 'compare apple to orange' and 'filamentous' situations), while introducing unrealistic quirks due to the use of overly simplified equations. And then we just can't know if any observation about the results has a root cause stemming from the data itself (i.e. real life), or from the artificiality of the model. Bottom line, I just can't bring myself to trust any outcome of such models.

Still, thank you for a stimulating discussion. I learned a few things in the process. And will use some of it in the next Simba update.

grayfox
Posts: 5184
Joined: Sat Sep 15, 2007 4:30 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by grayfox » Mon Aug 06, 2018 9:22 am

siamond wrote:
Sun Aug 05, 2018 8:24 pm
At the risk of side-tracking the thread... I made another representation of the very same data. I find it much more informative than a scatter chart, to be honest. I think I'm going to add something like that to the Simba backtesting spreadsheet, allowing one to parameterize the initial portfolio value and periodic fixed additions to (or withdrawals from) the portfolio.

Image
That is a very interesting graph. What impresses me is the wide variation in outcomes: about $25,000 to $130,000. 5:1 range.

Also, what I think drives the wide variation is not so much the starting point, but the end point.

Behold:
The highest points are around 1979-1980. CAPE was about 9. Twenty years later was 1999-2000, the peak of the tech bubble when CAPE was over 40. So you end on a bubble peak, you have high outcome.
And the lowest point is 3/1989 when CAPE was 15.30, close to average. Twenty years later is 3/2009, the low of the Great Recession and CAPE was 13.32.

Much of the variation for 20 year periods is due to change in valuation.
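As a rough gauge of how much a change in valuation alone contributes, the CAPE figures above can be turned into an annualized rate. This is a back-of-envelope sketch: it uses 40 as a stand-in for "over 40", the function name is made up, and it applies to a single lump sum held for the full 20 years, whereas each monthly contribution is exposed to a different starting valuation.

```python
def annualized_valuation_change(cape_start, cape_end, years):
    """Annual return contribution from the change in CAPE alone.

    Ignores earnings growth and dividends; it isolates the part of the
    price change that comes from the multiple expanding or contracting.
    """
    return (cape_end / cape_start) ** (1.0 / years) - 1.0

# 1979-1980 to 1999-2000: CAPE from about 9 to (at least) 40.
boom = annualized_valuation_change(9.0, 40.0, 20)

# 3/1989 to 3/2009: CAPE from 15.30 to 13.32.
bust = annualized_valuation_change(15.30, 13.32, 20)

print(f"multiple expansion, 1979 start: {boom:+.2%} per year")
print(f"multiple contraction, 1989 start: {bust:+.2%} per year")
```

For the period ending in 3/2009, most of the money was contributed at valuations well above the starting one, which this lump-sum figure does not capture.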

:idea: If we knew CAPE 20 years out, we would rule the galaxy!

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Mon Aug 06, 2018 12:01 pm

grayfox wrote:
Mon Aug 06, 2018 9:22 am
Also, what I think drives the wide variation is not so much the starting point, but the end point. [...] If we knew CAPE 20 years out, we would rule the galaxy!
Well, I agree and disagree... You're absolutely right that the valuation level at the end point has a lot of bearing on the final value. But I would argue this is somewhat fuzzy money. The month after the end point, there might be a sudden drop, the portfolio value drops accordingly, but the number of shares didn't actually change. Low valuations at the starting point, by contrast, help the accumulator buy a lot of early shares at a low price and reap long-term benefits. Overall, I think we obsess way too much about instant portfolio value and its ups and downs. What truly matters is the stream of income generated over decades of happy retirement.
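The share-count point can be made concrete with a toy example. The prices below are invented, dividends are ignored, and the function name is hypothetical. With a fixed dollar amount invested each period, the average cost per share works out to the harmonic mean of the purchase prices, so cheap months count for more.

```python
from statistics import harmonic_mean

def accumulate(prices, contribution=100.0):
    """Shares bought and average cost per share for fixed periodic purchases."""
    shares = sum(contribution / price for price in prices)
    avg_cost = contribution * len(prices) / shares
    return shares, avg_cost

# Invented price path: cheap early, expensive late.
prices = [50.0, 40.0, 80.0, 100.0]
shares, avg_cost = accumulate(prices)

value_at_end = shares * prices[-1]
value_after_drop = shares * prices[-1] * 0.8   # a 20% drop the month after

print(f"shares: {shares:.2f}, average cost: {avg_cost:.2f}")
print(f"harmonic mean of prices: {harmonic_mean(prices):.2f}")
print(f"value at end: {value_at_end:.2f}, after a 20% drop: {value_after_drop:.2f}")
```

The drop changes the portfolio value but not the number of shares held, which is the distinction being drawn above.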

Still, I do like this representation. Many thanks to Nisiprius and his interesting idea of using the portfolio end value on the Y axis.

Topic Author
nisiprius
Advisory Board
Posts: 39283
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by nisiprius » Mon Aug 06, 2018 12:34 pm

I think I'll post this one now. It doesn't make a comparison. If you want to go back to 1871, all you have is stocks. (Actually that's not right because the Cowles data has, basically, all kinds of detailed sector breakdowns but that doesn't seem to have been exploited too much...) Anyway. I did this mostly to make sure that the spreads I was seeing in my Monte Carlo simulations were not crazily high. These are overlapping 26-year periods, and represent the final value of $100/month investments over the 26-year period. I used the Shiller data and calculated monthly total return (including dividends).

Image

Anyway, the data looked really weird to me, which led to my personal discovery of something I really should have known about, which I call the volatility explosion of 1933. That's what that blob of points over at the right is, and it all becomes a little clearer if we literally connect the dots so that we can see what the year sequence is:

Image

I mentioned the "filamentous" nature of Siamond's dots. I actually had a similar chart of rolling overlapping time periods with single month intervals, and I'm sorry to say I threw it out because I really didn't like those stringy filaments.

No real conclusions, just... I thought... another interesting picture. It was a unique event in the time period since 1871. It shows up in any time period that spans the time around 1933... and doesn't show up if it doesn't.

To speak to one of Siamond's points, in a sense there is just no way to get away from endpoint dependence. In my charts, the scatter is--by construction--scattered around the historical averages for a specific time period. The scatter is produced from the cumulative effect of fictitious chance departures from what happened historically. The thing I like about my own method is that those chance departures are actually the real numbers from an adjacent month so I'm not, you know, applying standard deviations from one period of history to another period in history. But, yes, the difference, the difference that may or may not matter, is the specific difference for one specific time period. There's a possibility that a difference that might matter over, say, ten-year periods, happens to have cancelled out perfectly by chance over the full time period, so that a difference that ought to matter looks as if it doesn't matter... etc. etc.
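The month N / month N+1 idea can be sketched roughly as follows. This is an illustration only: the exact rule used for the charts is not spelled out in this thread, so the 50/50 choice between a month's own return and the following month's return, the start-of-month contribution timing, and the function name are all assumptions.

```python
import random

def simulate_end_value(historical_returns, contribution=100.0, seed=None):
    """One simulated end value built from real returns of adjacent months.

    For each month, use either that month's actual return or the next
    month's actual return, chosen at random (ASSUMED rule). The result
    wanders around the historical outcome without borrowing returns
    from a distant part of history.
    """
    rng = random.Random(seed)
    n = len(historical_returns)
    value = 0.0
    for i in range(n):
        use_next = i + 1 < n and rng.random() < 0.5
        r = historical_returns[i + 1] if use_next else historical_returns[i]
        value = (value + contribution) * (1.0 + r)
    return value

if __name__ == "__main__":
    # Placeholder returns; real use would load a fund's monthly total returns.
    sample = [0.02, -0.01, 0.03, -0.04, 0.01, 0.00] * 20
    outcomes = [simulate_end_value(sample, seed=s) for s in range(500)]
    print(min(outcomes), max(outcomes))
```

Feeding the same seed to two funds' return series would pair the random choices, so that both funds depart from history in the same months.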

siamond
Posts: 4956
Joined: Mon May 28, 2012 5:50 am

Re: How much does it matter, considering scatter? Part 1 of many: Introduction

Post by siamond » Mon Aug 06, 2018 1:22 pm

nisiprius wrote:
Mon Aug 06, 2018 12:34 pm
If you want to go back to 1871, all you have is stocks.
Turns out that monthly historical interest rates are also available. And with a bit of work, you can infer a fairly reasonable model for bonds total-returns. Actually, no need to do any work, just PM AlohaJoe as he developed such a tool (derived from the bond fund model provided by longinvest) and has all the numbers readily available. Yeah, it's a model, it's not reality, but comparing with actuals when known, this proved really quite close, as bonds are much more deterministic than stocks.
