nisiprius wrote: ↑Sat Jun 08, 2024 7:44 pm
...Variance drag assigns a dollars-and-cents cost of variance. If it doesn't exist, given two investments with equal return, are there any non-psychological reasons to prefer the one with lower variance? If so, what are they?...

Absolutely. Given two portfolios with equal expected returns, if one portfolio has lower variance, it is in some sense more likely to achieve its expected return.

That isn't clear to me. In what "sense" is that true? Are you saying something about skewed distributions, medians, and means here?

If you plan to invest for 20 years, but find after a year that you may need some of the money much sooner, you may choose to reallocate to a lower variance portfolio to reduce uncertainty about the value of the investment when you need some of the asset. I'm pretty sure you would not be simultaneously trying to increase expected return.

Fundamentally, the claim that if two portfolios have the same expected return but one has greater volatility, then the more volatile one has lower return, is a logical contradiction. If the return is so impacted, then the premise that they have the same expected return is contradicted.

Variance drain is not even a candidate for a valid concept until the following question I posed upthread is answered.

If we have a random sample of years, and a return data point for a portfolio for each of the years, what is the interpretation in terms of investment performance of the geometric mean of the sample data points?

Northern Flicker wrote: ↑Sat Jun 08, 2024 8:54 pm
Fundamentally, the claim that if two portfolios have the same expected return but one has greater volatility, then the more volatile one has lower return, is a logical contradiction. If the return is so impacted, then the premise that they have the same expected return is contradicted.

That is my conclusion also. The problem is how it comes about in the real world that two portfolios with the same arithmetic average return have volatilities different enough that the CAGRs are significantly different. It is easy to create demonstrations with made-up numbers, but finding actionable examples for real portfolios where the issue matters is another thing.

An advisor lying about someone's portfolio by reporting the arithmetic average return they have gotten but calling it the CAGR is a different issue. But even then it is only for high-returning, high-risk portfolios that it makes much difference in actual numbers.
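It is indeed easy to demonstrate with made-up numbers; a minimal Python sketch (all return figures here are invented purely for illustration):

```python
# Two hypothetical return streams with the same arithmetic average
# return but different volatility; made-up numbers for illustration.

def arithmetic_mean(returns):
    """Plain average of a list of annual returns."""
    return sum(returns) / len(returns)

def cagr(returns):
    """Geometric average (CAGR) implied by a list of annual returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(returns)) - 1.0

steady   = [0.07, 0.07, 0.07, 0.07]    # 7% every year
volatile = [0.50, -0.36, 0.50, -0.36]  # arithmetic mean is also 7%

print(arithmetic_mean(steady), arithmetic_mean(volatile))  # both 0.07
print(cagr(steady))    # 0.07 -- no gap when there is no volatility
print(cagr(volatile))  # about -0.02 -- well below the arithmetic mean
```

The gap is dramatic here only because the made-up volatility is extreme; for realistic portfolios the numbers are far closer, which is the "actionable examples" problem.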

nisiprius wrote: ↑Sat Jun 08, 2024 7:02 pm
I don't believe there's enough consensus on this to justify doing anything much about the wiki. This is as much a terminology debate as anything else. A web search convinces me that there is a family of similar terms, all referring to differences between arithmetic and geometric means. These terms are in very widespread use, with the predominant opinion being that it's a legitimate concept. There are also demurrals, but no clear consensus.

I agree it is a terminology discussion. Given a set of data points a person can calculate all sorts of different descriptive statistics, some of which vary from others in a systematic way. That is just inherent in the math involved and therefore legitimate. However . . .

I don't like the article in its present form; it presents a baldly stated discussion of a nuance, without context, as if the issue were a fundamental of investing, when in fact it has little practical importance.

Of course you can also argue that nobody actually reads that article anyway except when a bunch of investment nerds start asking questions about it on a forum.

Northern Flicker wrote: ↑Sat Jun 08, 2024 8:54 pm
Fundamentally, the claim that if two portfolios have the same expected return but one has greater volatility, then the more volatile one has lower return, is a logical contradiction. If the return is so impacted, then the premise that they have the same expected return is contradicted.

That is my conclusion also. The problem is how it comes about in the real world that two portfolios with the same arithmetic average return have volatilities different enough that the CAGRs are significantly different. It is easy to create demonstrations with made-up numbers, but finding actionable examples for real portfolios where the issue matters is another thing.

An advisor lying about someone's portfolio by reporting the arithmetic average return they have gotten but calling it the CAGR is a different issue. But even then it is only for high-returning, high-risk portfolios that it makes much difference in actual numbers.

Right. Generally investors will require a higher expected return to compensate for higher volatility.

Yep, I totally agree. You are right on. But it does take a small measure of mathematical sophistication to go from a basic set of definitions to generalized ones: models that are exponential rather than linear, generalized concepts of average, and lots of other possibilities. Along the way there is a huge loss of audience just in taking that first step.

rule of law guy wrote: ↑Sun Jun 09, 2024 10:31 am
my takeaway? track $ not % gains/losses

The only data I have kept over time is portfolio value and, for rebalancing purposes, the value of distinct assets. I don't have data about return anywhere in anything I actually look at.

As long as one understands what they are doing, calculations can be made in millinepers, percent, hexadecimal, or Greek numerals.
Problems arise when somebody tries to use formulas they don't understand.

We can make the argument simpler. The variance drain argument is not even an argument about investment performance; it is an argument about probability distributions in general. It contradicts the law of large numbers, which I invoked earlier when describing how a sample mean converges to the distribution mean in the limit.

If you flip a coin repeatedly, the cumulative number of heads you get will tend toward 50% of the total as the number of flips increases. That is a demonstration of the law of large numbers. Variance does not lead to getting fewer than 50% heads.
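The coin-flip demonstration is easy to simulate; a minimal sketch (flip counts and seed chosen arbitrarily):

```python
import random

def heads_fraction(n_flips, seed=0):
    """Fraction of heads observed in n_flips tosses of a fair coin."""
    rng = random.Random(seed)
    heads = sum(rng.randint(0, 1) for _ in range(n_flips))
    return heads / n_flips

# The fraction of heads tends toward 0.5 as the number of flips grows,
# even though the raw count of heads keeps fluctuating.
for n in (100, 10_000, 1_000_000):
    print(n, heads_fraction(n))
```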

If the distribution of return of a portfolio is not changing over time, the same applies to the future investment returns of the portfolio. (The claims of variance drain are made for an unchanging distribution, so we can leave distribution changes out of the analysis.) What is true about variance and the law of large numbers is that if the distribution has higher variance, it will on average take more trials to converge to within a certain range of the mean than if it has lower variance.

This property is very salient to investing. If a portfolio has a higher variance, we need a longer horizon to be reasonably confident in receiving its expected return than if the portfolio has a lower variance. This is why we care about volatility.
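That "more trials to converge" property can itself be illustrated by simulation; a sketch comparing two invented return distributions that share a 5% mean (all parameters are made up for illustration):

```python
import random

def mean_abs_error(sigma, horizon, n_paths=2000, mu=0.05, seed=1):
    """Average distance between the sample-mean return and mu after
    `horizon` periods, for i.i.d. normal period returns."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(horizon)) / horizon
        total += abs(sample_mean - mu)
    return total / n_paths

# Same 5% mean; the higher-sigma distribution needs a longer horizon
# to get equally close to it.
for sigma in (0.05, 0.20):
    print(sigma, [round(mean_abs_error(sigma, h), 4) for h in (5, 20, 80)])
```

Note that the sample mean still converges to 5% in both cases; higher variance only slows the convergence, it does not bend it toward a lower value.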

But if higher variance were a drain on return, there would be no mechanism for the law of large numbers to apply. Variance drain would limit the ability of a sequence of trials ever to converge on the expected return, and it would worsen with a longer horizon, not improve.

By the way, statistician Karl Pearson once responded to naysayers about the law of large numbers by tossing a coin 24,000 times and recording 12,012 heads.

Last edited by Northern Flicker on Sun Jun 09, 2024 4:10 pm, edited 3 times in total.

Northern Flicker wrote: ↑Sun Jun 09, 2024 1:05 pm
By the way, statistician Karl Pearson once responded to naysayers about the law of large numbers by tossing a coin 24,000 times and recording 12,012 heads.

rule of law guy wrote: ↑Sun Jun 09, 2024 10:31 am
my takeaway? track $ not % gains/losses

The only data I have kept over time is portfolio value and, for rebalancing purposes, the value of distinct assets. I don't have data about return anywhere in anything I actually look at.

%s are for comparing yourself to others...since it is indelicate to talk to others at a cocktail party about your $s. but $s are what you spend, so $s are the prize to keep your eyes on

Northern Flicker wrote: ↑Sun Jun 09, 2024 1:05 pm
By the way, statistician Karl Pearson once responded to naysayers about the law of large numbers by tossing a coin 24,000 times and recording 12,012 heads.

That was extremely lucky!

If you were to repeat the experiment a million times, what percentage of the time would you get between 11988 and 12012 heads?

Northern Flicker wrote: ↑Sun Jun 09, 2024 1:05 pm
By the way, statistician Karl Pearson once responded to naysayers about the law of large numbers by tossing a coin 24,000 times and recording 12,012 heads.

That was extremely lucky!

If you were to repeat the experiment a million times, what percentage of the time would you get between 11988 and 12012 heads?

The probability of getting exactly i heads in 24,000 coin tosses is {24000 choose i}/2^24000. Here {n choose k} is the binomial coefficient, which is n!/(k!(n-k)!). It counts the number of ways to choose k objects out of n distinct objects. So the probability of getting between 11988 and 12012 heads, inclusive, is sum({24000 choose i}/2^24000, i=11988..12012). Wolfram Alpha says it is approximately 0.128. Changing the range to 11900 through 12100 gives a probability of approximately 0.806.

Edit: Typo.

Last edited by student on Mon Jun 10, 2024 3:32 pm, edited 1 time in total.

Northern Flicker wrote: ↑Sun Jun 09, 2024 1:05 pm
By the way, statistician Karl Pearson once responded to naysayers about the law of large numbers by tossing a coin 24,000 times and recording 12,012 heads.

That was extremely lucky!

If you were to repeat the experiment a million times, what percentage of the time would you get between 11988 and 12012 heads?

The probability of getting exactly i heads in 24,000 coin tosses is {24000 choose i}/2^24000. Here {n choose k} is the binomial coefficient, which is n!/(k!(n-k)!). It counts the number of ways to choose k objects out of n distinct objects. So the probability of getting between 11988 and 12012 heads, inclusive, is sum({24000 choose i}/2^24000, i=11988..12012). Wolfram Alpha says it is approximately 0.128. Changing the range to 11900 through 12100 gives a probability of approximately 0.806.

Northern Flicker wrote: ↑Sun Jun 09, 2024 1:05 pm
By the way, statistician Karl Pearson once responded to naysayers about the law of large numbers by tossing a coin 24,000 times and recording 12,012 heads.

That was extremely lucky!

If you were to repeat the experiment a million times, what percentage of the time would you get between 11988 and 12012 heads?

The probability of getting exactly i heads in 24,000 coin tosses is {24000 choose i}/2^24000. Here {n choose k} is the binomial coefficient, which is n!/(k!(n-k)!). It counts the number of ways to choose k objects out of n distinct objects. So the probability of getting between 11988 and 12012 heads, inclusive, is sum({24000 choose i}/2^24000, i=11988..12012). Wolfram Alpha says it is approximately 0.128. Changing the range to 11900 through 12100 gives a probability of approximately 0.806.

Edit: Typo.

In excel BINOM.DIST.RANGE(24000,0.5,11988,12012) returns 0.128. Just another way to get to the same place.
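And a third route, exact rather than approximate, using only the Python standard library:

```python
from fractions import Fraction
from math import comb

def binom_range_prob(n, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, 1/2).

    comb(n, i) counts the favorable sequences of tosses; dividing by
    2^n (via Fraction, to avoid float overflow on the huge integers)
    gives the probability.
    """
    favorable = sum(comb(n, i) for i in range(lo, hi + 1))
    return float(Fraction(favorable, 2 ** n))

print(binom_range_prob(24000, 11988, 12012))  # approximately 0.128
print(binom_range_prob(24000, 11900, 12100))  # approximately 0.806
```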

McQ wrote: ↑Thu Jun 06, 2024 10:09 pm
Alas, it appears that the late Harry Markowitz went to his grave not understanding the arguments made by forum member Northern Flicker, among others here in the thread.

Why Inputs to a Mean-Variance Analysis Must Be Arithmetic Means

That’s the section title.* He continues:

“The inputs to a mean-variance optimizer must be the (estimated forthcoming) expected (that is, arithmetic mean) returns rather than the (estimated forthcoming) geometric mean returns.”

That view is 100% aligned with mine.

Actually, with further thought, I don't fully agree with that position. The problem is that a sample of non-random and correlated data points does not satisfy the preconditions for using the central limit theorem as a theoretical underpinning for parameter estimation, so there is no reason based on standard statistical methods to assume that the arithmetic sample mean is a decent estimate of the distribution mean. (There are more sophisticated methods for dealing with non-random and biased samples). Given the flaws in the experimental design, standard statistical methods don't give us a basis for saying whether the arithmetic mean or geometric mean is a better estimate of the distribution mean, aka expected return.

Acknowledging that saves us from drawing, from a flawed experiment, the flawed inference that probability distributions in general have a property that contradicts the law of large numbers.

Note that there are some distributions that do not meet the conditions of the law of large numbers, such as those whose mean is infinite. I would love to have an investment with an infinite expected return, but I don't think I ever will. And there are some proofs of the law that require the first 3 or 4 moments to exist, not just the mean. I believe the law just requires the distribution mean and variance to be finite, and the sample to be a random sample of independent trials.
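For the curious, here is a quick sketch of what a failing law of large numbers looks like, using a Pareto distribution with shape parameter below 1, which has an infinite mean (the shape parameter and seeds are arbitrary choices for illustration):

```python
import random

def pareto_sample_mean(n, alpha=0.8, seed=3):
    """Sample mean of n draws from a Pareto distribution with shape
    alpha; for alpha <= 1 the distribution mean is infinite."""
    rng = random.Random(seed)
    return sum(rng.paretovariate(alpha) for _ in range(n)) / n

# Unlike the coin-flip case, these sample means never settle down:
# rare, enormous draws keep jerking the running average around.
for seed in (1, 2, 3):
    print([round(pareto_sample_mean(n, seed=seed), 1) for n in (100, 10_000)])
```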

McQ wrote: ↑Thu Jun 06, 2024 10:09 pm
Alas, it appears that the late Harry Markowitz went to his grave not understanding the arguments made by forum member Northern Flicker, among others here in the thread.

Why Inputs to a Mean-Variance Analysis Must Be Arithmetic Means

That’s the section title.* He continues:

“The inputs to a mean-variance optimizer must be the (estimated forthcoming) expected (that is, arithmetic mean) returns rather than the (estimated forthcoming) geometric mean returns.”

That view is 100% aligned with mine.

Actually, with further thought, I don't fully agree with that position. The problem is that a sample of non-random and correlated data points does not satisfy the preconditions for using the central limit theorem as a theoretical underpinning for parameter estimation, so there is no reason based on standard statistical methods to assume that the arithmetic sample mean is a decent estimate of the distribution mean. (There are more sophisticated methods for dealing with non-random and biased samples). Given the flaws in the experimental design, standard statistical methods don't give us a basis for saying whether the arithmetic mean or geometric mean is a better estimate of the distribution mean, aka expected return.

Acknowledging that saves us from drawing, from a flawed experiment, the flawed inference that probability distributions in general have a property that contradicts the law of large numbers.

Note that there are some distributions that do not meet the conditions of the law of large numbers, such as those whose mean is infinite. I would love to have an investment with an infinite expected return, but I don't think I ever will. And there are some proofs of the law that require the first 3 or 4 moments to exist, not just the mean. I believe the law just requires the distribution mean and variance to be finite, and the sample to be a random sample of independent trials.

Do I understand correctly that the problem you are citing here is about the nature of the sampling, and not about the idea that with proper sampling the right estimator would be the arithmetic mean? The problems lie in such things as samples not being random and independent as we assume them to be. This originates from our collection of return data being points in a time-series process rather than repeated draws from a static distribution. It comes down to what statisticians distinguish as time-series data vs. cross-sectional data, I guess, or maybe panel data would better describe what we really want. In any case, mean-variance analysis appears to involve cross-sectional data, while people worrying about geometric average return are clearly trying to characterize a time series. Probably the best thing to do would be to fit a curve as a function of time to the data and extract parameters of the curve, such as the time constant for exponential growth of asset value, if we think that is a good model.

PS: I still think the only real issue being discussed here is the occasional practice of taking a bunch of returns over time, understanding that by the definition of return the CAGR properly reports the growth of the investment, but for some bizarre reason computing the arithmetic average of those numbers and treating the CAGR being less than the arithmetic mean as if we are somehow being cheated. It is the old silly comment that after a 50% loss you need a 100% return to get back to even, as if there is something amazing and maybe nefarious in that observation.

A historical backtest of N consecutive years is essentially a sample of size 1 of the N-year return. The return can be annualized as a CAGR. The annual-return view enables a calculation of volatility, though there is nothing sacred about calendar-year boundaries, and you will get different sample volatilities when you divvy up the N-year return in different ways.
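The partition-dependence is easy to see with made-up numbers; a sketch with twelve invented period growth factors for one fixed history:

```python
# Twelve made-up per-period growth factors for one fixed history.
factors = [1.10, 0.95, 1.20, 0.90, 1.08, 1.02,
           1.15, 0.88, 1.12, 1.04, 0.97, 1.09]

def total_growth(fs):
    """Overall growth factor over the whole history."""
    g = 1.0
    for f in fs:
        g *= f
    return g

def stdev(xs):
    """Sample standard deviation."""
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

single  = [f - 1.0 for f in factors]  # 12 one-period returns
doubled = [factors[i] * factors[i + 1] - 1.0
           for i in range(0, 12, 2)]  # the same history in 6 two-period returns

# Same overall growth (hence same CAGR) either way...
print(total_growth(factors), total_growth([r + 1.0 for r in doubled]))
# ...but different sample volatility, even after sqrt(2) annualization
# of the two-period figure.
print(stdev(single), stdev(doubled) / 2 ** 0.5)
```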

If a random sample of independent single-year returns were collected, then we would have the sample mean as an estimator, with all of the properties sampling theory says it has, including satisfying the law of large numbers. This assumes the distribution is not changing over time, which is questionable at best given the span of time that would be needed. Geometric means of a bunch of random annual returns could be computed, but would have no particular value.
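A sketch of that ideal case, using an invented two-point return distribution (+30% or -10% with equal probability, so the true expected return is exactly 10%):

```python
import math
import random

def sample_means(n_years, seed=4):
    """Arithmetic and geometric sample means of n_years i.i.d. annual
    returns drawn from the made-up two-point distribution above.
    The geometric mean is computed via logs to avoid overflow."""
    rng = random.Random(seed)
    returns = [0.30 if rng.random() < 0.5 else -0.10 for _ in range(n_years)]
    arith = sum(returns) / n_years
    mean_log = sum(math.log1p(r) for r in returns) / n_years
    geom = math.expm1(mean_log)
    return arith, geom

# With a genuinely random i.i.d. sample, the arithmetic mean converges
# to the 10% expected return, as the law of large numbers says it must.
# The geometric mean converges to sqrt(1.3 * 0.9) - 1, about 8.2% --
# the median compound growth rate, a different quantity, not a better
# estimate of the expected return.
print(sample_means(100_000))
```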