some background:

A few months ago, Nathan presented his free web tool for performing Monte Carlo sims on a position.

For the distribution of future price action, he uses a Normal Distribution.

During the presentation, Jim Riggio asked about substituting a different distribution, possibly one built from collected historic data.

The response was something like "it is not easy to do" (I forget the reasoning, and at the time did not give it adequate thought).

so:

Now I would like to revisit the topic of substituting a real distribution as a proxy for estimating future price action.

The theory: using a real distribution associated with the underlying should produce more accurate results than using a Normal Distribution. (Seems logical)

I do not have a strong statistics background, so would like feedback from those with statistical background on likely errors/assumptions in my pursuit.

I would like to produce a Cumulative Distribution Function (CDF) by binning historical returns, indexed by STD DEV, so that it can be applied to future time horizons: for every bin, I merely map the STD DEV value to the CDF bin.

My first concern is whether errors are introduced by using historic "daily returns" to map to longer time estimates (1 to 100 days).

I should be able to work out the CDF itself (I will probably just use the bin probabilities), and I do not yet anticipate difficulty there.
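To make the idea concrete, here is a minimal sketch of binning returns in STD DEV units, accumulating the bin probabilities into a CDF, and drawing from it. Everything named here (`binned_cdf`, `sample_z`, the made-up `placeholder_returns`, the 30-day horizon) is an assumption for illustration, not Nathan's tool or real SPX data. The last step scales a draw by the square root of the number of days, which is exactly the assumption being questioned above.

```python
import bisect
import math
import random
import statistics
from itertools import accumulate

def binned_cdf(returns, n_bins=50):
    """Bin returns (in STD DEV units) and return (bin_edges, cdf_at_right_edges)."""
    mean = statistics.fmean(returns)
    sd = statistics.stdev(returns)
    z = [(r - mean) / sd for r in returns]            # returns in STD DEV units
    lo, hi = min(z), max(z)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in z:
        i = min(int((v - lo) / width), n_bins - 1)    # clamp the maximum into the last bin
        counts[i] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    cdf = [c / len(z) for c in accumulate(counts)]    # cumulative bin probabilities
    return edges, cdf

def sample_z(edges, cdf, rng):
    """Inverse-transform sampling: map a uniform draw through the binned CDF."""
    u = rng.random()
    i = min(bisect.bisect_left(cdf, u), len(cdf) - 1) # first bin with cumulative prob >= u
    prev = cdf[i - 1] if i > 0 else 0.0
    frac = (u - prev) / (cdf[i] - prev)               # position inside the chosen bin
    return edges[i] + frac * (edges[i + 1] - edges[i])

# Placeholder data only: a made-up fat-tailed mix standing in for daily % returns.
rng = random.Random(0)
placeholder_returns = [
    rng.gauss(0.0, 1.0) if rng.random() < 0.9 else rng.gauss(0.0, 3.0)
    for _ in range(5000)
]

edges, cdf = binned_cdf(placeholder_returns, n_bins=40)
daily_sd = statistics.stdev(placeholder_returns)

# ASSUMPTION: square-root-of-time scaling from 1 day to `days` days.
days = 30
z_draw = sample_z(edges, cdf, rng)
move_pct = z_draw * daily_sd * math.sqrt(days)
print(f"sampled {days}-day move: {move_pct:.2f}%")
```

An alternative that avoids the scaling assumption would be to sum `days` independent daily draws, though that in turn assumes the daily returns are independent.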

Using Excel's "Descriptive Statistics" Analysis tool for examining both a daily % Return and a weekly % Return on data from 1986 to present, produces the following results:

Daily

Mean                       0.037449
Standard Error             0.01275
Median                     0.057263
Mode                       0
Standard Deviation         1.139284
Sample Variance            1.297968
Kurtosis                   21.35565
Skewness                   -0.8324
Range                      32.04696
Minimum                    -20.4669
Maximum                    11.58004
Sum                        298.9918
Count                      7984
Largest(1)                 11.58004
Smallest(1)                -20.4669
Confidence Level(95.0%)    0.024994

Weekly

Mean                       0.144179
Standard Error             0.062901
Median                     0.291831
Mode                       #N/A
Standard Deviation         2.290484
Sample Variance            5.246317
Kurtosis                   5.409974
Skewness                   -0.72341
Range                      28.90254
Minimum                    -18.1955
Maximum                    10.70707
Sum                        191.1809
Count                      1326
Largest(1)                 10.70707
Smallest(1)                -18.1955
Confidence Level(95.0%)    0.123396

Note: The Weekly series is sloppily created by capturing only each Friday close versus the prior Friday close. A non-trading Friday therefore results in one less sample, and the next sample covers 2 weeks instead of one. I don't think this sloppy sampling introduces significant error.

All data is % return: (sample - sample[-1]) / sample[-1], expressed as a percentage (where "-1" means the prior sample).
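For reference, the % return and the Friday-only sampling described above can be sketched as follows. The dates and closes are made up, and 13 Jan 2017 is a hypothetical non-trading Friday chosen only to show the two-week gap; `pct_returns` and `friday_closes` are illustrative names.

```python
from datetime import date, timedelta

def pct_returns(closes):
    """% return of each sample versus the prior sample."""
    return [(cur - prev) / prev * 100.0 for prev, cur in zip(closes, closes[1:])]

def friday_closes(daily):
    """Keep only the (day, close) pairs that fall on a Friday (weekday() == 4)."""
    return [(day, close) for day, close in daily if day.weekday() == 4]

# Made-up closes for four weeks of weekdays starting Monday 2 Jan 2017.
start = date(2017, 1, 2)
skipped = date(2017, 1, 13)        # hypothetical non-trading Friday
daily = []
price = 100.0
for offset in range(28):
    day = start + timedelta(days=offset)
    if day.weekday() < 5 and day != skipped:
        daily.append((day, price))
        price += 1.0

fridays = friday_closes(daily)
weekly = pct_returns([close for _, close in fridays])

# Calendar days covered by each weekly sample: the missing Friday shows up as a 14-day span.
spans = [(b - a).days for (a, _), (b, _) in zip(fridays, fridays[1:])]
print(spans, weekly)
```

Checking `spans` on the real data would show how many of the weekly samples actually cover two weeks.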

Can someone with a good grasp on statistics confirm that the above Daily and Weekly summaries do NOT suggest a flaw in using Daily return data to produce a CDF that will be used to infer different ranges of future time deltas?
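One rough check that can be run on the summaries themselves: if daily returns were independent and identically distributed, the standard deviation of an n-day sum would grow by sqrt(n) and the excess kurtosis would shrink by 1/n. The sketch below compares those predictions with the Weekly figures. It assumes 5 trading days per weekly sample, that Excel's Kurtosis figure is excess kurtosis, and that % returns add approximately across days; none of that is verified here.

```python
import math

# Figures copied from the Daily and Weekly summaries above.
daily_sd, weekly_sd = 1.139284, 2.290484
daily_kurt, weekly_kurt = 21.35565, 5.409974

n = 5  # ASSUMPTION: trading days per weekly sample

# Predictions if daily returns were i.i.d.
expected_weekly_sd = daily_sd * math.sqrt(n)
expected_weekly_kurt = daily_kurt / n

print(f"weekly SD:       observed {weekly_sd:.4f}, i.i.d. prediction {expected_weekly_sd:.4f}")
print(f"weekly kurtosis: observed {weekly_kurt:.4f}, i.i.d. prediction {expected_weekly_kurt:.4f}")
```

Any gap between observed and predicted values is a hint that simple square-root-of-time scaling of the daily distribution may not carry over cleanly to longer horizons, though the sloppy weekly sampling could also contribute.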

BTW: This is for SPX only!

Thanks in advance.

PS: After this, other questions will be pursued, such as relevance of some historic periods, etc.

Another perspective: Looking at these same Daily returns, but limiting to Jan 2012 to present, results in the following summary from the Excel function:

(With the reduction in volatility in recent years, it is less clear that there is much value in abandoning the Normal Distribution for a historic one.)

Daily Returns, 5yr

Mean                       0.049319
Standard Error             0.020515
Median                     0.038178
Mode                       0
Standard Deviation         0.774689
Sample Variance            0.600144
Kurtosis                   2.265039
Skewness                   -0.26712
Range                      7.844751
Minimum                    -3.94137
Maximum                    3.903385
Sum                        70.32824
Count                      1426
Largest(1)                 3.903385
Smallest(1)                -3.94137
Confidence Level(95.0%)    0.040243