Backtesting spx data

  1. Some traders mention backtesting with EOD SPX data and no option data. How is this done, and how reliable is the backtesting? Without option bid/ask data and volatility data, how can they use only SPX EOD data to backtest options?
     
  2. IMHO: I think it depends on the accuracy the user requires, which in turn is a function of what specifically they are attempting to do.
    One would think that if someone doing that also incorporates SKEW, as well as VXST, VIX, VXV, and VXMT for term relationships, the simplistic accuracy could be greatly improved (over no IV data source).
     
  3. What the traders are referring to is backtesting options using SPX EOD prices without volatility data. Are they backtesting using theoretical prices? There is also one software vendor saying their software allows backtesting option trades using EOD price data without volatility. I wonder how close such a backtest would be to real results.
     
  4. What exactly is the EOD data? If it's just the last price then it is useless. If it is a reliable settlement that reflects where the options were actually trading then it just might be useful.
     
  5. I believe the EOD data reflects the daily settlement price which is 15 min after the close.
     
  6. I think Venrcew is saying SPX EOD NOT SPX options, but implying backtesting SPX options and ignoring all IV/VIX, etc. (Correct me if I am wrong)
    IMO: Ignoring the supply/demand for the options (the inference from ignoring all implied volatility data) will result in backtest results of no value to traders that focus on option income related trading (or any option trading I am familiar with). However, if they are attempting to compute the option prices using some model, such as BSM, the value they supply for the "Volatility" input is the key to qualifying the usefulness of the data (assuming they do everything else correctly). I am unable to see any value in a backtest that ignores a primary influence on the results.
     
  7. In order to back test an option strategy you need option prices. Those prices can come from actual trades or settlements on the exchange or from a program that creates theoretical option values. In order to create those theoretical values you need certain inputs which include a volatility value for each option being priced. There is no way to avoid that. The whole thing as described sounds very fishy to me.
     
  8. Yes, I'm referring to SPX EOD data, not option data. They say they backtest using SPX EOD data but didn't detail how they use it. They may use theoretical prices, but without volatility the backtest is not accurate. Btw, how good is option EOD data for backtesting? Unless the product is liquid, the option EOD data may be quite different from real buy/sell prices?
     
  9. Venrcew: I have attempted {tried to think it through} to address backtesting by using EOD option data, and have almost decided that effort is foolish (at least for me). If I am going to buy the EOD option data, why not go ahead and spend another $800 or so and save myself a lot of agony by getting the 1min data (which should be more resolution than I need)? One way to improve the intra-day accuracy for EOD option pricing could be to also use VIX open/high/low/close and guess at the IV for the day's low and high underlying price by assuming those points coincide with the high and low VIX price respectively, then scale the EOD IV accordingly. While this will have error, it should be better than ignoring IV changes during the day. {Again, by purchasing the finer granularity data, these issues can be avoided}. I would tend to be skeptical of "them" without more info.
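A minimal sketch of the VIX-scaling idea in the post above. The post does not say how the scaling should be done, so the proportional rule here (EOD IV times VIX high or low divided by VIX close) is an assumption, and the function name `intraday_iv_estimates` is hypothetical.

```python
def intraday_iv_estimates(eod_iv, vix_high, vix_low, vix_close):
    """Rough IVs at the day's underlying low and high, scaled from one EOD IV.

    Assumption (not from the thread): an option's IV moves in proportion
    to VIX, so IV at time t is eod_iv * VIX(t) / VIX(close).
    """
    if min(eod_iv, vix_high, vix_low, vix_close) <= 0:
        raise ValueError("all inputs must be positive")
    # The day's underlying low is paired with the VIX high, and vice versa.
    iv_at_underlying_low = eod_iv * vix_high / vix_close
    iv_at_underlying_high = eod_iv * vix_low / vix_close
    return iv_at_underlying_low, iv_at_underlying_high
```

With a 20% EOD IV and VIX at 22 high / 18 low / 20 close, this gives roughly 22% at the underlying's low and 18% at its high. It ignores skew and term structure changes, so it is only a coarse bracket around the EOD value.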
     
  10. I assume we're talking about computerized backtesting and not manual, otherwise just use ONE or OVue. Computerized backtesting with only underlying prices (and maybe VIX or some other non-option prices) might be slightly better than worthless for very simple strategies like single options, straddles and strangles, but still very close to worthless. For any strategy with spreads, it would be worse than worthless.

    Computerized backtesting with EOD option mids, plus all supporting data to give you the real skews, can be very fruitful; I know because I've done it and learned a tremendous amount. But since the data is so cheap, even if I were determined to only look at one time per day I would eschew the EOD data and buy some data where I could take something like the 3:00 PM EST chains.
     
  11. Steve S: Thanks for your input. Am I correct in assuming that after purchasing the historical 1min data for SPX options, you also have the yearly subscription to keep your data current? (nightly downloads or something?) I'm considering similar, so appreciate your comments. I'm ready to throw in the towel on getting OV to backtest.
     
  12. Right Gary, I'm currently working with LiveVol 1min with yearly subscription, but have not yet hooked it into the backtesting code base I developed back in 2013 for EOD data. Right now I'm most interested in doing basic historical statistical studies with the data, and hooking it into a code base for producing current live charts for real trades I have open ... that's going well; the data is quite clean and easy to process and keep current.

    But if you've never done this options data stuff before, don't begin until you know what you're getting into! It's "easy" if you're skilled with processing terabyte datasets and know the ins and outs of working with options data, but "easy" still means taking a month or two off from your preferred hobbies to write the code for handling and processing the data.
     
  13. I've recently started using AlgoNET Explorer for my automatic backtesting. It's made by the same people who created ONE. It provides historical intra-day data down to 5min intervals for all symbols so I don't have to maintain my own database (which I have done before and was a lot of work as Steve S said). I prefer this way because I can just concentrate on building my strategies or analyzing market movement. I mainly use it with SPX and RUT although they seem to have most symbols available. I write my code in C# but VB and wizards are possible too. The reporting is great and really helps me to see what's working and what's not with my strategies.

    Is anyone else using this who would be interested in starting a forum to share backtesting ideas or strategies?
     
  14. Norm, can AlgoNET handle basic rules like rolling a fly structure up or down when a price point is hit, and contracting a structure to reduce risk when a DTE point is hit?
     
  15. Yes it can do both of these (rolling and changing structure) based on your own criteria being met (DTE, DIT, IV, profit/loss, position or individual leg Greeks etc). The wizard provides some useful examples and you can quickly copy the code that it generates and add your own rules. I'm currently developing an advanced strategy to manage by the Greeks.
     
  16. Sounds really good Norm ... based on what you are saying and considering the immense hassles of building one's own database and huge time investment of writing one's own backtesting code, I would recommend trying AlgoNET first before building everything from database on up, for anyone who is primarily interested in backtesting.
     
  17. Probably most of us who have ONE have looked at that web site before ... I took a look again just now and it looks like not much new going on (in some kind of beta) ... can you relate any specifics about how easy it is to get on board with them, and what the costs are?

    Also, since you're interested in starting a forum it sounds like you aren't holding your results close to the vest ... can you give an example or two of strategies you have tested, what kind of results you got, and how the basic reporting is structured?
     
  18. I have NO experience with "processing terabyte datasets" and that is one reason I have not bought the data. I have coded a mechanism (with Perl) where I can review my trades using TOS OnDemand (and realtime), which has access to the data I need, but it is almost as painful as dealing with OptionVue, waiting for the date/time changes to update from TOS and some RTD opportunities with Excel. -- For some time, I have thought I could calculate what I needed faster than trying to access it in a huge database -- While I have learned a lot, I think using the data will simplify my efforts and make debugging a bit easier. My plans would be first to merely extend this to use ivol data instead of TOS, so I can control the timeline from the code, instead of manually with OnDemand, OptionVue, and ONE. I have some familiarity with working with options data (more than some, less than others I imagine).
    Any ideas on pricing for AlgoNET, or when it will be avail for mere mortals? It sounds too good to be true for capabilities.
     
  19. Nothing beats being in complete control from raw data on up ... just a question of how much work you're willing to do to get there and stay there. If you decide to try LiveVol I could tell you a few more things about it that would be useful.

    Actually it's quite "easy" to code up a backtester, so definitely not "too good to be true" ... but I say "easy" in the same sense as above ... if it's your hobby then definitely do it yourself, otherwise weigh the costs carefully and consider something like AlgoNET.
     
  20. I’ve been a ONE customer for some years. I joined the AlgoNET beta a couple of months ago and they recently released their final product, which I purchased for around $1k for 12 months. I then got full access to their historical data (the beta was limited to 6 months data). I’d email them if you want to try it.
    I’ve tested quite a few different strategies so far. They give sample code for condors, verticals, butterflies and calendars. I’ve been able to create almost anything I wanted from those examples. I’ve had some encouraging results testing butterflies and BWB and I’m now looking at building more complex layered positions with different types of adjustments.
    Reporting is very cool and shows you a main equity curve intra-day chart with all positions on. As you track your cursor over that chart, the different positions are shown in more detail in other windows. You can also sort a results table by various criteria, for example, if you are just concentrating on your losers to see why they didn’t perform well. There is also a statistics window (similar to in ONE) showing your winners/losers/largest win/loss etc.

    You can’t reference technical indicators yet but I requested they add that during the beta.
     
  21. Thanks for the feedback Norm. I'm good for SPX, but I think I would test drive AlgoNET before I would tackle the database project for a new underlying.

    I'm wondering about the "script editor" thing ... shouldn't it be straightforward to reference any technical indicators you wanted (and do tons of other things) by including them in your own code and just hooking to their backtesting engine? For example, in a script can I pull a year's worth of SPX underlying time series from their data and do my indicators from this, then make my trade adjustments?

    Also wondering about hedged strategies, especially because ONE doesn't do futures yet ... do they have the price data all lined up so I can run, for example, an SPX strategy that is dynamically hedged with ES and SPY (would only need the ES futures here, not the options)?
     
  22. You create your strategy by writing code that’s called every tick. You have access to the underlying market data (OHLC) each time the event is called so I guess you could write your own indicator (calculated each tick) and use that to trigger adjustment decisions, although I haven’t tried this.

    I don’t know if you can trade other symbols to hedge but they don’t have futures yet. All the strategies I’ve written so far have been for an individual symbol.
     
  23. Thanks for taking the time to answer all the questions Norm ... very good to have this information.
     
  24. No problem Steve. Would be great to get a forum together for this type of strategy development and automatic backtesting. I think it has huge potential.
     
  25. Steve, I'm interested. I looked at the pricing from https://datashop.cboe.com/. One year (subscription) of SPX costs between $56 (30 min), $104 (5min) and $144 (1min). The whole history back to 2004 is between $350 (30min), $650 (15 min) and $900 (1 min). This is only for the bid/ask quotes, no greeks. Is that the pricing that you paid? Or am I looking at the wrong product?

    I'm also very interested in your experiences processing all these data. Thanks a lot.
     
  26. Hi uwe, that's the product I have, and it's pretty much the pricing that I saw when I signed up - very reasonable.

    Regarding tips on working with the data, I could write 10,000 words right now just to start (same as for any options data project), but whatever would be of actual value to you depends on your skill set, how picky you are about making sure the data is good, how far back in time your history will go and the frequency (1 min, 5 min, etcetera) you will be working with.

    One problem with the LiveVol data is that the "issues" you encounter may be different than the ones I had to solve, since their downloads have been changeable in my experience ... so for example, if I gave you a list of all the data scrubbing items I had to resolve you may or may not see those items in your downloads, and you may see new ones that I never had to deal with ... so best to proceed point by point. Just post any questions you have on this thread, or any thread, and I will try to help without adding to the confusion.
     
  27. You might want to PM Ron Bertino - he seems to be tight with the ONE guys and I wouldn't be surprised if he has been thrashing AlgoNET ...
     
  28. Steve: A deviation from the thread, but: Can you comment on how you compute IV for each option? I'd assume you may use a BSM (or near relative) and resolve the IV to produce the observed price (typically Mid price), but for the underlying price, do you use the Spot price or the Forward price? For SPX, do you actually provide a Dividend input? (adding a dividend seems to add error, so I have been leaving the dividend as zero, as there is really no payout like SPY). I have been using the CBOE VIX white paper algo for Forward price (curious of your opinions).
     
  29. Hi Gary, this is a "small topic" that gets quite big if you really cover all the bases. I've had several people ask me about this lately so I've been collecting different bits and pieces in a single document; I'll post the latest version later today after I add some interesting images relating to implied interest rates.

    Here's my short answer though, and if it answers your question you won't have to download the pdf:

    Just use any garden-variety BSM implied vol calculator with underlying = your estimated SPX-PVDiv, Yield = 0. You are using the PVDiv model where all the yield is in the PVDiv, so Yield is not applicable. Your forward F = Exp(RT) * (SPX-PVDiv), so SPX-PVDiv = Exp(-RT)*F.

    The CBOE VIX determination of "F" is what I would call "stylized" if it's the one I'm thinking of where they only use the closest-to-the-money strike: It's a fairly poor estimate of the actual forward at the time the chain is snapshotted, but CBOE knows that and they are going after simplicity over accuracy. In fact, the near-the-money strikes are the WORST ones for accuracy of the forward because the F-X values have the smallest magnitudes (hence more random error) and the largest bid/ask errors (more random error, and arguably some systematic error). I would definitely recommend a better estimation of the true forward; see my pdf notes for an example of a very simple improvement, and I also have a sample spreadsheet I can upload if there is any interest.
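A minimal sketch of the short answer above: a plain BSM implied vol solver with Yield = 0 and underlying U = Exp(-RT)*F. The bisection solver, its bounds, and all function names are assumptions for illustration, not the method from the pdf, and the forward F is taken as a given input rather than estimated here.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_price(u, strike, t, r, vol, is_call):
    """BSM price with zero yield; u is SPX minus the PV of dividends."""
    sqrt_t = math.sqrt(t)
    d1 = (math.log(u / strike) + (r + 0.5 * vol * vol) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    disc = math.exp(-r * t)
    if is_call:
        return u * norm_cdf(d1) - strike * disc * norm_cdf(d2)
    return strike * disc * norm_cdf(-d2) - u * norm_cdf(-d1)


def implied_vol(mid, forward, strike, t, r, is_call,
                lo=1e-4, hi=5.0, tol=1e-8):
    """Solve for the vol that reproduces `mid`, or None if out of bounds."""
    u = math.exp(-r * t) * forward  # SPX - PVDiv = Exp(-RT) * F
    p_lo = bsm_price(u, strike, t, r, lo, is_call)
    p_hi = bsm_price(u, strike, t, r, hi, is_call)
    if not (p_lo <= mid <= p_hi):
        return None  # price below intrinsic or absurdly high: suspect quote
    for _ in range(200):
        m = 0.5 * (lo + hi)
        # BSM price is increasing in vol, so bisection is safe.
        if bsm_price(u, strike, t, r, m, is_call) < mid:
            lo = m
        else:
            hi = m
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Because the call and the put at a strike are both solved against the same U, a forward consistent with put-call parity yields matching call and put vols. Returning None for out-of-bounds mids is one way to flag suspect quotes instead of plotting them as zero.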
     
  30. Hi again Gary, here's the doc I mentioned in the last post. If you've lost interest then no worries, I was looking for an excuse to finish organizing these thoughts anyway.
     
  31. Steve: Thank you for the previous post (I am correcting a bug I had that your response exposed).
    In reading this chain_boostrap.pdf, on page #2, you state: "1) get a decent interest rate "R" for this (date, DTE) pair (one rate for all chains all day);" <-- Should the fed rates be interpolated to match the DTE, or should the nearest Fed rate term (4Wk, 13Wk, 26Wk, 52Wk) be used without interpolation for the precise date? (I'd guess the latter, but may as well not guess if you know.) (IE, for a 2DTE, should I just use the 4Wk interest?)

    I'm still reading, so will likely have more questions.
    Thank you for your assistance!
     
  32. After you read the obscenely voluminous comments on rates (and check out my charts, and that Hull paper) you will have a better idea of what questions to ask ... but just so you know, I'm in the camp holding that Treasury rates have never been the best choice, and never will be - if they happen to be better than something else for a particular date and DTE, that's by accident and not because you should be using Treasury rates.

    The answer to your question depends on whether your data is actual treasury yields, or bootstrapped zero rates. If nobody has done it for you then you have to bootstrap, and that's a drag. After bootstrapping you do linear interpolation of zeroes.
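A minimal sketch of the last step above, linear interpolation of zero rates, assuming the zero curve has already been bootstrapped. The function name, the tenors in days, and holding the rate flat outside the curve (for example at 2 DTE, below the shortest tenor) are assumptions; the thread does not say how to extrapolate.

```python
from bisect import bisect_left


def interp_zero_rate(dte, tenors_days, zero_rates):
    """Linearly interpolate a zero rate at `dte` days.

    tenors_days must be sorted ascending and match zero_rates in length.
    Outside the curve the nearest rate is held flat (an assumption).
    """
    if not tenors_days or len(tenors_days) != len(zero_rates):
        raise ValueError("need matching, non-empty tenor and rate lists")
    if dte <= tenors_days[0]:
        return zero_rates[0]
    if dte >= tenors_days[-1]:
        return zero_rates[-1]
    i = bisect_left(tenors_days, dte)  # first tenor >= dte
    t0, t1 = tenors_days[i - 1], tenors_days[i]
    r0, r1 = zero_rates[i - 1], zero_rates[i]
    w = (dte - t0) / (t1 - t0)
    return r0 + w * (r1 - r0)
```

For instance, with made-up zeroes of 0.2% at 28 days and 0.4% at 91 days, a 59.5 DTE chain sits halfway between the two tenors and gets 0.3%.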
     
  33. I know that could change, and in fact I'm praying it does, but with rates this low across the range where we're trading options, is it worth all the effort?
     
  34. Arguably no, and that's a point I tried to make repeatedly in my notes.

    Arguably yes, because it's neither the absolute level of rates nor the volatility of rates we are talking about (both of those are MUCH more interesting topics!), but the way rate estimation errors impact the calculation of volatilities. For any individual trader to figure out the magnitude of error where you begin to care, you have to actually spend some days/weeks/months looking at the numbers while you do your skews.

    Arguably yes, because if you're not prepared there's going to be a lot of scrambling on these trader community web sites with accustomed models going screwy and they don't know how to fix things because they have never traded through even a "normal" rate environment, much less a truly volatile one.
     
  35. Steve S:
    Thank you so much for the detail provided in your chain...pdf.
    From my current implementation of the algorithms you detailed (I may have some additional work needed), the derived IV for the strikes I have most interest in seems viable (have not spent time validating).

    A side question for you regarding observing IV:
    Have you observed the IV surface of SPX by substituting "Moneyness" for Delta or Strike on the Y axis? This seems to make interesting characteristics of IV easier (for me) to observe. And can you comment on the plots below, where I attempt to show the IV surface for Calls and Puts on SPX for today?
    I found a moneyness formula that also encompasses Time, which seems to make for better visualization of the IV surface. (less mental gymnastics required)
    I am using the following for PUT moneyness:
    ln((price*exp(dte*intrate/365))/strike)/((dte/365)**.5)

    The calculated IV is obtained by using the "U" value derived from your PDF as the underlying price input to the BSM model iteration for the specific OPRA Mid price.

    For CALLs, swap positions for "price" and "strike";
    Below is a chart I just extracted for reference (cannon fodder).
    Legend:
    RED dot: The calculated IV of a specific OPRA with Volume == 0;
    Green dot: The calculated IV of a specific OPRA with Volume > 0;
    BLUE dot: The TOS "Individual Implied Volatility" value for the specific OPRA.
    Note: Any IV shown as zero (0) is not real, but indicates the data is suspect. <-- Note: these all were options with No trades today.
    upload_2016-10-20_11-54-53.png
    upload_2016-10-20_11-56-38.png

    When I changed the TOS calculation to Vol Smile, the data retrieved from TOS seems questionable: Below are similar graphs with TOS set to Vol smile.
    upload_2016-10-20_12-13-1.png
    upload_2016-10-20_12-15-38.png
     
  36. Yes you are on the right track with translating to Black-Scholes space from strike space by viewing your skews as functions of Black-Scholes standard deviations ... this is the correct space to work in when comparing skews across DTEs. You can take your independent variable to be Ln(F/X)/Sqrt(T) as you are doing, or if you are comparing different vol regimes it can frequently (not always, and you will hear arguments about this) be better to also add the atm vol so your variable is Ln(F/X)/[Vol_atm * Sqrt(T)].

    Regarding your charts, TOS and your calculations: For a given chain there is only one correct implied volatility (obviously with small variations depending on algorithmic details) ... everyone who is doing the job right will agree, and if someone disagrees then they are wrong, not just "doing it different".

    I have TOS but don't really use it, but from what I have heard they don't do their forwards correctly so their skews are wrong and the smoking gun is different call and put skews. I do use OVue so I know from personal observation that they have the same ridiculous problem. Their vols will still be in the same ballpark as yours - the errors aren't crazy wild, otherwise nobody would use TOS or OVue. You can still check your vols against theirs whenever you get paranoid about your numbers, just don't expect to be really close.

    What you need is to gain confidence that the way you are doing your vols is "right", period, and then you won't worry about anyone else's vols, except occasionally when your vols or greeks look weird, and then you do a spot-check against someone else to make sure your system is working.
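A minimal sketch of the two skew variables mentioned at the start of this post, Ln(F/X)/Sqrt(T) and Ln(F/X)/[Vol_atm * Sqrt(T)]. The function name and the 365-day year are assumptions (the day count follows the formula in the previous post), and the forward F is taken as an input.

```python
import math


def bs_moneyness(forward, strike, dte, atm_vol=None):
    """Skew x-variable in Black-Scholes space.

    Returns Ln(F/X)/Sqrt(T), or Ln(F/X)/(Vol_atm*Sqrt(T)) when atm_vol
    is given, with T = dte/365.
    """
    if forward <= 0 or strike <= 0 or dte <= 0:
        raise ValueError("forward, strike and dte must be positive")
    t = dte / 365.0
    m = math.log(forward / strike) / math.sqrt(t)
    if atm_vol is not None:
        if atm_vol <= 0:
            raise ValueError("atm_vol must be positive")
        m /= atm_vol  # express distance in ATM standard deviations
    return m
```

A strike at the forward maps to 0 for every DTE, which is what makes skews from different expirations comparable on one axis. Dividing by the ATM vol additionally rescales the axis when comparing different vol regimes.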