## Go Ex Ante With Parameters and Refute the Null Hypothesis... or Else!

## Better Than Best?

On the way to putting together the dismal news about the terrible haircut of Figure 1, a surprise happened. With another one of the portfolios, the International one (defined below), and with the same chosen initial parameters of Figure 1, the walkthrough procedure yielded a

We were perhaps expecting, given the parable of the painting contractor, to find that the walkthrough method would

We could theorize that in particular circumstances giving the program a choice of lookback periods causes it to abandon the longer periods for the shorter ones whenever an abrupt and very substantial move happens, say, right at the start of a deep plunge or a sudden recovery. For example, just after a plunge has started the shorter lookback periods will be performing better because they will have quickly indicated that the scheme should go into cash. Thus the scheme, selecting as it does the best-performing lookback period on a trailing basis, will put the portfolio into cash rather quickly, and vice versa for sudden rallies. Otherwise, during strong, steadily rising markets the longer lookback periods will be performing better because they will not be generating whipsaws every time that there is a little pullback, so the program will then help us avoid getting hurt by whipsaw losses as it will automatically take the advice, so to speak, of the better-performing longer lookback periods. That adaptability is not there if a single, fixed lookback period is used. Such inner workings as that, if they persistently occur, would be likely to take the form of the program helping us to avoid participation in bear markets while keeping us invested most of the time in bull markets.
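The adaptive choice of lookback period theorized above can be sketched in a few lines. This is only an illustration under stated assumptions, not the article's actual program: the function names `ema` and `pick_lookback`, the dictionary layout of the input, and the sample numbers are all hypothetical, and the trailing score is taken to be an exponential moving average of monthly log returns.

```python
def ema(values, m):
    """Exponential moving average with characteristic period m.

    Assumes the standard smoothing constant alpha = 2 / (m + 1), so each
    older value is down-weighted by the factor (m - 1) / (m + 1).
    """
    alpha = 2.0 / (m + 1.0)
    avg = values[0]
    for v in values[1:]:
        avg = alpha * v + (1.0 - alpha) * avg
    return avg


def pick_lookback(trailing_log_returns, m=2):
    """Return the lookback period whose past results score best on a trailing basis.

    trailing_log_returns: hypothetical dict mapping a lookback period (months)
    to the monthly log returns the scheme would have earned using it so far.
    """
    return max(trailing_log_returns,
               key=lambda lp: ema(trailing_log_returns[lp], m))


# Made-up numbers: just after a plunge starts, the short lookback has already
# gone to cash (returns near zero) while the long lookback is still losing.
plunge = {2: [0.010, 0.000, 0.000], 11: [0.015, -0.060, -0.040]}
print(pick_lookback(plunge))
```

With the made-up plunge data the short lookback wins, which is the behavior the paragraph above describes; in a steadily rising market the same function would favor whichever lookback had avoided whipsaws.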
But whatever the Sharpe ratio outcome of the walkthrough procedure, our reliance will be on the considerable extent to which the procedure faithfully simulates the use of our scheme in the past, and on the second haircut that we shall administer after we examine the extent to which the walkthrough-procedure-defining parameters that are optimal for this portfolio remain optimal when applied to other portfolios. We will see that it would be utterly wrong to suppose that the subsequent Monte Carlo analysis could somehow compensate for our failure to do a true simulation, the like of which we will come closer to bringing about when, further below in this article for Figure 3, we also submit the determination of the maximum number of securities held to the walkthrough procedure.

## The Roadmap to Recover from This

So the score is that we have two brutal markdowns (Figure 1, and Figure 2, which is a buzz cut from the blue to the cyan trace) versus the one markup that is mentioned directly above. It's unavoidable that we permit the markdowns (and markups if we're awarded them) to happen, as no program can be devised that can figure out in advance that setting n, the maximum number of securities held, to 5 or the lookback period lp to 11, as on Figure 1, would eventually work out to be the best. But we have yet to conduct the optimization of the walkthrough-procedure-defining parameters: the range limits of n and lp and the characteristic period m of the trailing exponential moving average, which are used by the walkthrough procedure to select the best trailing n and lp values. The good news is that we'll find it to be appropriate to adjust the lower limit of the n range substantially upward from 1, the possible need for which was anticipated on the previous page, and when we do so there will be a resultant huge boost in performance.
This development has implications regarding the very degree to which relative strength should be relied upon when the simple and popular price ratio is used to measure the amount of momentum, so the exercise will be quite meaningful even for those who have no strong interest in algorithms. The general plan here is that we started above, and will continue below for a bit, with non-optimized initial values of the walkthrough-procedure-defining parameters. They are presented so that we can see what happens when we adjust them toward optimization, mainly so that we can thereby discover the depth of that second haircut that we must administer as a final step, as is explained above. So next we'll summarize the results that are obtained with those initial values of the walkthrough-procedure-defining parameters, and then we'll go on to the optimizations.

## Initial Outcomes, Before Optimization

We will be working with the following lists of ETFs, three of them, and from each a portfolio will be formed. Inasmuch as our program dynamically adjusts the position sizes in each security, and the size may be zero, at any one time only a few or perhaps even none of the securities may be actually in the portfolio.

- **International:** SPY, MDY, EWA, EWC, EWG, EWH, EWJ, EWW, EWS and EWU. These are the ticker symbols of 10 famous and very liquid ETFs of a developed-nation flavor that are very popular with investors of every stripe and which have all traded for roughly 20 years. As such they represent pure liquidity, for these funds all hold big- or mid-capitalization stocks that are traded heavily in their own countries as well as internationally.
- **Mostly-USA:** DIA, EEM, IWM, IWV, IYR, QQQ, RSP and SPY. This list is really a rather random assortment of 8 of the most popular ETFs held by investors in the United States. IYR holds REITs and the like; EEM covers emerging markets; the others are diversified US equity funds but note that IWM holds small-capitalization stocks.
- **Sector:** XBI, XES, XHB, XLB, XLE, XLF, XLI, XLK, XLP, XLU, XLV, XLY, XME, XOP, XPH, XRT and XSD. These 17 ticker symbols are of industry sector funds. Unfortunately some of these were launched as recently as 2007. Hence the historical records only show performance through part of one crisis, the Lehman Brothers/subprime-housing debacle of 2008, and the chart below for this portfolio begins about a year after prices reached their peak. A few others of the X series are of even more recent vintage and have been omitted.
## Cumulative Returns, Traded and Benchmark

Figure 3 shows the unoptimized outcomes at a glance, for all three portfolios. On it the cyan trace represents our full-blown “complete walkthrough” procedure selecting the best-performing n and lp on a trailing basis; for the magenta trace n was constrained, set equal to the total number of ETFs on the given list. Do notice that the magenta trace for the International portfolio really looks like something that anyone would love to own, as it shows hardly any significant declines but finishes with a market-beating result.

What does setting the maximum number of securities held equal to the number of securities on the list mean? It means giving up on relative strength, that's what it means, because with that choice we only select securities to hold based on whether or not their trailing return ratios exceed that of holding cash, without a focus on just owning the securities with the very best trailing performance. Hence the label “Traded Portfolio (without relative strength)” for the magenta trace.

We won't dwell much on Figure 3, as those results are not finalized, but we can say that the Traded Portfolio traces do for the most part avoid the debacles (except for the embarrassment of the “complete walkthrough” with the International portfolio during the 2000-2003 dot-com crash). And the “without relative strength” plots are generally much less volatile than the others. To be sure, we could not accept that sort of overall performance relative to that of the benchmark that we see in Figure 3. However, we have more to do to finalize our program: we have to deal with ex post setting of the range limit parameters and of the characteristic period m of the trailing exponential moving average. The finishing steps will boost the expected performance substantially beyond what we have seen thus far.
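The difference between the two traces comes down to how n interacts with the cash filter, and a small sketch may make that concrete. The function name `select_holdings`, the input layout and the sample ratios are assumptions made for illustration; the article does not publish its code, and position sizing is deliberately left out.

```python
def select_holdings(trailing_ratios, cash_ratio, n):
    """Pick at most n securities whose trailing price ratio beats cash.

    trailing_ratios: hypothetical dict of ticker -> price ratio over the lookback
    cash_ratio: growth of cash with interest over the same lookback
    n: maximum number of securities held
    """
    # Cash filter: nothing is held unless it outperformed cash.
    eligible = {t: r for t, r in trailing_ratios.items() if r > cash_ratio}
    # Relative strength: rank the survivors and keep only the top n.
    ranked = sorted(eligible, key=eligible.get, reverse=True)
    return ranked[:n]


# Made-up ratios for four of the International tickers.
ratios = {"SPY": 1.08, "EWJ": 0.97, "EWG": 1.03, "EWA": 1.01}
print(select_holdings(ratios, cash_ratio=1.02, n=1))
print(select_holdings(ratios, cash_ratio=1.02, n=len(ratios)))
```

When n equals the length of the list, the ranking step never excludes anything and only the cash filter matters, which is what “without relative strength” means here.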
## Sharpe Ratios and p Values

Earlier in this article the Sharpe ratio was introduced as a suitable measure of portfolio performance, traded or not. And in Part A we reviewed the concept of the “p value” as the probability that results as good as they look would arise entirely by chance, not from the successful exploitation of a systematic effect. We can compute the Sharpe ratio for any record, but obviously if our scheme produces a Sharpe ratio that is less than that of the benchmark there is no point in proceeding with an attempt to refute the null hypothesis because we already know that the null hypothesis has won. The last two columns of the table below pretty well tell the story of our research thus far. We are still paused just now short of finalization but will attend to that next.

It should be kept in mind that the Sharpe ratio has a rate of return in excess of that of cash in the numerator and a measure of volatility in the denominator. Hence at times the portfolio that finishes highest in dollars, meaning highest in the overall rate of return, may be graded using the Sharpe ratio as being not as desirable as another that finishes not quite as high but with a history of less volatility.
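To make the numerator-over-denominator point concrete, here is a minimal sketch of a Sharpe ratio computed from monthly returns. The function name, the annualization by the square root of 12 and the sample return series are assumptions for illustration; the article does not state which annualization convention, if any, it uses.

```python
import math
import statistics


def sharpe_ratio(portfolio_returns, cash_returns, periods_per_year=12):
    """Mean return in excess of cash divided by its volatility.

    Annualizing with sqrt(periods_per_year) is a common convention and is
    assumed here; it rescales every portfolio alike and so does not affect
    comparisons between them.
    """
    excess = [p - c for p, c in zip(portfolio_returns, cash_returns)]
    mean_excess = statistics.mean(excess)
    volatility = statistics.stdev(excess)  # sample standard deviation
    return math.sqrt(periods_per_year) * mean_excess / volatility


# Made-up monthly returns: "wild" earns more in total but is far more volatile.
cash = [0.002] * 6
steady = [0.010, 0.012, 0.008, 0.011, 0.009, 0.010]
wild = [0.080, -0.050, 0.070, -0.040, 0.060, -0.030]

print(sum(wild) > sum(steady))
print(sharpe_ratio(steady, cash) > sharpe_ratio(wild, cash))
```

In this made-up case the portfolio with the higher total return gets the lower Sharpe ratio, which is the situation described in the paragraph above.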
On the table, the cyan data pertain to our “complete walkthrough” runs and the magenta data to our “without relative strength” runs. Note especially that the latter beat the former. That's telling us something, something that we'll bear in mind when it comes time to set the lower range limit of n, the maximum number of positions held. We can also see that the p values for our complete walkthrough runs are all telling us that they are not good enough (we'd like p<0.05).

## Optimizing the Walkthrough-Defining Parameters

One of the unsettling things about what we're doing is that ideally we'd like to have a tremendously long historical record for each of the securities that we want to trade, with all of it being representative of the way that the market is currently working. But of course that can't be. We generally follow ETFs from inception, which is at most 20 years or so ago. One can even argue that 20 years could possibly be too long as the amazing thing called “high-frequency trading” didn't exist back then and seems to have the market by the throat now.

So what do we do? Well, to a degree it seems that if we take a scheme that we have developed and tested on one set of securities and then apply it to a different set of securities of a not-too-different character, we will be doing something that is in a way a kind of substitute for having a very long but relevant period of record for the targeted securities, as we would in that way likewise have the possibility of pitting the scheme against reams of data. But then we have to ponder the question of how similar in character the other securities would have to be. We can wonder whether or not our scheme

As we go through the optimization process we'll be accumulating data on the three lists of securities: the International, the Mostly-USA and the Sector lists. They are all equities, ETFs at that.
## Other Concerns

It was hardly possible to have avoided having some parameters that must be set ex post. Furthermore, if you somehow don't have any scheme-defining parameters to set ex post via some sort of optimization, you still have a

Picking a structure is possibly even more fateful and portentous than choosing a parameter value ex post: If you didn't try other structures, others did and you read about that and then decided to go with a particular choice, and that all happened

We have a rational basis for setting the walkthrough-procedure-defining parameters, and it's one with which we can get a rough idea of the reliability of the finished product. But the chosen structure is something that we cannot be reasonably called upon to vary in any systematic way, and so it is not within the reach of optimization.

## The Idea in Some Detail

It's time to go ahead and find the best performing values of the range limits and the characteristic time period m of the trailing exponential average.

The concept here is that if the Sharpe ratio for the entire period of record fluctuates radically as we vary, about its optimal value, a parameter that we have set ex post, such as a range limit or the characteristic time period of the trailing exponential average, then that means that we could generate a wide range of outcomes of the simulation (but not repeatably in practice) by changing the parameter values just a bit. And of course if we're stupid we'll go ahead and tweak the parameter values, notwithstanding the volatility, until we get a high Sharpe ratio, failing to understand that high historical volatility with respect to parameter values implies high volatility of future outcomes with fixed parameter values, and fooling ourselves into thinking that we have a scheme that really works (a bit like the painting contractor all over again).
But conversely, if there is only an insubstantial variation of the Sharpe ratio as we vary range limits and the characteristic time period over reasonable ranges then our concern is minimized. Yes, we want insensitivity or non-volatility of the Sharpe ratio with respect to the values of our dangling, loose-end parameters and that's one circumstance with regard to which we can rationalize the acceptance of ex post settings of such parameters. If we don't get that then it simply means that the second haircut will have to be correspondingly deep. More ideally we'd like to select the loose-end parameters so as to maximize the Sharpe ratio for the entire current time period under study, nearly 20 years for our International portfolio, and do likewise for numerous other time periods of like substantial duration— with the very same securities if only that were possible. We would presumably get somewhat different optimal walkthrough-procedure-defining parameter values for each time period and could then substitute each of them for the optimal values for the current time period and see what damage the deviations would do to the Sharpe ratio of the current time period. It would go lower, or at least never higher to be sure, because we had maximized the Sharpe ratio for the current period. However, as we discussed above, we don't have many decades of relevant data with respect to any one set of securities. That not being feasible, the resort that is carried out in this article is, in place of data from different time periods that pertain to the same list of securities, the use of contemporary data from several lists of similar kinds of securities. We find the optimal walkthrough-procedure-defining parameter values for each list of securities, and note the extent to which the Sharpe ratios for each of the lists of securities fluctuate as the parameter values are adjusted about their optimal values. 
We can then, from the variations in the optimal values, get a rough idea of how much the projected Sharpe ratio for each list of securities should be reduced: our second haircut.
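The sensitivity idea in this section can be illustrated with a rough sketch. The article gives no formula for the second haircut, so the measure below (the worst relative drop in the Sharpe ratio within a step or so of the optimum) is only one assumed way of quantifying “volatility with respect to parameter values”; the function name and every number are hypothetical.

```python
def sensitivity_around_optimum(sharpe_by_value, width=1):
    """Return (best parameter value, worst relative Sharpe drop near it).

    sharpe_by_value: hypothetical dict of parameter value -> full-period
    Sharpe ratio from a run with that setting. Assumes the peak is positive.
    """
    values = sorted(sharpe_by_value)
    best = max(values, key=lambda v: sharpe_by_value[v])
    i = values.index(best)
    # Look at the settings within `width` steps on either side of the optimum.
    neighbours = values[max(0, i - width): i + width + 1]
    peak = sharpe_by_value[best]
    worst = min(sharpe_by_value[v] for v in neighbours)
    return best, (peak - worst) / peak


# Made-up results for some ex post parameter, e.g. a lower range limit.
flat = {3: 0.80, 4: 0.84, 5: 0.86, 6: 0.85, 7: 0.83}
spiky = {3: 0.40, 4: 0.45, 5: 0.90, 6: 0.42, 7: 0.44}

print(sensitivity_around_optimum(flat))
print(sensitivity_around_optimum(spiky))
```

Both made-up series peak at the same setting, but the second loses more than half its Sharpe ratio one step away from the optimum. By the reasoning above, that second case is the one that would call for a deep haircut.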

## Optimization Underway at Last

We'll now finally go about optimizing the walkthrough-procedure-defining parameters, the range limits and the characteristic period m of the trailing exponential average, that we use to select optimum n and lp parameter values in real time in the course of our walkthrough. What happens when these parameters' values are varied?
Let's begin with varying the range limits of n, the maximum number of securities held, at first just for the International portfolio. For the complete walkthrough run with initial values, the results of which are shown on Figure 3, the range limits were the widest possible, 1 and 10, where there are 10 securities on the list. We had considered that anyone managing funds might steer away from n being allowed to be very small. After all, a single ETF, even a country fund, might suffer a dramatic collapse (e.g., utter loss of computerized records, geopolitical catastrophe, etc.). And even if calculations were to show that allowing n to be very small produced superior results, it would be difficult to convince clients or customers that putting all of the eggs in so very few baskets was a good idea. So 3 was made the lower limit of the n range, leaving 10 as the upper limit. The Sharpe ratio improved, and the improvement continued steadily as the lower limit was further raised until 5 was reached, after which the Sharpe ratio declined, albeit not very much. Note that if the lower limit is increased all the way to 10 then the run will be the same as for the magenta trace of Figure 3, which was still a quite good result.

That behavior with regard to the lower limit would appear to be rather consistent with the fact that fixing n at 5, with no variation, produced the best results, the stellar results (blue) of Figure 2 which were previously called into question. But the difference is that here, by moving the lower limit to 5, we are still allowing n to range over 5 to 10 securities, not fixing n at 5. And we also confirmed that the resultant Sharpe ratio was not dependent upon the lower limit for n in a volatile way in the vicinity of n=5.
We might think that setting the lower limit of the range back to 1 and the upper limit to, say, 2, which is the complement of the 3-to-10 range, might be expected to produce results somewhat worse than those for the complete walkthrough run (cyan) of Figure 3 with n allowed to vary over the full range of 1-to-10. It's not really necessarily so, but indeed that was the outcome. But the really big picture is that with the lower limit of n fixed at any of the possible values other than one or two the Sharpe ratio would not have been bad, not with this dataset. So we could simply adopt the attitude that at least with these particular kinds of securities we have some leeway to move the limits of the n range to meet other money management objectives. With the Sector fund portfolio the circumstances are rather different, with only rather high values of n seeming to work. It was found that with the lower limit of n raised continuously from 1, the peak value for the Sharpe ratio was reached at 13 or 14, with the ratio declining only a little and with little volatility with still higher values for the lower limit of n. However at lower values of the lower limit of n the ratio was rather volatile with respect to small changes in n and was often about half the peak value. So we would not be able to accept a low range limit for n for the Sector portfolio, and the lower volatility that is associated with the highest n values suggests that the use of a high lower limit may be appropriate. With the lower range limit set to 14, and with 17 remaining the upper limit, the Sharpe ratio was 0.67— somewhat better than for n fixed at 17 as on the table above and for the magenta trace on Figure 3 for the Sector portfolio. (But we're not through optimizing yet.)
The professors have been working on these considerations for some time, as is implied by the quotation above from David Aronson's book. I have added the clarifications in square brackets which draw parallels between the meaning of the quotation and the content of this article. By “rule” he means something like our momentum scheme with a particular set of parameter values. But note that the academicians too are still hashing out the fundamentals. They do not, for the most part, engage in walkthrough simulations per se but use other procedures that may or may not be more rigorous and usually don't leave us with a ready-to-go adaptive system, whereas our walkthrough approach is adaptive as the selected parameters such as n and the lookback period are determined dynamically in real time. Note especially that the discussion above about the lower limit of n for the Sector portfolio illustrates how we have allowed for systematic exclusion of bad parameter values, “rules that are worse than the benchmark” in Aronson's language.
A trailing exponential moving average (EMA) weights more recent data more heavily than older data, and in the runs presented above the characteristic period of the EMA of the monthly return was set to m=2 and used by the walkthrough procedure to determine the n and lp parameter values that were optimal on a trailing basis. For example, with m=2 and monthly data the natural logarithm of the month-over-month return ratio of the prior month will be weighted by a third as much as that of the current month, and for the month before the prior month the weight will be one-ninth of the weight for the current month, with the weight going down by a multiplicative factor of a third for each month further back in time. With m=9 the weight for each monthly return is four-fifths of that for the following month, so that the roll-off is much less. The multiplicative factor is computed as (m-1)/(m+1) where m is the characteristic period of the EMA; that's in accordance with the standard definition of an EMA of characteristic period m.

So now we go on to consider the effect of trying characteristic periods m for the EMA other than 2 months. Note that although the computation of the EMA begins at the very start of the record, if the characteristic period of the EMA is m months then it will take something like m months before the EMA assumes a value that could not be substantially different had data earlier than the start of the record been available. Hence we do not determine positions in securities that are to be taken using such an EMA until m months have passed. Given the discussion above that concluded that the lower limits of the n ranges should be adjusted upward, let's proceed first with the International portfolio with n confined to the range of 5 to 10.
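As a quick numerical check of the EMA weighting described above, the relative weights can be generated directly from the factor (m-1)/(m+1). The function name `ema_relative_weights` is an assumption made for this sketch; only the formula comes from the text.

```python
def ema_relative_weights(m, months_back):
    """Weights of past months relative to the current month (weight 1.0).

    Each step back in time multiplies the weight by (m - 1) / (m + 1),
    where m is the characteristic period of the EMA.
    """
    decay = (m - 1) / (m + 1)
    return [decay ** k for k in range(months_back + 1)]


# m=2: each month back is weighted a third as much as the month after it.
print(ema_relative_weights(2, 2))
# m=9: the factor is four-fifths, so the roll-off is much slower.
print(ema_relative_weights(9, 2))
```

The slower roll-off at larger m also shows why the EMA needs a warm-up of roughly m months: the data that would have preceded the start of the record still carry appreciable weight until then.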
We do not really expect to find that the characteristic period of the EMA that maximized the Sharpe ratio with the full range of n of 1 to 10 will now still maximize the ratio with n varying from 5 to 10, and so we'll see. Varying the characteristic period m from 1 month up to 6 months with n in the new and smaller range and the best performing lookback period still in the 1-to-12-month range produced Sharpe ratios that tended to be rather flat, varying little throughout the entire range of variation of m— except that the ratio for m=2 was about 20% higher than the ratios for 1 and 3, and there was a slow trailing off for m greater than 3 with the ratio not dropping below about 1.7 times that of the benchmark. Turning now to the Sector portfolio (cf. Figure 5), with n now in the range 14-to-17, the findings regarding the Sharpe ratio dependencies on the choice of the characteristic period m of the EMA are qualitatively similar to the International portfolio results. For m in the range of 1 to 6, the Sharpe ratio was above 1.4 times the benchmark except for m=6, with which it was 1.3 times the benchmark. At m=1 it was 1.7 times the benchmark and 20% higher than the ratio at m=2. This behavior is again rather favorable, due to the general lack of volatility, especially if m is kept away from the low end of the initial range, and due to the ratio staying substantially higher than that of the benchmark. And the peak Sharpe ratio occurred at almost the same m value as with the International portfolio. We'll keep in mind that variations as large as 20% have happened with these tests of the effect of varying m, the characteristic period of the EMA. They are bigger than other variations that were observed, and So for the next step, investigating the range limits of the lookback period, for the International portfolio we'll continue to use the 5-to-10 range for n, the maximum number of securities held; for the Sector portfolio we'll continue to use 14 to 17 as the range. 
And for both we'll now set the characteristic period of the EMA at— surprise— 5 months. It's not the utter optimum for either the International or the Sector portfolio with the lookback period range limits that we've started working with. But we're going to find that it works well, close to optimal, with the adjusted range limits that we're about to settle on next. (And we'll get to the Mostly-USA portfolio for all of these settings a bit further below.)
With those settings the lower range limit of the lookback period was first systematically varied upward with the upper limit fixed at 12; then the lower limit was fixed at 1 while the upper limit was first set to 1 and then systematically raised. We'll start with the International portfolio. The Sharpe ratio increased rather smoothly and continuously as the lower limit was raised from 1 to 4, the value at which the ratio peaked, and then declined just as evenly as the lower limit was further raised to 12 months. The final run with the lower and upper limits of the lookback period fixed at 12 months produced the lowest Sharpe ratio of the series but it was nonetheless 1.7 times that of the Rebalanced Benchmark. Then, as the upper limit was started at 1 and increased to 12, except for a dip of about 15% when the upper limit was at 3 months the Sharpe ratio was very flat, varying little from an average of about 1.5 times the benchmark except when the upper limit was 12 months. That produced a ratio of 1.8 times the benchmark. But what about increasing the upper range limit beyond 12 months? With 1 remaining the lower limit, there was a slight increase in the Sharpe ratio out to 14 months and a slow and steady decline started thereafter. Turning now to the Sector portfolio, with n in the range 14-to-17, as the lower lookback period range limit was increased from 1 to 12 months with the upper limit set at 12 months the Sharpe ratio followed exactly the same pattern as for the International portfolio: rising until reaching a peak when the lower limit was at 4 months and declining thereafter, steadily, and to a value of 1.6 times the benchmark when the lower limit was equal to the upper limit and was 12 months. 
Proceeding further, still with the Sector portfolio, with the upper range limit starting at 1 and then increasing to 14 months, the Sharpe ratio of the traded portfolio increased rather continuously from 1.2 times the benchmark to 1.5 times it— but for an anomalous jump when the upper limit was increased to 2 months, with which range the Sharpe ratio was 1.6 times the benchmark. So, in all, the tests of altering the lookback period range limits showed modest volatility of the Sharpe ratio with respect to the choice of limits, mostly steady changes, and no big advantage to altering the limits much from the 1-to-12-month range of the tests above. However, the indications are that using instead, say, 4-to-14 months may be a slightly better arrangement, especially since the higher lower limit of the lookback period range may tend to reduce the frequency of trading. So we'll adopt that range for the calculations for the final figure and table of this article, which appear on the next page.
To avoid tedium, the behavior of the Sharpe ratio with respect to various settings of the walkthrough-procedure-defining parameters for the Mostly-USA portfolio was not reported on above as for the other portfolios, but is described here in an abbreviated way. It suffices to say that the same 4-14 month range for the lookback period lp and the same 5-month setting for the characteristic period m of the trailing EMA look to be quite appropriate for this portfolio as well. The Sharpe ratio tendencies with respect to variations of the range limits and the characteristic period were much the same as for the International and Sector portfolios. Also, setting the lower limit of the range of n, the maximum number of securities held, to 5 produced a maximized Sharpe ratio.

## The Worst That Can Happen

In addition to implementing the walkthrough procedure, the p value calculations and taking the aforementioned care with the disposition of the walkthrough-procedure-defining parameters, is there anything else that we have done that might limit the risk that we take if we use this computerized portfolio management program? What's the worst that can happen? Well, a scheme that would have worked so very well in the past could in the future be found to work worse than buy-and-hold investing or being in cash. Yes, that can happen.

A more likely failure would be that the monthly trading that our program requires becomes merely ineffectual, that in time the null hypothesis rears its ugly, nihilistic head and turns the scheme into nothing better than a random number generator. With that scenario what should we then expect as our likely outcome? Well, with the scheme that we're analyzing we'd be holding securities in our account, long positions only, and from time to time some cash, meaning that we'd be less exposed to risk but also that we'd forfeit some of the gains, if any, that would follow from pure buy-and-hold investing. Since in the long run the equities markets go up, our long-term outcome would likely involve a smaller rate of return but at less volatility. That might well mean that our long-term Sharpe ratio would be similar to that of the buy-and-hold approach, which is not too bad an outcome (but we'd have to watch the transaction costs, which could be more problematic due to the reduced return, and it would have been quite regrettable to have done all of the trading for no improvement). Note especially that the sufferable prospect that has just been described would come about only because our momentum program is specifically limited to long positions only; if we were to also do short sales during down markets then if our program were to degrade into randomness the long-term expectation for returns would be about nil, the short positions proving to be losers more often than not.

## Other Costs of Trading

You may have also read this other article on this website on the subject of significant costs of trading that innately accrue if the trading is done in a stop-loss fashion: selling when the market declines and hoping to buy back at the right price on the way back up.
So it needs to be understood that while in this article we have neglected to account for commissions, other fees, slippage on executions and the bid-ask spread, we have

That is, our scheme has us selling when the market goes down and buying when it goes back up, at least vaguely like the examples of that other article on pure stop-loss trading. However, our mathematics automatically includes whatever costs of that kind accrue in their entirety, and where we have shown high Sharpe ratios we are handily beating them. It's not especially likely that the restriction that we have imposed with this article to trading only once a month assisted us in avoiding those other costs (the computer was programmed to trade at the close on the first trading day of every month if indicated). We normally expect the per-trade innate costs of stop-loss trading to be proportionately greater with a lower frequency of trading, so we cannot conclude that where we were successful it was made possible by our modest frequency of trading.

## Notes and Finalized Results

We've tackled a lot of important considerations in this article. In no particular order, here are some important ones to keep in mind.

- This article is about the general problem of how to test a dynamical portfolio management program to see whether it is reliable, that is, whether we can have confidence in the projected returns.
- Apart from the testing principles that are put forth, all of the findings herein depend on the fact that we have considered a particular type of momentum scheme— one simply based on price ratios over a trailing lookback period. With each variation tested here no security was held if its price ratio was less than the like ratio for cash compounded with interest. Only long positions were taken, no short sales. With all of the calculations interest was paid on cash held, at the going rate implied by the 13-week Treasury bill discount rate, and the returns that were used for the Sharpe ratio calculations were returns net of those on cash. Trades were conducted only at the close on the first trading day of each month but the computations of the positions that were to be assumed at the close were based on opening prices on the first trading day of each month.
- Other schemes would produce other results, so this article's findings don't represent the best that can be shown in support of the advisability of seeking good computerized ways of managing a portfolio of investments. The momentum approach was selected as an example because it has some popularity and some academic support. Another Retail Backtest program that is now in testing performs better, but this one has the advantage of a more modest frequency of trading, and it would have handily avoided the last two market plunges, the dot-com crash of 2000 and the Lehman Brothers/subprime-housing crisis of 2008.
- It was found that the relative strength approach, which involves limiting the positions held to only the very best performers on a trailing basis, should be applied in moderation and that it's possible to legitimately tailor the degree of moderation to the chosen list of eligible securities somewhat, subject to some limitations, in a way that is responsive to the character of the securities as a class (e.g., International, Sector, etc.). In this article moderation was imposed by increasing the lower limit of the permitted range of n values, where n is the maximum number of securities held. As with the other findings, this one is specific to the momentum scheme that was tested; with other programs relative strength may or may not generally work well if not applied in moderation.
- Should you make use of this program with your own investments? Investment advice is not offered here, just analyses with full disclosure of the results of hypothesis testing.
- Note that the walkthrough method that is employed here gives naïve projections of rosy returns a “haircut” (the blue trace of Figure 2 was superseded by the cyan trace). Neither that method nor the computation of the probability that the thus-shorn returns are in reality due to chance overtly deals with the problem of our having simply chosen a scheme of a particular structure, which is an ex post choice (and we hate those because they make us look like the hapless, clueless painting contractor of the first page of this article). To deal with that problem some have theorized about alternative hypothesis testing methods, near counterparts to which we could implement here by somehow coming up with a “universe” of other schemes having different structures that could hypothetically be tried during the walkthrough, just as we instead tried only our own scheme with different parameterizations. In theory that might administer a bigger haircut, but it's not a feasible approach, given that the number of such other schemes is truly infinite and given that it is hard to see how even partial use of such a procedure would avoid generating a “type-II” error as defined in the David Aronson quotation in the right-hand column of the previous page. In lieu of that, in this article the emphasis has been on running as true a simulation as possible of the performance of the program in the past.
- The imperfections of the simulation take the form of unavoidable static ex post settings of the n- and lp-parameter range limits and of the characteristic period m of the EMA that are used during the walkthrough to select the n and lp parameter values with the best trailing performance. Those static ex post settings of the walkthrough-procedure-defining parameter values are then tested for the universality of their applicability by investigating the sensitivity of the Sharpe ratio to them and the sensitivity of their optimal values to the makeup of the list of eligible securities. That testing is the basis of the second haircut, *which we will assume below to amount to a 20% reduction of the annual excess return*, the return in excess of that on cash. The haircut was applied in such a way as to not reduce the volatility of the cumulative returns. On this website, estimating the size of this second haircut and applying it is now called “suboptimization”.
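The arithmetic of the second haircut can be illustrated with a minimal Python sketch. It is not the program used for this article. It assumes that the haircut is applied by subtracting the same constant from every monthly excess return, and that the annual excess return is measured by the arithmetic mean of the monthly figures. Under those assumptions the mean drops by 20% while the volatility is untouched, so the Sharpe ratio also drops by 20%. The function names and the sample returns are hypothetical.

```python
import math
import statistics


def apply_haircut(excess_returns, haircut=0.20):
    """Shift every periodic excess return down by the same constant.

    The mean excess return falls by the fraction `haircut`, while the
    deviations from the mean, and hence the standard deviation, are unchanged.
    """
    mean = statistics.fmean(excess_returns)
    shift = haircut * mean
    return [r - shift for r in excess_returns]


def sharpe_ratio(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio from periodic returns that are already net of cash."""
    mean = statistics.fmean(excess_returns)
    stdev = statistics.stdev(excess_returns)
    return mean / stdev * math.sqrt(periods_per_year)


# Hypothetical monthly excess returns, for illustration only.
monthly_excess = [0.021, -0.013, 0.034, 0.008, -0.005, 0.017]
shorn = apply_haircut(monthly_excess, haircut=0.20)

print("Sharpe before haircut:", round(sharpe_ratio(monthly_excess), 3))
print("Sharpe after haircut: ", round(sharpe_ratio(shorn), 3))
```

Because only the mean is shifted, the ratio of the two printed Sharpe ratios is 0.8 for any series with a nonzero mean and nonzero volatility.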
## Charts and a Table

And now let's see how our results were affected by the adjustments of the previous page to the characteristic period m of the trailing exponential average and to the range limits for the lookback period lp and for n, the maximum number of securities held, and by administering that second haircut.
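As a reminder of what lp and n control, here is a minimal Python sketch of the selection rule described in the notes: rank securities by price ratio over a trailing lookback of lp periods, drop any whose ratio is less than the like ratio for cash, and hold at most n of the rest. It is an illustration only, not the program behind the results. The function name, the data layout and the tickers are hypothetical, and how the portfolio is divided among the selected securities is not specified here.

```python
def select_holdings(prices, cash_index, lp, n):
    """Pick at most n securities by trailing price ratio over lp periods.

    prices: dict mapping ticker -> list of prices, oldest first.
    cash_index: list of cash values compounded with interest, same dates.
    A security is dropped if its price ratio is less than that of cash.
    """
    if lp < 1 or n < 1:
        raise ValueError("lp and n must be positive")
    if len(cash_index) <= lp:
        raise ValueError("not enough cash history for this lookback")
    cash_ratio = cash_index[-1] / cash_index[-1 - lp]

    ratios = {}
    for ticker, series in prices.items():
        if len(series) <= lp:
            continue  # skip securities without enough history
        ratios[ticker] = series[-1] / series[-1 - lp]

    # Long positions only: keep securities that at least match cash.
    eligible = [t for t, r in ratios.items() if r >= cash_ratio]
    # Relative strength: strongest trailing performers first.
    eligible.sort(key=lambda t: ratios[t], reverse=True)
    return eligible[:n]


# Hypothetical data, for illustration only.
prices = {
    "AAA": [100.0, 102.0, 110.0],
    "BBB": [50.0, 49.0, 48.0],
    "CCC": [20.0, 21.0, 21.5],
}
cash_index = [1.000, 1.002, 1.004]

print(select_holdings(prices, cash_index, lp=2, n=2))
```

An empty result means that nothing beat cash over the lookback, so the portfolio would sit entirely in cash. Raising the lower limit of the n range, as was done for the finalized results, simply prevents the walkthrough from choosing very small values of n.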
The table above summarizes our results after finalization of the scheme. Note especially that all of the traded portfolios sport good Sharpe ratios that substantially exceed the benchmark. And the p values show hardly a hint of the null hypothesis.

It is mainly the fact that we set lower range limits for n, the maximum number of securities held, that were not low (we had used n=1 before) that caused our “complete walkthrough” results to improve, besting the Sharpe ratios of the benchmark and also of the walkthrough when no relative strength was used at all. In other words, moderation with regard to the application of relative strength seems to work better than either full use of it or utter abandonment of it. Other improvement happened as the natural result of adopting ex post lookback period range limits and a characteristic period for the trailing exponential average that were rather close to being simultaneously optimal for all three portfolios.

This research will be continued. Although the discourse above seems to be at least conceptually complete, in that every grim reality of hypothesis testing is dealt with in some way, there is clearly work yet to do if anyone wants to use the scheme to manage portfolios of substantially different composition, or if modification of the scheme is required to meet particular objectives. Interested parties are encouraged to write to see if they can be assisted.

Mike O'Connor

Comments or Questions: write to Mike. Your comment will not be made public unless you give permission. Corrections are appreciated.

Update Frequency: Infrequent, as this article is for the purpose of showing certain principles of portfolio management program testing in action; it's not intended to show the current effectiveness of the program or state of the market. See the momentum items on the Performance menu for the current program performance.