## The Kelly Criterion: Maximizing a Gambler's or Investor's Most-Likely Final Amount of Wealth

## A Case for Kelly

There is, for example, such a thing as a “listed stock option”, of which there are two types. For present purposes we could say that a stock option is a bet that the price of a particular stock will be either above or below a stated price, the “strike price”, by a given expiration date. If the option pays off when the stock finishes

Now there are some options that don't expire for, say, two years. And “warrants” are very similar to call options and at issuance they may be set to expire more than a decade hence. But the most liquid options contracts are those that expire within about three months, and most of the action in the options marketplace is with options whose strike prices are not very different from the current market price. That means that the odds of winning at least something or of getting nothing back for the premium on any given bet are usually something like 50-50, most often not outside of, say, 70-30 either way. Given the short times until expiration and those odds, an options “investor” could hypothetically make many, many such bets in a career, each of them posing a very substantial risk of getting back nothing for the premium.

We immediately see the problem. If some night at the casino you want to guarantee that you'll be retiring early you can just put everything on black at the roulette table and let it ride. You'll have lost all in at most a few spins of the wheel. The stock option investor can't very well “let it ride”, put up everything on the option contract each time and hope to survive. So how much should the investor be willing to pay out as premium each time? What fraction of his capital? Well, the famous “Kelly criterion” determines a formula for the optimum size of each bet in a given set of gaming circumstances with the goal being to maximize the growth rate of the accumulated wealth over the long run.
The purpose of this article isn't to develop some scheme for options trading; it is to explain the Kelly criterion. And to figure out what the Kelly criterion is all about is to understand that if the most-likely rate of growth of an investor's equity is to be maximized then the fraction of his equity that should be risked at any one time is often, especially in the circumstances of retail investors, considerably less than 1.0 . That is not what, say, mutual fund managers typically do. They are usually obliged to remain as close to 100% invested as possible and, as we shall see, there is some virtue in that. The concern about what fraction of equity to risk on a risky asset arises when simply holding stocks and even with the now-popular approach of investing in Exchange Traded Funds (ETFs). True, it's a much milder concern than is the case with options, and the concern is particularly mild with the ETFs because with such securities there is generally no chance of ever finishing absolutely out of the money, of suffering anything like a total loss of the amount put up. But the fact remains that the compounding of returns on investments that are not risk-free does not proceed in quite the same way as the compounding of risk-free investments and that untoward outcomes might be ameliorated by paying some attention to the mathematics of compounding. In part due to its pure emphasis and applicability only to outcomes after many, many trials, which generally means to long-term outcomes, the Kelly criterion is hardly in use by investment advisors and portfolio managers who allocate money to stocks, bonds, ETFs and the like. Not only are they held accountable for their performance annually, not over the long term, and not only do their clients have limited time horizons, but we'll also see that in order to effectively apply the criterion some statistics on the future performance of the securities must be rather well known in advance. 
And those statistics are never well known (possibly they are if the game is Blackjack, but not if it's stock market investing). Most such advisors and managers therefore rightfully disregard Kelly's observation entirely. In all, although professionals who conduct many, many transactions through the years might benefit by paying some heed to the Kelly criterion, the truth is that the criterion will

## Kelly's Criterion

It's not too much of a reach to refer to the purchasing of a stock option as a “bet”. Kelly was not a gambler, and he developed his formula while working on information theory for the improvement of electronic communications, but his published article did in fact demonstrate an application to gambling: parimutuel betting on horse races. So the Kelly criterion has also been applied to gambling and the greatest need for knowing about it is with regard to all such risky endeavors.

The Kelly criterion requires the computation of an “expectation value”. Where a quantity has a set of possible values the expectation value of that quantity is the arithmetic average of those possible outcomes, weighted in proportion to the theoretically-known likelihood of occurrence of each. In general, some particular value might be vastly more likely to occur than any others yet not be so much as close to, let alone equal to, the expectation value.

Let us consider that an investor begins with a given starting wealth and does nothing else with it but use it to repeatedly assume a position of some size in a given security. For example, the investor could at regular intervals (e.g., every week, month or year) adjust the amount committed to the security with the rest being held as cash. Or, a gambler could repeatedly bet on a particular game of chance.
And let us further assume that the individual persists through such a large number of “trials”— that's what the statisticians call them— as to, in effect, fully encounter the entire distribution of possible outcomes for each trial, possibly many times over. And for each trial we can compute an overall return ratio: the total amount of wealth at the end of the trial divided by the total at the beginning, which definition does not preclude the funds put at risk each time being by choice only a fraction of the available wealth. The Kelly criterion determines the fraction of the wealth at the beginning of each trial that must be committed each time in order to maximize the average rate of growth toward the final wealth amount over many trials. We will see that this maximization amounts to maximizing the expectation value of the Yes! The I am aware of this article by Samuelson and Merton. The latter had been Samuelson's student and later became a promoter and director of Long-Term Capital Management, which he helped “blow up”. The article is in part an attack on the use of the Kelly criterion as a potential cornerstone of portfolio management. Perhaps the authors' real concern is about guidance for retirement plan portfolios and the like; the authors are probably not talking about whether or not it could be OK for some venturous hedge fund to commit some portions of the assets of their “accredited investors” to strategies that are in some way modulated by the use of the Kelly criterion. That is to say that I have only skimmed the article and do not intend to finish reading it. Early on the authors seem to commit to imposing utility functions on investors so as to compel them to assert their own risk tolerance in particular ways. It would not be surprising to find that these authors merging the Kelly math with their own particular utility function math might produce untoward outcomes. 
I might have gotten further in the article had I not encountered, on the fourth page, a “thoughtful person” invoked as a component of the argument. But the authors do, commendably, in their second paragraph, admit to the gross failures of “mean-variance” models, which are today known as modern portfolio theory (MPT) and which are still foisted off on investors by many firms and advisors.

With schemes that fully implement the criterion come levels of risk that can be formidable, especially in the early going. That does not refer to the early going of your efforts to understand and correctly apply Kelly. Rather, with Kelly perfectly applied, account equity can dive towards zero before recovering. But you can moderate the risk by being less aggressive and accepting sub-optimal rates of growth. And we'll soon see how the mathematics of Kelly helps us with that decision.

When it comes to developing expectations for real-life circumstances such as actually trading in stocks or stock options, nothing can be done that is very accurate and so great care must be taken to assess whether or not the resultant trading scheme is likely to have any reliability to it at all. You have to do proper hypothesis testing, which happens to be the business of Retail Backtest.

“Wheels of Fortune” are depicted on this page. A truly random wheel of fortune game doesn't present any of the complications of securities and the theoretically-expected distribution that we need to know in order to implement the Kelly criterion is printed on its face. That's what we'll consider next.
Mike O'Connor is a physicist who now develops and tests computerized systems for optimizing portfolio performance.
Note: The wheels are used with the kind permission of Mr. Poundstone.

## The Book “Fortune's Formula”

It's by William Poundstone and it's about the Kelly criterion and the characters who gave it life. The title is from an article by Edward O. Thorp, a renowned mathematician and hedge fund manager who used the Kelly criterion in both gambling and investing with great success. It's a worthwhile book overall, a lively one, one for which this web article is no substitute (owing in part to its utter failure to reference gangsters and ponies). But I'd have to say that the book isn't quite going to suffice if you want to learn the mathematics of the Kelly criterion so as to be able to apply it to anything: the book is written so as to be readable by the general public; for full understanding integral and differential calculus is needed, albeit mainly just calculus of a single variable. Thorp's articles are the main place to go for the mathematics but you will find an introduction to the math, one that avoids most of the difficulties, on page 3 of this article under “A Bit of the Mathematics”.

This article is related in part to a particularly important section of the book, one in which the Kelly criterion is discussed in relation to three wheels of fortune: “The Trouble with Markowitz” section in Part Three, “Arbitrage”. There's one wheel for each of three penny stocks, each with its own possible outcomes. A spin of a wheel is taken to simulate the outcome of a $1 investment in a penny stock over a holding period of a year. The wheels are shown on the first page of this article and it may be helpful if you open that page in another tab or window of your browser for access as you read this page. The numbers on each wheel are the possible dollar values of your initial dollar investment in the penny stock at the end of the year. In the book the idea is to see which wheel is the best one, from the point of view of a Kelly investor versus one who does not heed the effects of compounding the returns of risky investments.
Those wheels of fortune fairly cry out for the JavaScript-powered widgets that I have provided. The widgets allow you to spin a wheel of your choice yourself, very rapidly and many times in succession— take that, Vanna White. Of course JavaScript must be at least temporarily enabled on your browser for any of it to work. All of the calculations are done on your own computer. While preparing the JavaScript I became puzzled by one key paragraph in that section of the book. It is in order to be able to effectively clarify the meaning that I have taken the liberty of re-using the very same wheels that Mr. Poundstone used (actually I have his kind permission). My understanding comes about in part from having actually applied the Kelly mathematics to the given wheels. Here's the paragraph in question:
Whether the player repeatedly bets an optimal amount as determined by the Kelly criterion or simply uses the let-it-ride approach, the geometric mean after n spins of the wheel is the positive real number which when multiplied by itself n times produces the ratio of the player's final wealth to his starting wealth. So a geometric mean of zero would mean that the player lost everything; the bigger the geometric mean the better; should a geometric mean of 1.00 ever happen that would mean that in the final analysis there was no change in the player's wealth notwithstanding the ups and downs along the way.

If we seek to determine the geometric mean that is characteristic of a particular wheel by experiments on it, rather than by reading the numbers that can come up off its face and making certain simple theoretical assumptions, then n has to be a very large number in order for the experimentally-determined geometric mean to nearly equal the theoretical value. If furthermore a let-it-ride policy is assumed then the experimentally-determined geometric mean will converge to the value for each wheel that is shown in the book.

The quoted paragraph of the book is basically true. It seems that the author had in mind the usual practice of stock market investors, which is in effect to let it ride, and is simply saying that if that is to be the policy then the general theory behind the Kelly principle immediately leads to the understanding that wheel #2, with its let-it-ride geometric mean of zero, should be utterly avoided. However, when the Kelly criterion is actually employed so as to adopt an optimal bet size the second wheel performs for the Kelly investor about as well as the third, with the return ratios having a decidedly non-zero geometric mean thanks to his having bet only a fraction of his wealth each time.
Certainly the second is not an utterly bad wheel notwithstanding zero being one of the outcomes and we could easily make it better than the third by tweaking the non-zero returns upward while its let-it-ride geometric mean would remain zero. It's all because the Kelly criterion compels the investor to

## We Spin the Poundstone Wheels

How do we see all of that about the second wheel? When the first page of this article loaded, all three graphs on the right above the Poundstone wheels (or at the bottom of that page if your screen is not of sizable width) were initiated using the possible payouts of wheel #2; otherwise you can simply click on the image of any wheel to initiate the graphs with the distribution of that particular wheel.

The first graph shows a single possible history of trading using the distribution of the chosen wheel. The “Spin the Same Wheel Again n Times” button does what it says and you should press it numerous times and whenever you please as that will allow you to see how wildly the outcomes can vary from one history to another. The option of a ridiculously long trading period of 300 years is offered that we might get a glimpse of the long-term trend which is otherwise almost indiscernible within the 30-year view due to the volatility of the outcomes and the fact that the frequency of the trials is only once per year. Click-dragging within any of the graphs so as to zoom in is sometimes very helpful; just double-click to zoom back out. The

We'll instead focus on wheels #2 and #3 on which we see numbers less than 1 that present losses. The Kelly approach comes into play only when losses are possible. For such wheels the conjoined second and third charts inform us that we should consider adopting a “Betting Fraction” from the horizontal axes of those charts, “f” in the common notation, that's greater than zero but less than or equal to the f that has the highest “Annual Geometric Mean” as shown on the second chart.
Why confine ourselves to that range of f values? Because outside of that range the geometric mean of the return is less while the risk as represented by the standard deviation is greater. The optimal betting fraction that maximizes the geometric mean is usually denoted in Kelly literature by f*. If you're not following the f business, if f is 0.5 then we keep half of our money as cash and bet the rest. Per se, fixed-fractional betting, always using the same f, was no invention of Kelly; it's old hat. However the basic Kelly idea does employ fixed fractions and there are theorems that support the use of fixed fractions in conjunction with awareness of the Kelly criterion. Back to our wheels #2 and #3, with other wheels, or stocks, fractions below or above the zero-to-one range of f might be feasible and would respectively represent selling the stock short or borrowing money to buy an excess of it. Given the “Average Payouts” of the Poundstone wheels, all of which exceed 1.00, none of them would show a long-term profit with short selling. And for wheels #2 and #3 it turns out that boosting your bet with borrowed money would be either ill-advised or catastrophic, but the story could be different with some other wheel such as #1 or even with a wheel that would occasionally present a loss. Let's look at the second chart in detail, with wheel #2 selected. We see a sort of inverted, lopsided horseshoe curve having a maximum at a betting fraction f of about f*=0.63, which yields a maximum geometric mean of 1.24— it helps to zoom in, even twice if you wish, in order to pick off the utter maximum. So to get the fastest rate of growth of our wealth we would bet 63% of our wealth on wheel #2 each time. Had you previously understood that there are investments that pay off when investing only a fraction of the funds that you have available but are sure losers if you simply commit nearly 100%? Read on! Wheel #2 is like that. 
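The chart-reading just described can also be imitated numerically. The sketch below is only an illustration: the six payouts are hypothetical (they are not the numbers on any Poundstone wheel), and it assumes the per-spin return ratio for fixed-fractional betting, 1 - f + f·R, that is derived on the mathematics page of this article. It tabulates the theoretical geometric mean over a grid of betting fractions and picks off the maximum, much as one would by zooming in on the second chart.

```python
import math

# Hypothetical six-outcome wheel (NOT one of the Poundstone wheels): each
# number is the end-of-year value of $1 staked, and each is equally likely.
HYPOTHETICAL_WHEEL = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]


def geometric_mean(payouts, f):
    """Theoretical per-spin geometric mean of the return ratio 1 - f + f*R."""
    ratios = [1.0 - f + f * r for r in payouts]
    if min(ratios) <= 0.0:
        # Simplification: any outcome that wipes out the bettor is treated
        # as driving the geometric mean to zero.
        return 0.0
    mean_log = sum(math.log(x) for x in ratios) / len(ratios)
    return math.exp(mean_log)


def kelly_fraction(payouts, steps=1000):
    """Scan f = 0, 1/steps, ..., 1 and return (best f, geometric mean there)."""
    grid = [i / steps for i in range(steps + 1)]
    best_f = max(grid, key=lambda f: geometric_mean(payouts, f))
    return best_f, geometric_mean(payouts, best_f)


f_star, g_star = kelly_fraction(HYPOTHETICAL_WHEEL)
print(f"f* is about {f_star:.3f}; geometric mean there is about {g_star:.3f}")
print("let-it-ride (f = 1):", geometric_mean(HYPOTHETICAL_WHEEL, 1.0))
```

Because this hypothetical wheel has a zero on its face, its let-it-ride geometric mean is zero even though a fractional bet grows, which is the same qualitative behavior described above for wheel #2.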
If we go off to the right, settling on a higher betting fraction f > f*, not only does our geometric mean deteriorate— at about f=0.96 it goes below 1.00 which means that beyond that we would be Now if we go off to the left of the maximum geometric mean with f < f* then things are qualitatively different. True we also have to settle for a reduced geometric mean, but the risk Still on wheel #2, let's examine the top chart, which is based on a single history of a succession of wheel outcomes over either 30 years or 300 years, your choice. On that chart are plotted two wealth histories that share that single wheel-outcome history— one for let-it-ride trading and one for Kelly-optimal-betting-fraction trading. If you spin the wheel several times you'll see that, oddly and rather inappropriately, the red plot for let-it-ride often stops abruptly at some year short of 30. About one out of every six times there's only a dot at the beginning. The cause of that is the fact that with wheel #2 the wealth of the let-it-ride investor often goes to zero but the logarithm of zero is minus infinity which can't be plotted on that chart because the vertical scale is logarithmic. Hence the chart usually fails to show a complete let-it-ride history. I much prefer to plot the wealth histories on a logarithmic scale to better show that they look somewhat like straight lines, which they should, at least over the 300-year span notwithstanding the volatility. Furthermore, changes of a given percentage are represented by the same vertical distance on a logarithmic chart, anywhere on the chart; not so on a linear scale. But we still need a fix. So, to get the fix you simply And so now the label will say “A History for And before we leave wheel #2, we can ask what amount should accumulate from the geometric mean of 1.24 with f at the Kelly optimal value f* that we previously found. 
The answer should be 1.24 We can now quickly go over wheel #3 as it produces qualitatively similar results when used with the optimal Kelly betting fraction, which for it is f*=0.75— only a bit bigger than the optimal fraction for wheel #2. But this time, since there is no chance of losing utterly everything that is put up on a single spin it would be at least possible to use borrowed money— all the way up to about f=1.5, at which point the geometric mean has declined to about 1.00, beyond which there would be losses. But as with wheel #2, fractional Kelly or full Kelly with 0 < f <= f* is the preferred range of bet sizes with nothing beyond f* ever being advisable. And especially note that the top chart confirms that use of the Kelly optimum generally beats let-it-ride and at less risk, with let-it-ride this time showing a profit. That you can easily see with repeated spins of wheel #3 on the 300-years scale. And finally, if we consult our second chart to see what the geometric means are for f=1.00, the let-it-ride case, for each of the Poundstone wheels, then we see that they all agree with the values that are given in the book. The book compares the Kelly emphasis on the geometric mean with the reliance of “mean-variance” analysis upon the arithmetic mean, with regard to assessing the relative attractiveness of the wheels. “Modern Portfolio Theory” (MPT) and specifically the “Capital Asset Pricing Model” (CAPM) are theories that are based upon mean-variance analysis. Inasmuch as they involve schemes that use diversification to maximize returns at given levels of risk they are intended to be applied to portfolios and not to single issues, and in a way that is very dependent upon correlations among the price performance histories of the individual issues. But no such correlations exist among the three wheels so that an uncompromised application of mean-variance analysis to them is not possible. 
Since MPT/CAPM practitioners manage portfolios none would ever plan to put all of the assets into a single security. Hence if there were a single security in one of their portfolios that had the possibility of becoming worthless that would not lead to the ruination of the portfolio. And if a security has a multi-period “average payout” substantially greater than 1, as with wheel #2, then it might actually be reasonable to include such a security in an MPT- or CAPM-managed portfolio in spite of it presenting the possibility of a total loss. The reasonability would follow from the fact that, whether or not the security were the likes of wheel #2, surely only a certain small fraction of the assets would be assigned to it: portfolios are generally policy-limited to a small fixed range of permissible position sizes to guard against the risk of any one issue going belly-up. So the circumstances of any one issue in such a portfolio differ little from what we have called fixed-fractional betting with the use of a very small fraction.

The Annual Geometric Mean plot, the second chart on page 1, passes through 1.00 at f = 0 (nothing bet, nothing gained or lost), and if you work it out the calculus shows that the slope there is the arithmetic average payout (the “mean” of mean-variance) minus 1, which is not influenced by the geometric mean at f = 1. Thus any such minimal successive exposures to the risks and rewards of securities that performed like wheel #2, whose average payout exceeds 1, would ultimately be profitable notwithstanding the zero geometric mean at f = 1.

The chief distinction then is that none of the mean-variance models make any allowance whatsoever for the mathematics of the subsequent and inevitable compounding. It's a dimension that they do not incorporate. Doesn't the theory of the Kelly criterion then however suffer in comparison with mean-variance analysis for its neglect of correlations within portfolios? Well, no, not really.
For example if there are p non-risk-free issues in a portfolio we could assign “betting fractions” \(\scriptstyle\text{f}_1,\, \text{f}_2\ldots\,\text{f}_p\), one to each security, where \(\scriptstyle\text{f}_1+ \text{f}_2\ldots\,+\,\text{f}_p =\,\)f and with the fraction 1 - f being committed to a risk-free security or cash. And then we could vary the \(\scriptstyle\text{f}_k\) so as to find the values that maximize the logarithm of the final wealth, just as we do for single issues with just one f.

Note that if we were talking about investing in a single security then the let-it-ride mode that we have discussed would actually be the same as “buy and hold” with 100% invested. Does that sound more familiar?

Let's now go on to understand how we calculate the dependence of our final wealth upon the betting fraction f.
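As a rough illustration of the several-fractions idea described above, here is a sketch with p = 2. Everything specific in it is an assumption made for the example: the two sets of payouts are hypothetical, the two "securities" are taken to be independent (so that all 36 joint outcomes are equally likely), borrowing and short selling are excluded, and the quantity maximized is the expectation value of the logarithm of the one-period return ratio, found by a coarse grid search rather than by calculus.

```python
import itertools
import math

# Two hypothetical, independent six-outcome "securities" (illustrative only).
WHEEL_A = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
WHEEL_B = [0.7, 0.8, 0.9, 1.2, 1.5, 2.0]


def expected_log_growth(f_a, f_b):
    """Expected log of the portfolio return ratio for fractions f_a and f_b.

    The remainder 1 - f_a - f_b is held as cash. Independence is assumed,
    so each of the 36 joint outcomes carries probability 1/36.
    """
    total = 0.0
    for r_a, r_b in itertools.product(WHEEL_A, WHEEL_B):
        ratio = 1.0 - f_a - f_b + f_a * r_a + f_b * r_b
        if ratio <= 0.0:
            return -math.inf  # a possible total loss rules this allocation out
        total += math.log(ratio)
    return total / (len(WHEEL_A) * len(WHEEL_B))


def best_allocation(step=0.05):
    """Grid search over non-negative fractions with f_a + f_b <= 1."""
    n = round(1.0 / step)
    candidates = [(i * step, j * step)
                  for i in range(n + 1)
                  for j in range(n + 1 - i)]
    return max(candidates, key=lambda pair: expected_log_growth(*pair))


f_a, f_b = best_allocation()
print("fractions:", f_a, f_b, "cash:", 1.0 - f_a - f_b)
print("expected log growth per period:", expected_log_growth(f_a, f_b))
```

With correlated securities the joint probabilities would no longer be a simple 1/36 each, and a real application would need estimates of that joint distribution; the grid search itself would be unchanged.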

## A Bit of the Mathematics

If you didn't immediately comprehend the expectation-value-of-the-logarithm business on page 1 of this article... you could be normal. It can be hard to find in readily available Kelly literature anything much that is properly instructive as to how the logarithm actually comes about. Various authors insist on bringing up utility functions and fail to make it clear that you're not entitled to a choice of utility functions, not if you want to maximize the rate of growth of your wealth; it's the logarithm, nothing else.

Here I'll try to fully explain the mathematics behind the Kelly criterion because it is, at base, rather simple. And it helps that the wheel-of-fortune setup with which we started is really rather generally applicable, such as to stocks or stock options or even to funds that hold them. Where Mr. Poundstone talked about penny stocks that had six equally-likely outcomes he also pointed out that for realism we could simply add more outcomes and repeat, as he did, the more likely outcomes.

## Compounding

By “\(\equiv\)” in the equation immediately below is meant “is defined to be”; \(X_i\) is the wealth of the investor after i spins of the wheel; \(X_0\) is the starting wealth; \(X_n\) is the final wealth if there are n spins in all. The numerator of each fraction is canceled by the denominator of the next fraction, but no numerator can be zero, else we must terminate the sequence right then and there with the investor utterly broke.
\begin{aligned}
(\text{geometric sample mean})^n&\equiv\frac{X_n}{X_0}\\
&=\frac{X_1}{X_0}\cdot\frac{X_2}{X_1}\ldots\,\cdot\, \frac{X_n}{X_{n-1}}
\end{aligned}
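The telescoping product in the equation above is easy to check with a few numbers. In this small sketch the wealth history is hypothetical, chosen only to make the arithmetic visible; it computes each ratio \(X_i/X_{i-1}\), multiplies them together, and takes the n-th root to get the geometric sample mean.

```python
import math

# Hypothetical wealth history X_0, X_1, ..., X_n (illustrative dollar amounts).
wealth = [100.0, 150.0, 75.0, 120.0]

# Per-trial return ratios X_i / X_{i-1}.
ratios = [after / before for before, after in zip(wealth, wealth[1:])]

# The product of the ratios telescopes to X_n / X_0.
overall = math.prod(ratios)

# Geometric sample mean: the n-th root of X_n / X_0.
n = len(ratios)
geometric_sample_mean = overall ** (1.0 / n)

print("ratios:", ratios)
print("product of ratios:", overall, " X_n / X_0:", wealth[-1] / wealth[0])
print("geometric sample mean:", geometric_sample_mean)
```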
To explain the Kelly criterion we won't have to immediately focus on the geometric mean; we're mainly concerned with the composition of the ratio \(\frac{X_n}{X_0}\). We'll get back to the geometric mean a bit later as it's fairly often mentioned in Kelly literature, such as in the Poundstone book.

## Letting it Ride

Given the sequence of payout numbers \(r_1, r_2\ldots , r_n\) that are the result of n sequential spins of the wheel and are therefore random choices of the numbers \(R_1, R_2\ldots , R_6\) that are printed on the wheel, then in the equation above with let-it-ride betting we must set \(\frac{X_i}{X_{i-1}}=r_i\,\). If any of the \(r_i\)'s turns up zero then the sequence ends and the investor is broke. Although the discussion here continues to refer to the wheels-of-fortune examples, the simple mathematics of this page has

## Fixed-Fractional Betting

Given the same sequence of payout numbers \(r_1, r_2\ldots , r_n\) from the face of the spun wheel then with fixed-fractional betting we would set \(\frac{X_i}{X_{i-1}}=1-\text{f}+\text{f}\cdot r_i\,\): the fraction 1 - f that is held as cash is unchanged while the fraction f that is bet is multiplied by the payout.

We see immediately that if f = 1 then the ratios for fractional betting reduce, as they should, to the ratios for let-it-ride betting. But if f < 1 then if any \(r_i\) is zero \(\frac{X_i}{X_{i-1}}\) will simply be 1 - f, which will be greater than zero. In that way the investor can be prevented from ever going entirely broke. Of course we are not dealing here with any real-world annoyances such as transaction costs, taxes or dividends, much less policies affecting the use of margin that are in effect at brokerages. But we can see that if f is

Fixed-fractional betting is not Kelly betting per se. Of course everyone always knew, before Kelly came along, that you could bet only a fraction of your wealth if you wished and avoid sudden utter ruin that way.

## The Kelly Optimum Betting Fraction

Here we are actually going to avoid integral and differential calculus and just use some rules involving exponentiation and natural logarithms.
So if you have some mathematical inclinations you should be able to follow even if you don't know calculus— we'll just apply the rules. If \(y\) is a positive number then \(\text{log}(y)\) increases as \(y\) increases but not as fast. In fact it has a downwardly concave appearance when plotted as the vertical coordinate with \(y\) the horizontal coordinate, and that concave aspect is crucial for the fulfillment of the Kelly criterion. The logarithm is defined only for positive \(y\) because the value of the logarithm plunges towards negative infinity as \(y\) approaches zero from above; the logarithm of one is zero; \(\text{log}(y)\) is that power of Euler's number \(e=2.718\ldots\,\) that yields \(y\). So \(y = e^{\text{log}(y)}\). Now if we have With that definition and the rule about products we go to work on our first equation above, the one for the all-important ratio of final wealth to starting wealth. We find the following:
\begin{align}
\frac{X_n}{X_0}&= e^{ \text{log}\left(\frac{X_n}{X_0}\right) }\\
\text{log}\left(\frac{X_n}{X_0}\right) &= \text{log}\left(\frac{X_1}{X_0}\right)+\text{log}\left(\frac{X_2}{X_1}\right)\ldots +\text{log}\left(\frac{X_n}{X_{n-1}}\right)
\end{align}
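The identity above, that the logarithm of the overall ratio is the sum of the logarithms of the per-spin ratios, can be checked on a simulated history. This is only a sketch under stated assumptions: the wheel payouts, the betting fraction f = 0.5, the 30 spins and the random seed are all hypothetical choices, and each per-spin ratio is taken to be 1 - f + f·r as in fixed-fractional betting.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is repeatable

HYPOTHETICAL_WHEEL = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]  # illustrative payouts only
f = 0.5        # betting fraction: half staked, half held as cash
wealth = 1.0   # X_0, so that wealth itself is the ratio X_n / X_0
log_sum = 0.0  # running sum of log(X_i / X_{i-1})

for _ in range(30):  # 30 spins of the wheel
    r = random.choice(HYPOTHETICAL_WHEEL)
    ratio = 1.0 - f + f * r  # X_i / X_{i-1} with fixed-fractional betting
    wealth *= ratio
    log_sum += math.log(ratio)

# Both numbers are X_n / X_0, reached by the product and by the sum of logs.
print(wealth, math.exp(log_sum))
```

Since f is less than 1 here, even a spin that lands on zero leaves the ratio at 1 - f = 0.5, so the logarithm is always defined.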
We now focus on that expansion, on the sum of logarithms. Each term can take on only one of six values, each based on a random choice \(R_j\) of the six \(R\)'s from the face of the wheel: $$\text{log}\left(\frac{X_i}{X_{i-1}}\right)= \text{log}\left(1-\text{f} + \text{f}\cdot R_j\right)$$And now comes the easy but profound step... how many are there in the expansion representing each of the \(R_j\) values? We
\begin{align}
\text{log}\left(\frac{X_n}{X_0}\right) &\approx \text{log}\left(\frac{Xp_n}{X_0}\right)\\
&= n\cdot \biggl[\frac{1}{6}\cdot\text{log}\left(1-\text{f} + \text{f}\cdot R_1\right)\\
&\qquad+\frac{1}{6}\cdot\text{log}\left(1-\text{f} + \text{f}\cdot R_2\right)+\ldots\\
&\qquad+\frac{1}{6}\cdot\text{log}\left(1-\text{f} + \text{f}\cdot R_6\right) \biggr]
\end{align}
The notation \(\frac{Xp_n}{X_0}\) with the \(p\) added to the \(X\) has been used to indicate the use of the probability distribution that is defined by the numbers on the face of the wheel, and the n has been factored out on the right-hand side. And we recognize the quantity in the square brackets []. It's the expectation value of the log terms, taken over the distribution of the face of the wheel, the “theoretically-expected” distribution.

And if we take the logarithm of both sides of the very first equation at the top of this page then the rule about the logarithm of a product being the sum of the logarithms of each term yields the following:

\begin{aligned}
n\cdot\text{log}(\text{geometric sample mean})&=\text{log}\left(\frac{X_n}{X_0}\right)\\
n\cdot\text{log}(\text{geometric mean})&=\text{log}\left(\frac{Xp_n}{X_0}\right)
\end{aligned}

Then comparing that with the equation above it we see that the quantity in square brackets, the expectation value of the logarithm of the return ratio, is also our best estimate of the logarithm of the geometric mean of the distribution of \(\frac{X_n}{X_0}\). We're supposing here that as n goes to infinity the value of \(\text{log}\left(\frac{X_n}{X_0}\right)\) approaches n times the expectation value of those six log terms.

That \(\text{log}\left(\frac{X_n}{X_0}\right)\) is, for very large n, approximately n times the expectation value, the sum of the terms inside the square brackets, means that the square-bracketed terms represent the rate of growth of \(\text{log}\left(\frac{X_n}{X_0}\right)\) with respect to the number of trials n. So as we maximize that sum by a suitable choice of f we are maximizing the rate of growth of \(\frac{X_n}{X_0}\).

We have to stop right here to celebrate the fact that we're essentially done. We have the answer. We only need to compute and sum the terms inside the square brackets and find the f that yields the maximum value for that sum.
The f that maximizes that sum is the Kelly fraction f*. We could find it using a computer, simply varying f over a wide range and picking the f that produces the maximum value of the square-brackets sum.

## The Central Limit Theorem

We now need to discuss a most important theorem. We are only using the simple, classical version of it. It doesn't matter how the log terms of the theoretically-expected distribution are distributed as numbers. They could be skewed to one side or the other of their average. The theorem says in part, along with the law of large numbers, that the expectation value that we have computed (the sum inside the square brackets, which is the logarithm of the geometric mean), when multiplied by n as above, is the best estimator of the mode (most probable), the median (mid-percentile) and the mean (average) of the distribution of all of the \(\text{log}\left(\frac{X_n}{X_0}\right)\) values that might actually happen. That is to say that the mode, median and mean are the same number. And the bigger the n, the better the estimate.
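A small simulation can make the claim concrete. The sketch below is an illustration only: it assumes a hypothetical six-face wheel (the payoffs are invented for the example, not taken from this article), a fixed betting fraction f, and a per-spin wealth multiplier of 1 + f times the payoff. It plays many independent runs of n spins and compares the sample mean and median of \(\text{log}\left(\frac{X_n}{X_0}\right)\) with n times the expectation value.

```python
import math
import random
import statistics

# Hypothetical six-face wheel (illustrative net payoffs per unit staked).
PAYOFFS = [-1.0, -1.0, -1.0, 1.0, 1.0, 2.0]


def simulate_log_wealth(f, n, rng):
    """Return log(X_n / X_0) for one run of n spins at betting fraction f."""
    return sum(math.log(1.0 + f * r) for r in rng.choices(PAYOFFS, k=n))


def theoretical_growth(f):
    """Expectation of the per-spin log return over the wheel's faces."""
    return sum(math.log(1.0 + f * r) for r in PAYOFFS) / len(PAYOFFS)


rng = random.Random(12345)  # fixed seed so the run is repeatable
f, n, runs = 0.10, 400, 2000
samples = [simulate_log_wealth(f, n, rng) for _ in range(runs)]

predicted = n * theoretical_growth(f)
sample_mean = statistics.mean(samples)
sample_median = statistics.median(samples)
print(f"n*E[log]: {predicted:.3f}  mean: {sample_mean:.3f}  "
      f"median: {sample_median:.3f}")
```

The chosen f = 0.10 is deliberately not the maximizing fraction; the agreement between the three numbers does not depend on f being optimal.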
In our particular context, this is true whatever the value of f that we choose to use, whether it be the f

We should be clear here that while our results and the theorem only pertain to large-n circumstances, and so there is a subtext concerning the greater precision that happens as n is further increased, “the distribution that consists of all of the \(\text{log}\left(\frac{X_n}{X_0}\right)\) values that might actually happen” does

The equivalence of the mean, mode and median of the \(\text{log}\left(\frac{X_n}{X_0}\right)\) values is guaranteed because the theorem also states that in the large-n limit the distribution of the \(\text{log}\left(\frac{X_n}{X_0}\right)\) values becomes the famous “bell curve”: when the probability density, the likelihood of particular outcomes, is plotted against \(\text{log}\left(\frac{X_n}{X_0}\right)\), the shape is like that of a bell that is utterly symmetric, with its peak at the value of the \(\text{log}\left(\frac{Xp_n}{X_0}\right)\) expression that we have just computed, which is, because of that symmetry, at once the mean, the mode and the median.

Now we might indeed prefer, instead of the mode/median/mean of the distribution of the \(\text{log}\left(\frac{X_n}{X_0}\right)\) values, to get the mode of the \(\frac{X_n}{X_0}\) values. Of course: we want our most likely final dollar amount, not some statistic on the logarithms of the possible dollar amounts. To get the mode of the \(\frac{X_n}{X_0}\) values you have to subtract the variance of the \(\text{log}\left(\frac{X_n}{X_0}\right)\) values from their mode/median/mean and then exponentiate the result. In symbols, if the log values are normally distributed with mean \(\mu\) and variance \(\sigma^2\), then the \(\frac{X_n}{X_0}\) values are lognormally distributed with mode \(e^{\mu-\sigma^2}\), whereas \(e^{\mu}\) is their median. So the most likely final \(\frac{X_n}{X_0}\) value is less than the \(\frac{X_n}{X_0}\) of the most likely \(\text{log}\left(\frac{X_n}{X_0}\right)\) value.

## Kelly on Kelly

It's simple to say what Kelly did but you may prefer to read his article.
Unless you're already profoundly committed to horse racing you'll find the section “The Gambler With a Private Wire” to be of first importance because it does not involve the more complicated case of parimutuel betting. (Where he wrote “G

## Other Reading

There is a blog post on the subject of the degree to which the most-likely outcome, the mode of the \(\frac{X_n}{X_0}\) values, can fall short of the most likely \(\text{log}\left(\frac{X_n}{X_0}\right)\) value which the Kelly criterion maximizes. The simple formula for computing the mode of the \(\frac{X_n}{X_0}\) values is given here and also here. This matter is really unrelated to the Kelly criterion. That is, the facts concerning the lognormal distribution are applicable whatever the choice of the betting fraction f.

Mike O'Connor

Comments or Questions: write to Mike. Your comment will not be made public unless you give permission. Corrections are appreciated.

Update Frequency: Infrequent, as this article is about the principle of the Kelly criterion and not about the current state of the market.