# Forecasting the California Governor Primary

The California Governor Primary yesterday, June 5th, drew great interest when John H. Cox finished second behind Gavin Newsom, and by a much narrower margin than many expected. California has been heavily dominated by the Democratic Party for a number of years. Under the “jungle primary” system, the top two vote getters face each other in the general election regardless of party, which creates the possibility that two candidates from the same party compete against each other. So, for Republicans, the candidacy of Cox represents a rare opportunity to field a competitive candidate for Governor of California.

**Misunderstanding the Margin of Error (MOE)**

Pre-election polls are often viewed with skepticism by the voting public and political candidates. Polls taken before elections often have a high percentage of “undecideds,” or voters who, according to pollsters, have not decided who they are going to vote for. The large number of undecided voters, combined with a fundamental misunderstanding of the Margin of Error (MOE) and its relationship to the bell curve of statistical error, creates widespread confusion about the meaning of any given poll. Let’s take a look at those confusing elements below using the California Governor primary.

**YouGov California Governor Primary Poll**

The YouGov poll, taken from 5–12–18 to 5–24–18, had the following results with a +/- 4% MOE.

Gavin Newsom — 33%

John Cox — 17%

Travis Allen — 10%

Antonio Villaraigosa — 9%

John Chiang — 8%

**Accounting for the Margin of Error (MOE)**

Looking at the above numbers, it would be easy to conclude that Newsom was well ahead of Cox. Subtracting 17% from 33%, you might think that Newsom has a 16-point lead over Cox. However, it’s not that simple. Why? The Margin of Error. The YouGov poll had a MOE of +/- 4%. Considering the MOE, we need to widen the poll numbers above into ranges that reflect what the poll is actually saying.

Newsom — 29–37%

Cox — 13–21%

Allen — 6–14%

Villaraigosa — 5–13%

Chiang — 4–12%
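The arithmetic behind these ranges is simple enough to sketch in a few lines of Python. The dictionary and function names below are illustrative only; the poll figures and the 4-point MOE are the YouGov numbers quoted above.

```python
# YouGov topline numbers (percent) and the poll's stated margin of error.
POLL = {
    "Newsom": 33,
    "Cox": 17,
    "Allen": 10,
    "Villaraigosa": 9,
    "Chiang": 8,
}
MOE = 4  # percentage points, as reported by the poll


def moe_range(share, moe):
    """Return the (low, high) band implied by a poll share and its MOE."""
    return (share - moe, share + moe)


moe_ranges = {name: moe_range(share, MOE) for name, share in POLL.items()}

for name, (low, high) in moe_ranges.items():
    print(f"{name}: {low}-{high}%")

# Worst case for Newsom against the best case for Cox within the MOE bands.
smallest_gap = moe_ranges["Newsom"][0] - moe_ranges["Cox"][1]
print(f"Smallest Newsom-Cox gap inside the bands: {smallest_gap} points")
```

Read this way, the apparent 16-point lead could be as small as 8 points (29% against 21%) without either candidate leaving his MOE band.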

**The Bell Curve**

Even after we account for the poll MOE, we still need to consider the confidence level behind it. Typically a poll is reported at a 95% or 99% level of confidence, and the size of the MOE at that level depends on the sample size. Most published polls use a 95% level of confidence. That means that, in theory, 95% of the time the results (let’s just say if the election were held the day the poll ends) would fall within the range defined by the MOE. For example, the YouGov poll indicates Newsom was expected to receive 29–37% of the votes cast, at a 95% level of confidence. Even so, 5% of the time the percentage will fall outside that range, out in the tails of the bell curve. And that holds only if the poll is properly constructed, reflects the correct balance between Republicans, Democrats, independents and others, no major event causes voters to switch to another candidate, and nothing else you can imagine goes wrong.
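For readers who want to see where a figure like +/- 4% comes from, here is a minimal sketch using the standard textbook formula for the margin of error of a simple random sample, z * sqrt(p * (1 - p) / n), with z = 1.96 for 95% confidence. The sample size of 600 is a hypothetical value chosen for illustration, not YouGov's actual sample size, and real polls usually apply weighting adjustments that this sketch ignores.

```python
import math

Z_95 = 1.96  # z-score covering the middle 95% of a normal (bell) curve


def margin_of_error(sample_size, share=0.5, z=Z_95):
    """Textbook MOE for a simple random sample.

    share=0.5 is the worst case and is the value usually used when a
    single MOE is reported for a whole poll.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


# Hypothetical sample size, for illustration only.
moe = margin_of_error(600)
print(f"MOE for n=600: +/- {moe * 100:.1f} points")
```

With these assumed inputs the margin comes out at roughly 4 points. Quadrupling the sample size would only cut it in half, which is one reason published MOEs rarely get much smaller than 3 points.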

**Murrey not Murphy**

Murphy’s Law states that “Anything that can go wrong will go wrong.” Certainly there are times with polling and political campaigns that Murphy’s Law seems to be in full effect. In order to prevent that pesky Murphy from visiting and messing with the results of the polls it helps to think in terms of poll numbers as a range rather than a single number or even a single number modified by the MOE. An excellent way to determine a probable range based on poll numbers and MOE is to utilize Murrey Math. I first ran across Murrey Math in connection with investing. Murrey Math essentially allows you to see areas of support and resistance in relationship to price — making it ideal for traders who make decisions to buy or sell based on numbers, aka price.

Murrey Math also translates nicely to polls which, because they are just numbers, behave in exactly the same manner as prices do. By layering on numbers indicated by Murrey Math onto our previous polling numbers and MOE, we take the first step towards creating our Probable Ranges. Probable Ranges help clarify probable percentage ranges for each candidate. Why would we want to do that? Let’s consider this. The YouGov poll shows Newsom at 33% and Cox at 17%. It seems unlikely that such exact numbers could possibly be correct. As a first step to correct that we accounted for the MOE and came up with some initial ranges — Newsom 29–37% and Cox 13–21%. That’s better, but is that all we can do? Is that the best that we can do?

**Initial Probable Ranges Based on Murrey Math**

Below are the YouGov poll numbers, adjusted for MOE, along with the initial probable ranges based on Murrey Math.

Newsom — 29–37% ==> 28.125–37.5%

Cox — 13–21% ==> 12.5–21.875%

Allen — 6–14% ==> 4.6875–14.0625%

Villaraigosa — 4–13% ==> 3.125–14.0625%

Chiang — 4–12% ==> 3.125–12.5%
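The figures on the right-hand side are all multiples of 1.5625, which is 100 divided by 64 (an eighth of an eighth, the kind of subdivision Murrey Math works with). One reading that reproduces every row above is to widen each MOE-adjusted range outward to the nearest such grid line. The sketch below assumes that rule. The names `GRID` and `snap_outward` are illustrative, and the rule itself is an inference from the listed numbers rather than a confirmed description of the calculation.

```python
import math

GRID = 100 / 64  # 1.5625, the assumed grid spacing


def snap_outward(low, high, grid=GRID):
    """Widen (low, high) to the nearest grid lines at or beyond each end."""
    return (math.floor(low / grid) * grid, math.ceil(high / grid) * grid)


# MOE-adjusted ranges exactly as listed on the left-hand side above.
moe_ranges = {
    "Newsom": (29, 37),
    "Cox": (13, 21),
    "Allen": (6, 14),
    "Villaraigosa": (4, 13),
    "Chiang": (4, 12),
}

initial_ranges = {name: snap_outward(*r) for name, r in moe_ranges.items()}

for name, (low, high) in initial_ranges.items():
    print(f"{name}: {low}-{high}%")
```

Under this assumption, Newsom's 29–37% becomes 28.125–37.5% because 28.125 is the last grid line at or below 29 and 37.5 is the first at or above 37.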

At first glance, this may seem less useful. We gravitate towards the seemingly clean, clear and precise poll numbers. The above ranges are much wider. Newsom has a probable range of 28.125–37.5%, but Chiang’s probable range of 3.125–12.5% is very wide in relative terms: 12.5% is four times as large as 3.125%. Despite this and other perfectly reasonable objections, the above probable ranges have some commendable properties. First off, they help us order the candidates from first to fifth place. By putting the numbers in terms of probable ranges we begin to see which outcomes are probable and which aren’t. Based on the above ranges we can arrange the candidates’ probable finishes.

Newsom — First

Cox — Second, Third or Fourth

Allen — Third or Fourth

Villaraigosa — Third or Fourth

Chiang — Third, Fourth or Fifth (and perhaps an outside chance at second)
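A quick way to check which placements are even in play is to test which probable ranges overlap. This is only a sketch of that idea using the initial probable ranges listed earlier. The helper name `overlaps` is illustrative, and touching endpoints are treated as overlapping.

```python
from itertools import combinations

# Initial probable ranges from the Murrey Math step (percent).
RANGES = {
    "Newsom": (28.125, 37.5),
    "Cox": (12.5, 21.875),
    "Allen": (4.6875, 14.0625),
    "Villaraigosa": (3.125, 14.0625),
    "Chiang": (3.125, 12.5),
}


def overlaps(a, b):
    """True when two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


overlapping_pairs = [
    (x, y) for x, y in combinations(RANGES, 2) if overlaps(RANGES[x], RANGES[y])
]

for x, y in overlapping_pairs:
    print(f"{x} and {y} could swap places")
```

Newsom's range overlaps no one else's, which is why first place looks settled. Cox and Chiang touch only at 12.5%, which is the outside chance at second place noted for Chiang.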

The above is useful, but limited. And remember, we haven’t yet considered how the undecided voters will shake out.

**The Undecided Voters and Herd Mentality**

Some voters just can’t seem to make up their minds about who they want to vote for. Reporters and many observers make the mistake of assuming that a large percentage of undecided voters means that any candidate can win the race. If we viewed the campaign through the lens of possibility, that might be true. But “anything is possible” isn’t very useful to me in attempting to use Probable Ranges to determine who will win a given campaign. It’s definitely of no use to a candidate or their campaign attempting to determine how to use valuable resources to target voters and win on election day. Humans often, though not always, tend to go along with the crowd. Some call it the herd mentality. Others might say that people like to back the winner. We see it in sports quite often when fans, whether it’s the Super Bowl or NBA Championship, root for (or against) the team that has the best odds of winning. Voters often do the same thing.

**Perceived Winner Takes Most of the Undecided Vote**

While nothing is 100% guaranteed in life or elections, most of the time the candidate who leads in the polls at a certain point (for presidential elections this is typically Labor Day) wins the vast majority of the undecided vote. Like sports fans, voters like to be on the winning team. This isn’t always true, but it’s true far more often than not. It breaks down in a very definite split, but I can’t reveal all my secrets, can I?

**A Few Other Clues**

- The undecided vote isn’t what the polls say it is;
- I calculate the undecided vote based on a modification of the Murrey Math based Probable Ranges;
- Not all of the candidates have high enough percentages to calculate a Probable Range because their poll numbers are so low that if you subtract the MOE from their numbers you get negative numbers. You need to consider these candidates when calculating the percentage of undecided voters. Hint: They’re still going to get votes;
- Sometimes, for reasons you can’t anticipate, undecided voters will flock to the candidate who doesn’t lead the polls. You need to account for that possibility. I do that by what I call the Reverse Adjusted Probable Range.
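As a starting point for the first and third clues, it helps to see how much of the electorate the five leading candidates leave unaccounted for in the YouGov poll. The sketch below only does that subtraction. It is not the undecided calculation hinted at in the clues, which is not spelled out here.

```python
# YouGov topline numbers for the five leading candidates (percent).
POLL = {"Newsom": 33, "Cox": 17, "Allen": 10, "Villaraigosa": 9, "Chiang": 8}

named_total = sum(POLL.values())
# Everyone else: undecided voters plus supporters of lower-polling candidates.
unallocated = 100 - named_total

print(f"Five leading candidates: {named_total}%")
print(f"Undecided or backing someone else: {unallocated}%")
```

The remaining 23% is not all undecided, since candidates polling too low to get a Probable Range will still receive votes.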

**Final Probable Range California Governor Primary Election**

1. Newsom — 39.0625–40.625%

2. Cox — 21.875–23.4375%

3. Villaraigosa — 12.5–14.0625%

4. Allen — 10.9375–12.5%

5. Chiang — 7.8125–9.375%

Interestingly, the 12.5% top of Allen’s range and the 12.5% bottom of Villaraigosa’s range indicated that Allen had an outside chance of outperforming Villaraigosa on election day.

Note that the above numbers were based not just on the YouGov poll, but on the following polls:

- Berkeley IGS — 5–22–18 to 5–28–18 — MOE +/- 3.5%
- Emerson — 5–21–18 to 5–24–18 — MOE +/- 4.2%
- SurveyUSA — 5–21–18 — MOE +/- 6.1%
- YouGov — 5–12–18 to 5–24–18 — MOE +/- 4%
- PPIC — 5–11–18 to 5–20–18 — MOE +/- 4.1%

**Alternate Probable Range California Governor Primary Election**

As stated earlier, there is always the possibility — even though it’s not extremely common — that for whatever reason the candidate leading in the polls before the election doesn’t receive the vast majority of the undecided vote. For that reason, I also calculated an Alternate Probable Range which affected the Newsom and Cox ranges and not those of the other candidates. That resulted in the following:

- Newsom — 32.8125–34.375%
- Cox — 28.125–29.6875%
- Villaraigosa — 12.5–14.0625%
- Allen — 10.9375–12.5%
- Chiang — 7.8125–9.375%

**Actual Results of California Governor Primary 6–5–18:**

Finally, after all of the calculations and anticipation, we arrive at the actual election day results which were as follows:

- Newsom — 33.4%
- Cox — 26.2%
- Villaraigosa — 13.4%
- Allen — 9.7%
- Chiang — 9%

Despite all of the polling and my use of both the Probable Range and the Alternate (or Reverse) Probable Range, the actual numbers did not quite fit either set of ranges. Newsom’s winning percentage fell within the Alternate Probable Range (the less likely scenario), while Cox’s percentage ended up higher than the Probable Range but lower than the Alternate Probable Range. Villaraigosa performed within his expected Probable Range. Allen did slightly worse than the Probable Range anticipated, and Chiang’s percentage fell within what was expected. **Notably, the order of finish for the top 5 candidates was what the Probable Range forecasted.**
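The comparison can be checked mechanically. The sketch below takes the Final and Alternate Probable Ranges and the actual results exactly as listed in this article and reports whether each result landed below, within, or above each range. The function and variable names are illustrative.

```python
FINAL = {
    "Newsom": (39.0625, 40.625),
    "Cox": (21.875, 23.4375),
    "Villaraigosa": (12.5, 14.0625),
    "Allen": (10.9375, 12.5),
    "Chiang": (7.8125, 9.375),
}
# The alternate scenario changes only the Newsom and Cox ranges.
ALTERNATE = dict(FINAL, Newsom=(32.8125, 34.375), Cox=(28.125, 29.6875))
ACTUAL = {
    "Newsom": 33.4,
    "Cox": 26.2,
    "Villaraigosa": 13.4,
    "Allen": 9.7,
    "Chiang": 9.0,
}


def place(value, bounds):
    """Say whether value falls below, within, or above a (low, high) range."""
    low, high = bounds
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


verdicts = {
    name: (place(result, FINAL[name]), place(result, ALTERNATE[name]))
    for name, result in ACTUAL.items()
}
actual_order = sorted(ACTUAL, key=ACTUAL.get, reverse=True)

for name, (final, alternate) in verdicts.items():
    print(f"{name}: {final} final range, {alternate} alternate range")
print("Order of finish matches forecast:", actual_order == list(FINAL))
```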

**A Note About Significant Digits and Rounding**

Because someone is probably going to think it, comment about it or say it, let me cut you off at the pass and add a note about significant digits and rounding. Yes, I realize that a percentage like 14.0625% screams “Round me!” I’m not going to do it. Why? Because I have calculated these numbers based on Murrey Math, and these numbers reflect the way Murrey Math would represent those numbers. I prefer to show my work and give a small clue to those who care to do the research into how I came up with these numbers. I prefer openness to attempting to obscure my methods. That doesn’t mean I’m going to reveal all of my calculations either. But, personally, I’d rather leave a few breadcrumbs for those who wish to follow them where they lead than hide the calculations by rounding numbers.
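One breadcrumb is visible in the unrounded figures themselves: every endpoint in the Final and Alternate Probable Ranges is a whole multiple of 1.5625, which is 100 divided by 64. The short check below illustrates that observation. It describes a pattern in the published figures only and makes no claim about the underlying calculation.

```python
STEP = 100 / 64  # 1.5625

# Every distinct endpoint from the Final and Alternate Probable Ranges.
ENDPOINTS = [
    39.0625, 40.625, 21.875, 23.4375, 12.5, 14.0625, 10.9375,
    7.8125, 9.375, 32.8125, 34.375, 28.125, 29.6875,
]

multiples = {value: value / STEP for value in ENDPOINTS}

for value, count in multiples.items():
    print(f"{value}% = {count:g} x {STEP}")
```

Rounding 14.0625% to 14% would hide the fact that it is exactly nine of those steps.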

**Conclusion**

Polls, despite the fact that the Margin of Error and statistical error are frequently misunderstood, still provide a useful starting point for calculating which candidates in an election have the greatest probability of winning. In order to make polling numbers more useful you have to account for the MOE and understand that poll numbers come with a confidence level, not a certainty. The way polls present percentages creates a false sense of certainty when, in reality, it is more useful to look at the numbers in terms of a Probable Range. Murrey Math, more associated with investing than with politics and poll numbers, is a useful tool for calculating Probable Ranges, which can help determine the order in which candidates will likely place on election day. Another shortcoming of polls is that they don’t effectively deal with the issue of undecided voters. My method helps resolve this shortcoming. Similar to sports fans rooting for (or against) the team anticipated to win the Super Bowl or NBA Championship, voters often (but not always) exhibit a herd mentality. The candidate leading at a certain point in the election cycle (Labor Day for presidential candidates) will, the vast majority of the time, win an outsized percentage of undecided votes. By understanding how polls are constructed and how numbers work, thinking in terms of Probable Ranges rather than a single number like 33%, and accounting for undecided voter herding, we can greatly increase our chances of correctly forecasting who will win an election.