A couple of years ago we wrote an article that finished a little something like this:
*DISCLAIMER – whilst we’re hopeful of being correct, thus acquiring fortune and numerous appearances on breakfast television, this is mainly in jest. Please read with a pinch of salt.*
The disclaimer was for a piece where we tried to predict the results of the 2017 UK General Election using Google search data. In the face of multiple polls suggesting otherwise, along with a fair bit of Twitter trolling, we predicted a Hung Parliament.
Oddly enough, we ended up being right.
Rather unfortunately, it turns out the amateur polling world isn’t as lucrative as we first thought. Buckets of cash and a visit to Holly and Phil’s sofa weren’t on the cards.
What did emerge, however, was a murmur of intrigue around the methodology behind our approach. If it did indeed have legs, it would add an interesting and very different voice to the media circus.
So, after our spectacular entrance to the political polling world, here we are with our, likely ill-advised, sequel. Let’s hope we’re a little more Godfather than Jaws.
Finding the Swing Seats
Last time out, we tried to predict the result of every seat in the country by looking at search trends in key regions alone and extrapolating outwards. For this year’s effort, we took a more rigorous approach, going a floor deeper and focusing our attention at constituency level.
Despite how much free time we may appear to have, completing that depth of analysis across 650 micro-regions was beyond us. Try doing that during Black Friday. That’s why this time we decided to focus on potential swing seats only. Whilst we were correct with 2017’s end result, adding specific seat allocations to the mix will hopefully bring even greater accuracy, with all of our time and effort being spent on the granular details of key battlegrounds.
Our first challenge was to define where these elusive swing seats lay. Rather than taking the easy route and asking someone, we chose the data-driven option (as we tend to do).
Step one was straightforward enough. We took the results from the 2010, 2015 and 2017 elections and built a picture of the voting history for each constituency. Any seat that had changed hands during those three elections was included in our analysis.
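For the curious, step one can be sketched in a few lines of Python. This is a minimal illustration rather than the code we actually ran: the seat names, party labels and results below are made up, and `has_changed_hands` and `split_seats` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

# Hypothetical winners per constituency, in 2010, 2015, 2017 order.
# These are illustrative values only, not real election results.
WINNERS: Dict[str, List[str]] = {
    "Seat A": ["CON", "CON", "CON"],
    "Seat B": ["LD", "CON", "LAB"],
    "Seat C": ["LAB", "LAB", "CON"],
}


def has_changed_hands(winners: List[str]) -> bool:
    """True if more than one party won the seat across the listed elections."""
    return len(set(winners)) > 1


def split_seats(history: Dict[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Split seats into those that swung at least once and those that never did."""
    swung = sorted(seat for seat, w in history.items() if has_changed_hands(w))
    stable = sorted(seat for seat, w in history.items() if not has_changed_hands(w))
    return swung, stable


swung, stable = split_seats(WINNERS)
print("Swung at least once:", swung)
print("Never changed hands:", stable)
```

The seats in the second list are the ones that still need a volatility check before they can be ruled out.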
The next stage was to look at all those that were left. How do we calculate the potential instability of the remaining seats, some of which haven’t changed hands in over a decade?
For this, we modelled the average seat swing for each party in every constituency over the last three elections. The fluctuations over time gave us an idea of seat volatility which we then applied the following formula to:
All those with a value below 1 were added to our list of potential swing candidates, leaving us with a total pool of 209 seats to go forth and predict using our latest algorithm update.
We learnt from our past efforts that the prediction formula should take into account three separate elements: the candidate in the borough, the parties themselves, and the respective party leaders.
We started by sourcing the previous month’s number of searches for every local candidate in those 209 seats from Google’s Keyword Tool. The number of impressions was treated as their vote share. This was then weighted against searches for each party and party leader, leading to the following formula:
What we ended up with was a value considered to be their share of the electorate’s vote in any given seat. Whichever local candidate had the highest share was thus predicted to be our victor. These numbers were then added to the seats unlikely to swing.
Makes sense in principle, but how did it stand up in practice? Sadly, limitations in search term data restricted our testing to the 2017 election only. The output, however, was extremely encouraging.
Upon running the formula, we ended up with this:
A 4% error against the actual 2017 result, or 96% accuracy. A positive sign for any pollster.
We settled on our approach and applied it to the battle to come.
In an election where almost everyone appears to be queueing up to lose, we have our winner. A very slender Conservative majority. Regrettably, not quite the bombshell that we landed with last time around.
There are, however, three things that are very interesting. The first is the predicted performance of the Liberal Democrats. In a campaign where the whole notion of ‘tactical voting’ appears to have taken root, it doesn’t seem to have worked in the way that we had anticipated. Rather than reducing the Conservative majority, the Lib Dems are predicted to surge from 21 seats to 35 at the expense of Labour, implying that anti-Brexit voters are sticking firm. Nobody else has predicted a gain on this scale.
The second is the drop in seats for the SNP – a sign perhaps that the ‘indyref2’ campaign message hasn’t resonated with voters in the region? Whatever the cause, Nicola Sturgeon will likely see this as an opportunity missed if results do indeed end up this way.
The final point to mention is some of the shock swings our analysis has revealed.
We have Sheffield Hallam switching to the Conservatives (following a slim Labour victory in 2017), Southport opting for the Lib Dems (after the 2017 Conservative victory), and perhaps most surprisingly Wimbledon moving into the hands of the Labour Party (a Conservative hold for over a decade). If the latter in particular were to happen early on in election results night, it could see pollsters scrambling to add a couple more notches of red to their bar charts.
And there we have it. Phase two complete. Whether we’re hoisted aloft and buried in awards (like The Godfather Part II), or quickly forgotten and relegated to the annals of history (like Jaws 2), only time will tell. All we hope is that we last a little longer on your screens than a picture of an ill child being waved in front of a Prime Minister.