The jury is still out on the pollsters’ performance in the 2017 UK General Election. Most of the final polls before the vote came close to the actual share achieved by the Conservatives. While many analysts underestimated the Labour Party’s share, it would be fair to argue that the pollsters’ overall performance was better than in 2015 and that some polls this time came close to calling the election right.
The technological capability available to analysts is certainly growing all the time. By combining data from a variety of sources and using analytics and artificial intelligence to identify underlying patterns in voting behaviour, data scientists are in theory now able to project how we will vote even before we know it ourselves. Big data can, in principle, provide a more precise and comprehensive view of likely outcomes without relying on small representative samples.
Researchers were once able to use our choice of newspaper or TV channel as one indication of our social status and perhaps our education. Now we browse the papers online, read many different viewpoints and then watch more on YouTube. All these online interactions are easily trackable and can be integrated and analysed.
Of course, election results remain challenging to get spot on. One big issue is that many voters don’t make up their minds until the eleventh hour. Before the Brexit referendum, 30% still didn’t know which way to vote in the final week, and 15% were thought to leave the decision until they were in the voting booth – far too late to influence anything save an exit poll. And, of course, the referendum involved just a simple binary choice. Added to that, there can be issues with the quality of voter samples, and it seems that some pollsters may have got the Labour vote wrong this time by over-adjusting the data to compensate for the mistakes of 2015.
Whatever the rights and wrongs of the process and the approach taken by the pollsters, the ability of big data analytics to deliver accurate results quickly is growing all the time. The precision and timeliness of the exit polls in both 2015 and 2017 illustrate this.
However, today it is increasingly the way politicians themselves use big data to get closer to the electorate that is causing both interest and controversy. By drawing on data from a wide range of sources, but especially social media, they can understand the true concerns of each segment of the electorate. As one chief data officer puts it, “… your online activity is very similar to voting activity. Online you can hide behind a screen name and nobody is likely to know who you are or what you are doing. When voting you can put your name next to any box without anybody knowing which party you are voting for. It is the same anonymised action so has a large crossover in behaviour.”
If politicians have insight into the real worries of the electorate, they can focus on these rather than setting their own agenda, and thereby create real empathy. At best this could be seen as a democratising influence, enabling politicians to pick up on the issues the electorate is really talking about rather than framing the debate themselves.
When, in one of the TV ‘non-debates’, Theresa May refused to say what the cap on individuals’ social care payments would be, she repeatedly said the party would talk to a cross-section of those involved before pinning it down to a figure. Does this perhaps mean that data from social media discussions will also be analysed to reach a decision?
However, the downside here is that, in the wrong hands, all this knowledge could cause politicians and their parties to become manipulative. The alleged involvement of the UK company Cambridge Analytica (which mixes data analysis with psychological profiling) in the Brexit campaign – and, according to The Observer, its links with several prominent figures including President Trump and Nigel Farage – has demonstrated how careful we need to be.
The Observer article suggests that micro-segmenting key targets and then barraging them with highly tailored messages and news – even, in the extreme, fake news – could actually shape the result of an election. Some of this is speculation, but it does indicate how much care must be taken to separate fact from fiction.
A recent international survey by Talend found that 71% of UK consumers believe fake news is a growing problem, making them ‘distrust the news and data that is available publicly’. This is higher than the corresponding figure for US citizens (62%) and significantly outstrips the German figure (58%). But how can we combat this new phenomenon?
The answer is simple: we need the right data. Given the sheer volume of information at our fingertips, it is absolutely necessary that we understand the types of data sources being ingested so that we can decide how to evaluate that information.
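To make that idea concrete, here is a minimal sketch of what classifying ingested data sources might look like in practice. The source types and trust weights below are purely illustrative assumptions, not a standard taxonomy; the point is simply that tagging each item with its source type lets us route it to an appropriate level of scrutiny.

```python
from dataclasses import dataclass

# Hypothetical trust weights per source type (illustrative values only).
TRUST_WEIGHTS = {
    "official_statistics": 0.9,   # e.g. published turnout figures
    "established_press": 0.7,     # edited, accountable outlets
    "social_media": 0.3,          # unverified, easily manipulated
    "unknown": 0.1,               # fallback for unclassified sources
}

@dataclass
class DataItem:
    source_type: str
    content: str

def evaluation_weight(item: DataItem) -> float:
    """Return the trust weight to apply when evaluating this item.

    Unrecognised source types fall back to the 'unknown' weight, so
    unclassified data is treated with the most caution.
    """
    return TRUST_WEIGHTS.get(item.source_type, TRUST_WEIGHTS["unknown"])

if __name__ == "__main__":
    items = [
        DataItem("official_statistics", "constituency turnout figures"),
        DataItem("social_media", "trending hashtag volume"),
        DataItem("blog", "anonymous post"),
    ]
    for item in items:
        print(f"{item.source_type}: weight {evaluation_weight(item)}")
```

The design choice worth noting is the explicit fallback: anything not positively classified is treated as least trustworthy, rather than silently given a default mid-range score.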
Since the 2015 election, an increasing amount of software has become available that puts big data analysis, machine learning and artificial intelligence within reach of a wider range of organisations and skill sets. We must trust that the new government will use it to further democracy by tapping into our real concerns, rather than encouraging further false fears.