Using the Wisdom of Crowds to Improve Intelligence Forecasts

Over the last decade, research has revealed that combining the predictions of many experts produces intelligence forecasts better than any single analyst can produce. Our team from the UBC Master of Public Policy and Global Affairs program explored how the Canadian Security Intelligence Service (CSIS) could benefit from these ideas as part of our graduate project. The project was commissioned by CSIS to help the Service optimize its intelligence and security priority-setting process.

One of our recommendations derives from the impressive literature on aggregative contingent estimation, popularized in the book Superforecasting (2015) by political scientist Philip Tetlock. IARPA (Intelligence Advanced Research Projects Activity), the research and development arm of the American intelligence community, funded Tetlock's research in search of ways to improve intelligence quality. Using his methods, teams of ordinary people predicted global events more accurately than American intelligence analysts who had access to classified information. The methods may genuinely spark a modest paradigm shift in intelligence work. They are not a crystal ball, and global events remain chaotic and often beyond prediction, but they do reliably improve forecast accuracy.

The foundational theory behind aggregative contingent estimation is simple. In 1906, statistician Francis Galton discovered that aggregating the guesses of many ordinary people could produce shockingly accurate predictions, an effect now known as the Wisdom of Crowds. Galton made his key discovery at a guess-the-weight-of-an-ox competition at an agricultural fair, where he watched a crowd of regular people compete to guess the weight of a live ox. Some lucky or skilled observer guessed closest and won the prize, but that was not what interested Galton.

The fascinating part is that Galton requested all of the guesses of the ox's weight, expecting the crowd average to be embarrassingly bad. To his surprise, the average of all the guesses was 1,197 lbs, just one pound off the ox's true weight of 1,198 lbs. Elitist assumptions shaken, he reported that "the result seems more creditable to the trustworthiness of a democratic judgement than might have been expected" (Surowiecki, 2004). Galton had found that aggregating and averaging many individual estimates can converge on extremely accurate predictions.
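Galton's insight can be sketched in a few lines of code. The guesses below are simulated, not Galton's actual data: we assume 800 fair-goers whose individual guesses are noisy but unbiased, and show that their average lands far closer to the truth than most individuals do.

```python
import random

random.seed(42)
true_weight = 1198  # lbs, the ox's actual weight

# Simulate 800 fair-goers: each guess is the true weight plus
# independent noise (standard deviation of 75 lbs is an assumption).
guesses = [true_weight + random.gauss(0, 75) for _ in range(800)]

# The crowd estimate is simply the mean of all guesses.
crowd_estimate = sum(guesses) / len(guesses)
print(round(crowd_estimate))  # typically within a few pounds of 1198
```

Averaging works here because the individual errors are independent and centered on the truth, so they largely cancel out; the more guessers, the tighter the average.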

Intelligence depends on making predictions about events to inform security priorities. As it turns out, just as with the ox-guessers, many heads are better than few when it comes to predicting the future. The idea is to apply the wisdom of crowds by aggregating the predictions of many experts and skilled non-experts alike, producing intelligence estimates better than any solo analyst could achieve.

The method Tetlock created involves recruiting large numbers of forecasters and directing them to compete in large prediction tournaments. All predictions are aggregated, and each participant's guesses are tracked to identify the best performers. The forecasts are quantified: participants assign numerical probabilities to the chance of a given event happening within set time frames.
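A minimal sketch of this aggregation step: each participant gives a probability for the same event, and the unweighted mean becomes the crowd forecast. Real tournaments use more elaborate weighted and recalibrated aggregates; the function name and numbers here are illustrative assumptions, not Tetlock's actual procedure.

```python
def crowd_forecast(probabilities):
    """Aggregate individual probability forecasts by simple averaging."""
    return sum(probabilities) / len(probabilities)

# Five hypothetical forecasters on "event X occurs within 12 months":
print(crowd_forecast([0.60, 0.75, 0.55, 0.80, 0.70]))  # 0.68
```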

In these forecasting tournaments, predictions are recorded in a database and tracked over time as events unfold. High-quality forecasters are revealed as many predictions accrue. Participant teams are issued what is called a Brier score, which rates the accuracy of their probabilistic predictions against actual outcomes. As scores are tracked, good teams emerge, their practices can be studied over time, and their predictions can be weighted preferentially by decision-makers setting intelligence priorities.
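The Brier score for binary events can be sketched as follows: the squared difference between the forecast probability and the outcome (1 if the event happened, 0 if not), averaged over all forecasts. Lower is better: 0 is perfect, and always hedging at 0.5 scores 0.25. This is one common binary formulation; tournament scoring rules vary in detail, and the example data is invented.

```python
def brier_score(forecasts, outcomes):
    """forecasts: probabilities in [0, 1]; outcomes: 1 or 0 per event."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A confident, well-calibrated forecaster vs. one who always hedges:
sharp = brier_score([0.9, 0.8, 0.1], [1, 1, 0])  # 0.02
vague = brier_score([0.5, 0.5, 0.5], [1, 1, 0])  # 0.25
print(sharp < vague)  # the confident, correct forecaster scores better
```

Because the score rewards both being right and being decisive, tracking it over many questions separates genuinely skilled forecasters from those who merely hedge.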

Studying top performers led to the development of a training regime built on three pillars: 1) awareness of cognitive biases, 2) applied statistical and Bayesian reasoning, and 3) coaching on critical thinking. One hour of training before one of the original tournaments resulted in a 14% increase in accuracy for the average non-expert participant.

Canadian intelligence analysts who were independently studied using similar tracking metrics performed well; however, they may be able to improve further by adopting aggregation methods (Mandel, Barnes, & Richards, 2014). These methods harness the remarkable predictive power of the wisdom of crowds, and they show great promise for Canadian intelligence. We recommend CSIS incorporate some or all of them into its operations, whether in-house or through expert contractors. Anyone whose work requires predicting events, whether in security or business, should consider learning about them; not doing so may leave you at a permanent disadvantage.

This article was jointly authored by Easton Smith, Nicolas Jensen, Yahe Li, and Daniel Park as part of their Global Policy Project in 2021. The research focused on improving the policy-planning process at CSIS, specifically the challenge of balancing priorities between short-term threats to life and long-term strategic threats.

References:

Mandel, D. R., Barnes, A., & Richards, K. (2014). A quantitative assessment of the quality of strategic intelligence forecasts. Toronto, Ontario: Defence Research and Development Canada.

Surowiecki, J. (2004). The Wisdom of Crowds. Abacus Books.

Tetlock, P. E., & Gardner, D. (2015). Superforecasting: The art and science of prediction. London: Random House Business.
