Australian researchers say they have developed a mathematical model to predict genocide. A Swiss sociologist has sifted through a century of news articles to predict when war will break out, both between and within countries. A Duke University lab builds software that it says can be used to forecast insurgencies. A team assembled by the Holocaust Museum is mining hate speech on Twitter as a way to anticipate outbreaks of political violence; its tool will be rolled out next year for the elections in Nigeria, which have frequently been marred by violence.
What makes these efforts so striking is that they rely on computing techniques — and sometimes huge amounts of computing power — to mash up all kinds of data, ranging from a country’s defense budget and infant mortality rate to the kinds of words used in news articles and Twitter posts.
None of this has yet produced a perfect crystal ball to foretell mass violence — and for good reason. “Events are rare, data we have is really noisy,” said Jay Ulfelder, a political scientist who is developing a web-based early warning system to forecast mass atrocities. “That makes it a particularly hard forecasting task.”
But social scientists are getting better at anticipating where trouble might start — or as Mr. Ulfelder put it, “assessing risks.” That explains why the United States intelligence community has been exploring the field for years. The government’s Political Instability Task Force, which Mr. Ulfelder helped to run for over a decade, tries to predict which countries are likely to witness civil unrest in the near term. Its data is not public, nor is information on how the government uses its predictions.
By now, of course, data tracking is pretty much embedded in our daily lives. Amazon tries to anticipate what we might be tempted to buy based on what we’ve bought, or even considered buying, in the past. Google tries to predict what we’re searching for. Political parties in the United States and abroad are devising new tools to predict who will vote for whom. And police agencies worldwide are increasingly turning to analytics tools to forecast when and where crime is likely to occur.
Predicting mass violence is yet another frontier. Among these efforts is a 2012 project funded partly by the Australian government in which a team from the University of Sydney looked at more than a dozen variables that could point to the likelihood of mass atrocities: Had there been political assassinations or coups; were there conflicts in neighboring states; is there a high rate of infant mortality? (Infant mortality turns out to be a powerful predictor of unrest, a signal that state institutions aren’t working.)
Using machine-learning tools to draw inferences about the effects of each piece of information they analyzed, the researchers compiled a list of 15 countries facing the highest risk of genocide between 2011 and 2015. Central African Republic, which had been on no one’s radar at the time, came out at the top, followed by the Democratic Republic of Congo and Chad. Also on the list were some obvious contenders with continuing strife: Somalia, Afghanistan and Syria. They didn’t get everything right: Sri Lanka was on the list, but has witnessed no outbreaks of mass violence since 2011 — not yet, anyway.
Ben Goldsmith, a professor of government at the University of Sydney who led the Australian project, acknowledged that predictions of this nature were more likely to be wrong than right, since “on average these terrible things happen less than once per year since the 1950s.”
In some cases, bigger, faster computing power has made forecasting possible. A sociologist in Zurich, Thomas Chadefaux, set out to predict when and where war would break out. He combed news stories, week by week, from 1902 to 2001 in Google’s enormous database of newspapers and looked for words and phrases that signified tension: words like crisis, clash, combat, shell. He produced a mathematical model that he said could predict when war was likely to break out between nations a year in advance, and within nations six months in advance. “Wars rarely emerge out of nowhere,” he concluded.
Kalev H. Leetaru, a computer scientist based at George Washington University, has constructed a huge trove called the Global Database of Events, Language and Tone. It scours the Internet to catalog news coverage of major events from 1979 to the present. It can be used to study what might happen in the future — or to produce a snapshot of what’s happening now, as in the case of a map that Mr. Leetaru produced to show outbreaks of violence in Nigeria.
Whether any of this fortunetelling will actually help stave off violence is another matter. That ultimately has less to do with mathematical models than with the calculus of power.
A handful of projects are trying to deploy predictive tools in real time. Michael Best, a professor at the Georgia Institute of Technology, helped develop a tool for Kenyan elections last year that mined reports of political violence on Twitter and Facebook. Nigeria has agreed to let the researchers sit in the election security headquarters when its voters go to the polls next year: They will mine social media for hate speech, using automated tools, and combine the results with the findings of election monitors on the ground.
Social media speech cannot pinpoint violent outbreaks, Professor Best cautions, nor would it be ethical to censor what Nigerians post online. But words are like smoke signals, he argues, and he hopes they can help the authorities get to the right place at the right time.
Somini Sengupta is the United Nations correspondent for The New York Times.