To memorialize the dead, count them & strengthen the case for genocide. – Editor
by International Truth & Justice Project & Human Rights Data Analysis Group, South Africa & San Francisco, January 7, 2019
In advance of the tenth anniversary of the end of the war in Sri Lanka in 2009, HRDAG and the International Truth and Justice Project (ITJP) urge groups inside and outside Sri Lanka to share existing casualty lists, and even more importantly, to go out and record new ones.
“We at least owe the dead the courtesy of collecting their names,” said ITJP Executive Director, Yasmin Sooka. “The scale of human loss is important to quantify and the final list of names which we will collate can also inform the memorialisation process which is key for communities.” Ms. Sooka has also recorded a video invitation to participate.
A decade after the war ended, nobody knows to the nearest ten thousand how many people died in Sri Lanka in 2009, let alone in the decades before. The aim of this initiative is to use a statistical approach to estimate the final death toll, together with a measure of the uncertainty around it. ITJP and HRDAG used the same approach recently to estimate the number of surrendees who disappeared at the very end of the war in 2009.
We urge Tamils all around the world in the next few months to speak to their families, their friends, and their neighbors to collect the names of the dead. We have suggested a spreadsheet format to collect the information. Several groups inside and outside the country have already started collecting lists. Recording the names of the dead is a way of collating the available information, and we can use statistical models to estimate how many people are likely missing from the data collected. Don’t worry about duplication! We will take care of the lists. A video explanation of the project is also available.
Though the initial focus is on collecting information from the Tamil diaspora, the project is also keen to collate information regarding war-related deaths among Sinhalese and Muslims. Existing lists of deaths are especially important, so if you know of one, please contact us.
Information can be sent to HRDAG or the ITJP at: or
Please note the source or sender of the information will be kept confidential.
Click here for a downloadable spreadsheet in English, Sinhala, and Tamil. The spreadsheet provides a template describing what information is required about each death. We have addressed Frequently Asked Questions here.
HRDAG’s work in Guatemala – https://psmag.com/social-justice/using-machine-learning-to-hold-human-rights-abusers-accountable
Forms for collecting information are available in English, Tamil & Sinhalese on the ITJP website.
English – http://www.itjpsl.com/assets/Data-form-english.xlsx
Tamil – http://www.itjpsl.com/assets/Tamil-data-form.xlsx
Sinhalese – http://www.itjpsl.com/assets/Data-form-sinhala.xlsx
Information to be collected – as much as is remembered:
Name (surname in bold)
Write name in English
Date of Birth
How do you know details of death? Eyewitnesses?
Cause of death – shell, Kfir, rounds, beating, torture, suicide, unknown, other (specify)
Alleged perpetrator, if known
Location of death
Father’s name (surname in bold)
Village of birth/upbringing
Source (will be kept confidential)
Organization that collected data, country (will be kept confidential)
Frequently Asked Questions in English
Do I hand write the forms or type when collecting the information?
It’s up to each group to see what works best for them. Sometimes it’s easier to hand write while interviewing people, as technology can get in the way of the conversation. However, what we need from you is a typed version of the information in a spreadsheet in Tamil and English.
Is it better to collect the information in English or Tamil?
Ideally Tamil, but both are helpful.
Do we include the disappeared and missing as well as the dead?
Include all of them, but it’s important to distinguish between people who are missing or disappeared and those known to be dead.
Does it matter if I record a person as dead and someone else also records him/her?
No, that’s actually quite useful for us. Don’t worry about duplication!
Will we get a copy of the final list of names?
If you contribute, we will give you a merged list of the names of the dead. It will not identify the sources of the information so that we can protect the confidentiality of the data collectors.
Why are you not distinguishing combatants from civilians?
Because at this point, it’s very difficult to be sure who was armed, who was actively fighting, who was a civilian, or in some other status. There are ways of disaggregating the data by age – babies and the elderly are clearly not combatants.
What if someone died of starvation or illness in the war zone?
We are more interested in deaths by violence than in conflict-related mortality in general. If you want to include people who died from hunger or disease, it is very important to mark these cases clearly as resulting from those causes, separately from people who died in violent events.
What if someone died/disappeared in Manik Farm or in post-war detention?
The main thing for us is to date the cases, or at least identify them as post-May 2009. The date of each person’s death or disappearance is very important.
Why are you collecting all the dead – from 1983 onwards?
Because it is theoretically possible to estimate that far back. However, it depends on the quality of the data. We will estimate different time periods, and the further back we get the data, the further back we can estimate.
If there is an existing list what is the best format in which to share it? What if it’s not in a spreadsheet?
There is a spreadsheet template to use. If the data is not in this format, please send what you have and we will process it.
Do you want photos of the dead?
Yes please. We do not want photos of corpses, but of the person alive, as this helps substantiate that they existed and can differentiate people who have similar names.
Why are you not collecting Sinhala dead?
It is important to document all the victims. We welcome reports of Sinhala deaths along with Tamil deaths. We will process all of them, including, for example, JVP-era deaths.
What is the deadline?
30 March 2019. This date is so that we can process the information for estimates to be published by the tenth anniversary of the end of the war.
Further FAQs answered by Patrick Ball of HRDAG:
What if we can’t count everyone because some families are all killed and there is nobody left?
Keep in mind that our method estimates the number of *unobserved* victims along with the names we receive. We are NOT just counting names. My video explains some of this process, and the technical details are in articles in the notes under the video.
From my experience with this method over the last 20 years, it turns out that if we get about 20-30% of the total victims in our lists, we will estimate the total reasonably accurately. Getting 40%, 50%, or more enables us to estimate more finely-grained patterns (e.g., estimating for each year, by victim ethnicity, sex, or age, or by region, or combinations of these characteristics).
But to be clear, our estimate of the total magnitude will be accurate (within a known credible interval, something like a “margin of error”) once we have 20-30% of the victims in the lists. More data will be very helpful, as I explained, but our primary goal can be achieved with less, albeit at the cost of a wider margin of error and less ability to see specific patterns.
In terms of the total being what you expect: there *is* a true number of victims, though we don’t quite know it yet. Our approach will approximate that number, if we get sufficient data. I cannot assure you that the total will be within any specific range — the data and the method will tell us. In my experience from other projects, our advocacy has been best served by seeking the truth.
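The idea of estimating unobserved victims can be illustrated with the simplest possible case, two lists. This is a minimal sketch only, assuming two independently collected lists whose duplicate records have already been matched. It uses Chapman's version of the classic two-list capture-recapture estimator as a stand-in; it is not the multi-list modelling described in the technical articles, and the function name and identifiers are hypothetical.

```python
def chapman_estimate(list_a, list_b):
    """Two-list capture-recapture estimate of the total number of victims.

    Illustrative only: assumes the lists were collected independently and
    that records referring to the same person share the same identifier.
    """
    a, b = set(list_a), set(list_b)
    overlap = len(a & b)    # people named on both lists
    observed = len(a | b)   # distinct people named on at least one list
    # Chapman's bias-corrected form of the Lincoln-Petersen estimator
    total = (len(a) + 1) * (len(b) + 1) / (overlap + 1) - 1
    return {
        "observed": observed,
        "estimated_total": total,
        "estimated_unobserved": total - observed,
    }


# Hypothetical identifiers, not real data
list_a = [f"person_{i}" for i in range(0, 60)]    # 60 records
list_b = [f"person_{i}" for i in range(40, 90)]   # 50 records, 20 shared with list_a
result = chapman_estimate(list_a, list_b)
print(result)
```

In this toy case the estimated total is larger than the 90 distinct people observed. The names that appear on both lists are what make the estimate possible, which is why duplication between lists is useful rather than a problem.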
What if you get fake data?
On the question of falsified names: it’s entirely possible, even likely, that people will give us some falsified names. There are five responses.
First, we would prefer only to accept names from groups that are trying to get good data. The work by known groups helps us to have confidence in the quality of the work.
Second, it’s actually quite difficult to falsify more than a few names in a plausible way. People who make up lots of names tend to fall into patterns in the made-up data, and those patterns tend to be pretty obvious when we start using machine learning tools to review the lists.
Third, I’ve done a lot of testing over the years, and the statistical technique we use is pretty robust to bad data. If people falsified (without being detected) even a few hundred names, the method would inflate the estimate, but not very much.
Fourth, we can simulate the effect of falsified records. If we (or someone we’re debating) suspect that one or another group has falsified records, we can randomly delete some of those records and recalculate the results. If the results are relatively stable (which they are likely to be, see below), then the falsification has not affected our results. If the results change when we delete a bunch of records, we should be more cautious about the findings. We can determine this when we are doing the calculations.
Finally, the risk to the group falsifying the names is high. If we find more than a very small number of likely falsified records, we would have to exclude all the records a group collected. So participating groups risk a great deal by adding fake data.
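The deletion check described in the fourth response can be sketched as follows. This is an illustration under stated assumptions, not the project's actual procedure: it reuses a simple two-list Chapman estimator as a stand-in for the real model, and the function names, the 20% deletion rate, and the identifiers are all hypothetical.

```python
import random


def chapman_estimate(list_a, list_b):
    """Two-list capture-recapture estimate (stand-in for the real model)."""
    a, b = set(list_a), set(list_b)
    return (len(a) + 1) * (len(b) + 1) / (len(a & b) + 1) - 1


def deletion_sensitivity(suspect, other, drop_fraction=0.2, trials=200, seed=0):
    """Randomly delete part of a suspect list many times and re-estimate.

    Returns the baseline estimate plus the lowest and highest estimates
    seen across the trials. A narrow range suggests a stable result.
    """
    rng = random.Random(seed)            # fixed seed for repeatability
    suspect = sorted(set(suspect))
    keep = len(suspect) - int(len(suspect) * drop_fraction)
    baseline = chapman_estimate(suspect, other)
    estimates = [
        chapman_estimate(rng.sample(suspect, keep), other)
        for _ in range(trials)
    ]
    return baseline, min(estimates), max(estimates)


# Hypothetical identifiers, not real data
suspect_list = [f"person_{i}" for i in range(0, 60)]
other_list = [f"person_{i}" for i in range(40, 90)]
baseline, lowest, highest = deletion_sensitivity(suspect_list, other_list)
print(f"baseline={baseline:.1f}, range after deletions={lowest:.1f} to {highest:.1f}")
```

If the range after deletions stays close to the baseline, the suspect records are not driving the result. A wide range is a signal to treat the findings with more caution.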
What happens after March – do you just stop?
No. The end of March is the deadline for calculations to happen in time for the May 2019 anniversary. We will continue to collect data thereafter. This is a long-term project.