Puzzles about (mis)Replication
So, a while back, I was using a new compactness metric to extend some gerrymandering studies. In attempting the replication, I found some minor math errors in the original paper that made it difficult to get valid values for the statistic.
After several attempts to determine whether the fault lay in my code or in a typographical error in the published statistic, I went to the original author, shared my concerns, and learned that the published statistic did indeed contain a typo. Armed with the original code, the replication was easy.
Now, I’m working on implementing some Gibbs samplers for hierarchical linear models with spatial effects. And, even though I’m deeply interested in the topic, replicating the paper my grant has targeted has been incredibly frustrating.
At the outset, my grant obtained the code the authors used. After really sitting down and grokking it, I found numerous discrepancies between the math the code implements and the math published in the paper. In some cases, the code generates the correct result; in other cases, the paper describes something the code doesn’t implement correctly.
Is the paper non-replicable because they take a logarithm incorrectly? If you sample from the conditional posteriors in this unnamed paper, you will not get samples from the correct joint distribution. If you use the techniques in the code, you will get the results they published, but will find that, in other contexts, some of the conditional posteriors are inconsistent for the true parameter.
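To make that failure mode concrete, here is a toy sketch (my own illustration, not the unnamed paper’s model): a Gibbs sampler for a standard bivariate normal with correlation rho. The correct full conditional is x | y ~ N(rho·y, 1 − rho²). If a derivation error drops the variance shrinkage, using variance 1 instead of 1 − rho², the chain still runs happily, but its stationary distribution is no longer the target joint.

```python
import random
import statistics

def gibbs_bivariate_normal(n_samples, rho, correct=True, seed=0):
    """Gibbs sampler targeting a standard bivariate normal with correlation rho.

    The correct full conditional is x | y ~ N(rho * y, 1 - rho**2).
    With correct=False we mimic a mis-derived conditional that forgets
    to shrink the variance (it uses 1 instead of 1 - rho**2): every
    iteration still executes, but the sampler targets the wrong joint.
    """
    rng = random.Random(seed)
    sd = (1 - rho**2) ** 0.5 if correct else 1.0
    x, y = 0.0, 0.0
    xs = []
    for _ in range(n_samples):
        x = rho * y + sd * rng.gauss(0, 1)  # draw x | y
        y = rho * x + sd * rng.gauss(0, 1)  # draw y | x
        xs.append(x)
    return xs

# Under the true target, the marginal of x is N(0, 1).
good = gibbs_bivariate_normal(50_000, rho=0.9, correct=True)
bad = gibbs_bivariate_normal(50_000, rho=0.9, correct=False)
print(statistics.pvariance(good))  # near 1, as it should be
print(statistics.pvariance(bad))   # inflated toward 1 / (1 - rho**2), about 5.3
```

Nothing in the broken chain errors out or even looks suspicious locally; only checking the samples against a known property of the joint (here, the marginal variance) reveals that the conditionals are inconsistent with each other.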
So, after 8 months, I’ve “replicated the paper,” in the sense that I’ve transliterated their code. But does this really replicate the paper?
I think what I’ve done so far is yet another instance of the “compilation errors” Terence Tao refers to as common among graduate students.
Maybe, if I had
- read the paper as an artifact of human knowledge, possibly inconsistent or errorful,
- understood the mathematical object the authors propose,
- derived its properties for myself and understood their implications, and
- implemented it how I thought it should be implemented,

I might’ve saved myself a ton of frustration and wasted time.
This leads me to wonder: what, really, is replicability in social science? What does replicability look like when papers fail to compile? I strongly doubt that a focus on “science in a box” will solve this, even though it’s quite important to make sure that whatever you implement to do your science is, at least, repeatable.
Perhaps in response to these experiences, I now think the real crux of replicability is validity: if you do something broadly in line with the theoretical, empirical, and statistical thesis of the paper, you should get similar results.
imported from: yetanothergeographer