- Image duplication has been a powerful signal for detecting scientific fraud, but is irrelevant in many fields. I've been working a bit on finding new signals like it that work across fields. I've found one using LLMs that can predict retractions, weakly, for $1 per paper. 1/4
- A few independent studies suggest that 15-25% of papers show signs of faked results. There are many caveats here, but we do know that fewer than 50% of papers reproduce, and the two may be related. LLMs offer some chance of automating this analysis. 2/4 (Feb 19, 2025 13:26)
- What is a universal way to check for signs of fraud in a paper? I investigated the faithfulness of citations: are citations consistent with the cited sources, or are they irrelevant? This does significantly correlate with whether a paper is subsequently retracted. 3/4
- This is still an early topic and my work is very preliminary, but I think we may be able to start auditing scientific literature at scale. I’ve written up a lot of thoughts, background, and analysis in a blog post: diffuse.one/p/d1-008 4/4
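
To make the citation-faithfulness idea concrete, here is a minimal sketch of how per-citation judgments could be rolled up into one score per paper. It is an illustration under stated assumptions, not the pipeline from the blog post: the label names, the `CitationJudgment` type, and `unfaithful_fraction` are all hypothetical, and the step that produces the labels (for example an LLM comparing each citing sentence against the cited source) is left out.

```python
from dataclasses import dataclass

# Hypothetical labels a judge (e.g. an LLM) might assign to a single citation.
FAITHFUL = "faithful"          # citing claim matches the cited source
INCONSISTENT = "inconsistent"  # citing claim contradicts or misstates the source
IRRELEVANT = "irrelevant"      # cited source does not bear on the claim


@dataclass
class CitationJudgment:
    claim: str  # sentence in the citing paper that carries the citation
    label: str  # one of the labels above (assumed to be produced upstream)


def unfaithful_fraction(judgments: list[CitationJudgment]) -> float:
    """Return the share of citations judged inconsistent or irrelevant.

    A paper with no judged citations scores 0.0 (an arbitrary choice
    made for this sketch).
    """
    if not judgments:
        return 0.0
    bad = sum(1 for j in judgments if j.label != FAITHFUL)
    return bad / len(judgments)


# Toy input for illustration only; these are not real results.
example = [
    CitationJudgment("X increases Y [1]", FAITHFUL),
    CitationJudgment("Z was shown in mice [2]", INCONSISTENT),
    CitationJudgment("Prior work used method M [3]", FAITHFUL),
    CitationJudgment("Q is well established [4]", IRRELEVANT),
]
score = unfaithful_fraction(example)
```

A per-paper score like this could then be compared between retracted and non-retracted papers to test whether it carries any signal.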