Statistics, Science, Random Ramblings

A blog mostly about data and R

If a paper you cite was retracted you will probably never know

Posted at — Jul 5, 2022

I recently stumbled over a preprint showing that even retracted papers still receive a lot of citations. We can probably all agree that this is a problem.

It seems reasonable that in the vast majority of cases researchers do not cite retracted papers on purpose. So, what leads to these papers being cited?

You do not know the paper has been retracted

A very plausible scenario is that a paper is saved to a literature management tool and only read and cited much later. At the moment I have a four-digit number of papers sitting in my library, and I will probably never know whether one of them has been retracted or whether corrections have been published. I think it is also reasonable to assume that the journal you eventually submit to does not check your citations.

There is no good automated way to find out whether a paper has been retracted

After becoming aware of the problem, my initial thought was that it should be straightforward to write a tool that queries metadata APIs such as those of Pubmed or Crossref. However, while both APIs are happy to report the metadata of retracted articles, there is no mention of these articles being retracted. There is the Retraction Watch Database, which is a very useful tool, but it cannot be queried through an API (and there are no public dumps of the database contents either). While Zotero apparently integrates this data, something this valuable should really be more easily accessible.

Information about retractions and corrections must become a standard part of article metadata. This would enable both periodic checks in all literature management software and dedicated tools that check your citations before you submit a paper to a journal.
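Such a pre-submission check would not need to be complicated. Here is a minimal sketch in Python, assuming that a set of retracted DOIs were available from somewhere, which is exactly the piece that is hard to get today. The function names, the example BibTeX entries, and the DOIs (using the 10.1000 placeholder prefix) are all made up for illustration.

```python
import re

# Matches the value of a BibTeX "doi" field, e.g. doi = {10.1000/abc}
DOI_FIELD = re.compile(r'\bdoi\s*=\s*[{"]\s*([^}"]+?)\s*[}"]', re.IGNORECASE)

# Common ways a DOI is prefixed in reference libraries
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip a leading resolver URL or 'doi:' prefix."""
    doi = doi.strip().lower()
    for prefix in PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def cited_dois(bibtex: str) -> set[str]:
    """Collect all DOIs found in the 'doi' fields of a BibTeX string."""
    return {normalize_doi(match) for match in DOI_FIELD.findall(bibtex)}


def flag_retracted(bibtex: str, retracted: set[str]) -> list[str]:
    """Return the cited DOIs that also appear in the retracted set."""
    known = {normalize_doi(doi) for doi in retracted}
    return sorted(cited_dois(bibtex) & known)


# Hypothetical library export and hypothetical list of retracted DOIs
EXAMPLE_BIB = """
@article{smith2020,
  title = {An example},
  doi = {10.1000/example.1}
}
@article{jones2021,
  title = {Another example},
  doi = {https://doi.org/10.1000/EXAMPLE.2}
}
"""
RETRACTED = {"10.1000/example.2"}

print(flag_retracted(EXAMPLE_BIB, RETRACTED))
```

The matching itself is just a set intersection on normalized DOIs. Everything difficult lives in the `RETRACTED` set, which here is hard-coded and in practice would have to come from article metadata that includes the retraction status.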

Retracted articles should not appear in between non-retracted articles

I was really surprised that both Google Scholar and Pubmed return retracted articles as search results right next to regular articles. The former gives no indication of the status of the article, while the latter at least annotates the search result and shows a box with red text and background on the detail page of the article.

However, why treat retracted articles the same as regular ones? While I do not think that retracted articles should be removed entirely from databases and journals' websites, including them in literature search results should be an explicit opt-in.

In a similar fashion, the design of journals' websites should really be more explicit when it comes to retracted articles. At a time when you have to deal with cookie consent dialogs and JavaScript popups, a subtle red banner at the top of the page will not draw the attention it needs.


Overall, the retraction of a paper is probably a very rare event. Still, it should be made easier to avoid citing retracted papers. Increased publication speeds during the Covid-19 pandemic most certainly amplified the issue, but it should be addressed regardless.

The easiest solution would be to include information about retractions in article metadata. This should be straightforward and would probably prevent most cases of accidental citations of retracted papers.