When SocArXiv gets bad papers

Detail from AI-generated art using the prompt “bad paper” with Wombo.

Two recent incidents at SocArXiv prompted the Steering Committee to offer some comment on our process and its outcomes.

Ivermectin research

On May 4, 2021, our moderators accepted a paper titled “Ivermectin and the odds of hospitalization due to COVID-19: evidence from a quasi-experimental analysis based on a public intervention in Mexico City,” by a group of authors from the Mexican Social Security Institute, Ministry of Health in Mexico City, and Digital Agency for Public Innovation in Mexico City. The paper reports on a “quasi-experimental” analysis purporting to find “significant reduction in hospitalizations among [COVID-19] patients who received [a] ivermectin-based medical kit” in Mexico City. The paper is a “preprint” insofar as it was not peer reviewed or published in a peer-reviewed journal at the time it was submitted, but because it has not subsequently been published in such a venue, it is really just a “paper.” (We call all the papers on SocArXiv “papers,” and let authors describe their status themselves, either on the title page, or by linking to a version published somewhere else.)

Depending on which critique you prefer, the paper is either very poor quality or else deliberately false and misleading. PolitiFact debunked it, drawing in part on a fact check published in Portuguese. We do not believe it provides reliable or useful information, and we are disappointed that it has been very popular (downloaded almost 10,000 times so far).

This has prompted us to clarify that our moderation process does not involve peer review, or substantive evaluation, of the research papers that we host. From our Frequently Asked Questions page:

Papers are moderated before they appear on SocArXiv, a process we expect to take less than two days. Our policy involves a six-point checklist, confirming that papers are (1) scholarly, (2) in research areas that we support, (3) are plausibly categorized, (4) are correctly attributed, (5) are in languages that we moderate, and (6) are in text-searchable formats (such as PDF or docx). In addition, we seek to accept only papers that authors have the right to share, although we do not check copyrights in the moderation process. For details, view the moderation policy.

Posting a paper on SocArXiv is not in itself an indication of good quality. We host many papers of top quality – and their inclusion in SocArXiv is a measure of good practice. But there are bad papers as well, and the system does not explicitly differentiate them for readers. In addition to not verifying the quality of the papers we host, we also don’t evaluate the supporting materials authors provide. In the case of the ivermectin paper, the authors declared that their data is publicly available with a link to a Google sheet (as well as a GitHub repository that is no longer available). They also declared no conflict of interest.

We do not have a policy for removing papers like this one from our service: papers that meet our submission criteria when we post them but turn out to be harmful. However, we could develop one, such as a petition process or some other review trigger. This is an open discussion.

Fraudulent papers

To our knowledge, the ivermectin paper is not fraudulent. However, we do not verify the identities of authors who submit papers. The submitting author must have an account on the Open Science Framework, our host platform, but getting an OSF account just requires a working email address. OSF users can enter ORCID or social media account handles on their profiles, but to our knowledge these are not verified by OSF. OSF does allow logins with ORCID or institutional identities, but as moderators at SocArXiv we don’t have a way of knowing how a user has created their account or logged in. Our submission process requires authors to affirm that they have permission to post the paper, but we don’t independently verify the connections between authors.

In short, both OSF and SocArXiv are vulnerable to people posting work that is not their own, or using fake identities. The unvarnished truth is that we don’t have the resources of the government, the coercive power of an employer, or the capital of a big company necessary to police this issue.

Recently, someone posted one fraudulent paper on SocArXiv, and attempted to post another, before we detected the fraud in our moderation process. The papers submitted listed a common author, but different (apparently) fake co-authors. In one case, we contacted the listed co-author (a real person) who confirmed that they were not aware of the paper and had not consented to its being posted. With a little research, we found papers under the name of this author at SSRN, ResearchGate, arXiv, and Paperswithcode, which also seem to be fake. (We reported this to the administrators of OSF, who deleted the related accounts.)

It did not appear that these papers had any important content, but rather just existed to be papers, maybe to establish someone’s fake identity, test AI algorithms or security systems, or whatever. Their existence doesn’t hurt real researchers much, but they could be part of either a specific plan that would be more harmful, or a general degradation of the research communication ecosystem.

With regard to this kind of fraud, we do not have a consistently applied defense in our moderation workflow. If we suspect foul play, we poke around, and if we find something bad we reject the papers and report the problem. But, again, we don’t have the resources to fully prevent this from happening. However, we are developing a new policy that will require all papers to have at least one author linked to a real ORCID account. Although this will add time to the moderation of each paper (since OSF does not attach ORCIDs to specific papers), we plan to experiment with this approach to see if it helps without adding too much time and effort. (As always, we are looking for more volunteer moderators. Just contact us!)
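
As a rough illustration of what a first-pass check under such a policy could look like, here is a minimal Python sketch that tests whether an ORCID iD is well formed, using the check digit scheme (ISO 7064 MOD 11-2) that ORCID documents publicly. This is not our moderation tooling, and the function names are hypothetical. It also cannot confirm that an iD belongs to a real account or to the listed author; that would still require a lookup and a moderator's judgment.

```python
import re

# ORCID iDs are 16 characters in four hyphenated groups; the last
# character is a check digit that may be "X" (standing for 10).
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check digit for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Return True if the iD has the right shape and a valid check digit.

    This says nothing about whether the account exists or who owns it.
    """
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


def has_plausible_orcid(author_orcids) -> bool:
    """Hypothetical helper: does at least one listed author have a well-formed iD?

    `author_orcids` is assumed to be a list with one entry per author,
    either an iD string or None/"" when no iD was supplied.
    """
    return any(o and is_well_formed_orcid(o) for o in author_orcids)
```

For example, `has_plausible_orcid([None, "0000-0002-1825-0097"])` passes because the second entry (the sample iD ORCID uses in its own documentation) has a valid check digit, while a mistyped iD or a list with no iDs at all would be flagged for a closer look.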

User responses

We do offer several ways for readers to communicate with us and with each other about the quality of papers on our system. Readers may annotate or comment on papers using the Hypothesis tool, or they may endorse papers using the Plaudit button. (Both of these are free with registration, using ORCID for identification.) If you read a paper you believe is good, just click the Plaudit button; that will tell future readers that you have endorsed it. Neither of these tools generates automatic notifications to SocArXiv or to the authors, however. They just communicate to the next reader. If you see something that you suspect is fraudulent or harmful, feel free to email us directly at socarxiv@gmail.com.

We encourage readers to take advantage of these affordances. And we are open to suggestions.