SocArXiv submission rule changes

Context

SocArXiv is experiencing record-high submission rates. In addition, now that we have paper versioning – which is great – our moderators have to approve every paper revision. As a result, our volunteer workload is increasing.

In addition, we are receiving many non-research, spam, and AI-generated submissions. We do not have a technological way of identifying these, and it is time-consuming to read and assess them according to our moderation rules.

We also don’t have moderation workflow tools that would allow us to, for example, sort incoming papers by subject and route them to moderators with the relevant expertise. So all our moderators look at all papers as they come in. That pushes us to consider narrowing the range of subjects we accept.

The two rule changes below are intended to help manage the increased moderator burden. More policy changes may follow if the volume keeps increasing.

1. ORCID requirement

We require the submitting author to have a publicly accessible ORCID linked from the OSF profile page, with a name that matches that on the paper and the OSF account.

In the case of non-bibliographic submitters (e.g., a research assistant submitting for a supervisor), the first author must have an ORCID. We can make exceptions for institutional submitters upon request, such as journals that upload their papers for authors.

At present we are not requiring additional verification or specific trust markers on the ORCID (such as email or employer verification), just the existence of an account that lists the author’s name. It’s not a foolproof identity verification, obviously, but it adds a step for scammers, and also helps identify pseudonymous authors, which we do not permit. We may take advantage of ORCID’s trust markers program in the future and require additional elements on the ORCID record.
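The rule as described above can be read as a small decision procedure. Here is a minimal sketch in Python, not a tool we actually run. The `Person` record, its field names, and the function names are hypothetical (they do not come from OSF or ORCID), and we assume that names are compared ignoring case and extra whitespace, which the policy itself does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """Hypothetical record for one person; not an OSF or ORCID data model."""
    paper_name: Optional[str] = None   # name as printed on the paper, if an author
    osf_name: Optional[str] = None     # name on the OSF account, if they have one
    orcid_name: Optional[str] = None   # name on a public ORCID record, else None


def _norm(name: str) -> str:
    # Assumption: names are compared ignoring case and extra whitespace.
    return " ".join(name.lower().split())


def _names_match(a: Optional[str], b: Optional[str]) -> bool:
    return a is not None and b is not None and _norm(a) == _norm(b)


def meets_orcid_rule(submitter: Person,
                     first_author: Person,
                     submitter_is_author: bool,
                     institutional_exception: bool = False) -> bool:
    """Return True if a submission satisfies the ORCID requirement."""
    if institutional_exception:
        # Exceptions are granted on request, e.g. journals uploading for authors.
        return True
    if submitter_is_author:
        # The ORCID name must match both the paper and the OSF account.
        return (_names_match(submitter.orcid_name, submitter.paper_name)
                and _names_match(submitter.orcid_name, submitter.osf_name))
    # Non-bibliographic submitter: the first author must have an ORCID
    # that lists the author's name.
    return _names_match(first_author.orcid_name, first_author.paper_name)
```

Note that nothing here checks trust markers such as a verified email or employer; the sketch only tests that a public ORCID record exists and lists a matching name.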

We are happy to host papers by independent scholars, but a disproportionate share of non-research, spam, and AI-generated submissions come from independent scholars, many of whom do not have ORCIDs. For those scholars with institutional affiliations, we urge you to get an ORCID. This is a good practice that we should all endorse.

2. Focus on social sciences

At its founding, SocArXiv did not want to maintain disciplinary boundaries. It was our intention to be the big paper server for all of social sciences, and we couldn’t draw an easy line between social sciences and some humanities subjects, especially history, philosophy, religious studies, and some area studies, which are humanities in the taxonomy we use, but have significant overlap with social sciences. It was more logical just to accept them all.

As the volume has increased, this has become less practical. In addition, a lot of junk and AI submissions are in the areas of religion, philosophy, and various language studies. We also don’t have moderators working in arts and humanities, and our moderators trained in social sciences are not experts at reviewing these papers. Finally, there is an excellent, open humanities archive: Knowledge Commons (KC Works), which is freely available for humanities scholars. With approval from that service, we will now direct authors to its site for papers we are rejecting in arts and humanities subjects.

We continue to accept papers in education and law, which are also generally adjacent to social science.

For a limited time we will accept revisions of papers we already host in arts and humanities, but urge those authors to include links to Knowledge Commons or somewhere else that can host their work in the future.

We will assess papers that include arts/humanities as well as social science subject identifiers, and reject those we determine are principally in arts/humanities.

We will continue to host all work we have already accepted.

When SocArXiv gets bad papers

Detail from AI-generated art using the prompt “bad paper” with Wombo.

Two recent incidents at SocArXiv prompted the Steering Committee to offer some comment on our process and its outcomes.

Ivermectin research

On May 4, 2021, our moderators accepted a paper titled, “Ivermectin and the odds of hospitalization due to COVID-19: evidence from a quasi-experimental analysis based on a public intervention in Mexico City,” by a group of authors from the Mexican Social Security Institute, Ministry of Health in Mexico City, and Digital Agency for Public Innovation in Mexico City. The paper reports on a “quasi-experimental” analysis purporting to find “significant reduction in hospitalizations among [COVID-19] patients who received [a] ivermectin-based medical kit” in Mexico City. The paper is a “preprint” insofar as the paper was not peer reviewed or published in a peer-reviewed journal at the time it was submitted, but because it has not subsequently been published in such a venue, it is really just a “paper.” (We call all the papers on SocArXiv “papers,” and let authors describe their status themselves, either on the title page, or by linking to a version published somewhere else.)

Depending on which critique you prefer, the paper is either very poor quality or else deliberately false and misleading. PolitiFact debunked it, partly based on a fact-check published in Portuguese. We do not believe it provides reliable or useful information, and we are disappointed that it has been very popular (downloaded almost 10,000 times so far).

This has prompted us to clarify that our moderation process does not involve peer review, or substantive evaluation, of the research papers that we host. From our Frequently Asked Questions page:

Papers are moderated before they appear on SocArXiv, a process we expect to take less than two days. Our policy involves a six-point checklist, confirming that papers are (1) scholarly, (2) in research areas that we support, (3) are plausibly categorized, (4) are correctly attributed, (5) are in languages that we moderate, and (6) are in text-searchable formats (such as PDF or docx). In addition, we seek to accept only papers that authors have the right to share, although we do not check copyrights in the moderation process. For details, view the moderation policy.
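To make concrete what the checklist does and does not cover, here is a minimal sketch of it in Python. The `Check` enum and the function names are hypothetical and for illustration only; this is not software our moderators use.

```python
from enum import Enum
from typing import Iterable, List


class Check(Enum):
    """The six moderation checks from the FAQ, in order."""
    SCHOLARLY = "scholarly"
    SUPPORTED_AREA = "in a research area we support"
    PLAUSIBLY_CATEGORIZED = "plausibly categorized"
    CORRECTLY_ATTRIBUTED = "correctly attributed"
    MODERATED_LANGUAGE = "in a language we moderate"
    TEXT_SEARCHABLE = "in a text-searchable format"


def failed_checks(passed: Iterable[Check]) -> List[Check]:
    """Return the checklist items that were not ticked, in checklist order."""
    ticked = set(passed)
    return [check for check in Check if check not in ticked]


def accept(passed: Iterable[Check]) -> bool:
    """A paper is accepted only if all six checks pass."""
    return not failed_checks(passed)
```

None of the six items asks whether the findings are correct, which is why a paper can pass moderation and still be bad.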

Posting a paper on SocArXiv is not in itself an indication of good quality. We host many papers of top quality – and their inclusion in SocArXiv is a measure of good practice. But there are bad papers as well, and the system does not explicitly differentiate them for readers. In addition to not verifying the quality of the papers we host, we also don’t evaluate the supporting materials authors provide. In the case of the ivermectin paper, the authors declared that their data is publicly available with a link to a Google sheet (as well as a GitHub repository that is no longer available). They also declared no conflict of interest.

We do not have a policy to remove papers like this from our service, which meet submission criteria when we post them but turn out to be harmful. However, we could develop one, such as a petition process or some other review trigger. This is an open discussion.

Fraudulent papers

To our knowledge, the ivermectin paper is not fraudulent. However, we do not verify the identities of authors who submit papers. The submitting author must have an account on the Open Science Framework, our host platform, but getting an OSF account just requires a working email address. OSF users can enter ORCID or social media account handles on their profiles, but to our knowledge these are not verified by OSF. OSF does allow logins with ORCID or institutional identities, but as moderators at SocArXiv we don’t have a way of knowing how a user has created their account or logged in. Our submission process requires authors to affirm that they have permission to post the paper, but we don’t independently verify the connections between authors.

In short, both OSF and SocArXiv are vulnerable to people posting work that is not their own, or using fake identities. The unvarnished truth is that we don’t have the resources of the government, the coercive power of an employer, or the capital of a big company necessary to police this issue.

Recently, someone posted one fraudulent paper on SocArXiv, and attempted to post another, before we detected the fraud in our moderation process. The papers submitted listed a common author, but different (apparently) fake co-authors. In one case, we contacted the listed co-author (a real person) who confirmed that they were not aware of the paper and had not consented to its being posted. With a little research, we found papers under the name of this author at SSRN, ResearchGate, arXiv, and Paperswithcode, which also seem to be fake. (We reported this to the administrators of OSF, who deleted the related accounts.)

It did not appear that these papers had any important content, but rather just existed to be papers, maybe to establish someone’s fake identity, test AI algorithms or security systems, or whatever. Their existence doesn’t hurt real researchers much, but they could be part of either a specific plan that would be more harmful, or a general degradation of the research communication ecosystem.

With regard to this kind of fraud, we do not have a consistently applied defense in our moderation workflow. If we suspect foul play, we poke around, and if we find something bad we reject the papers and report it. But, again, we don’t have the resources to fully prevent this from happening. However, we are developing a new policy that will require all papers to have at least one author linked to a real ORCID account. Although this will add time to the moderation process of each paper (since OSF does not attach ORCIDs to specific papers), we plan to experiment with this approach to see if it helps without adding too much time and effort. (As always, we are looking for more volunteer moderators; just contact us!)

User responses

We do offer several ways for readers to communicate to us and to each other about the quality of papers on our system. Readers may annotate or comment on papers using the Hypothesis tool, or they may endorse papers using the Plaudit button. (Both of these are free with registration, using ORCID for identification.) If you read a paper you believe is good, just click the Plaudit button — that will tell future readers that you have endorsed it. Neither of these tools generates automatic notifications to SocArXiv or to the authors, however — they just communicate to the next reader. If you see something that you suspect is fraudulent or harmful, feel free to email us directly at socarxiv@gmail.com.

We encourage readers to take advantage of these affordances. And we are open to suggestions.