How to Conduct an Effective IRB Review of Artificial Intelligence Human Subjects Research (AI HSR)

Institutional Review Boards (IRBs) are formally designated independent groups charged with the review and ethical oversight of research involving human subjects. The IRB is composed of knowledgeable experts in various fields who guide researchers in minimizing risks and maximizing benefits for research participants. Moreover, the IRB is in place to protect the rights and welfare of human subjects in research projects. IRBs base their decisions on the principles of the Belmont Report and on established regulations and policies from the Code of Federal Regulations and, if applicable, the Food and Drug Administration (FDA).

IRB oversight has been required for human subject research since 1974¹; however, the terms “research” and “human subjects” are often misunderstood and inconsistently applied today. Federal guidelines were altered in 2018 to define human subjects to include “information about [not just physical interventions and interactions with] a living individual”. Artificial intelligence and machine learning (AI and ML) research involving human data challenges the federal human subjects guidelines because it is difficult to define “about whom” the data is being collected.

This White Paper is intended to be used as a basis for further discussion. We seek feedback on it to inform future iterations of the recommendations it contains. Our aim is to help IRBs build their capacity as regulatory bodies responsible for protecting human subjects in research. We provide recommendations on how AI HSR can be reviewed and adequately overseen within the current regulatory framework until a more thorough regulatory framework can be developed. We also include a decision tree for human subjects and exempt category four (4) (secondary use) determinations, based on the Office for Human Research Protections (OHRP) current guidance².

For IRB professionals, the questions arise in two realms: Is the activity “human subjects research” and, if yes, does it meet Exempt criteria?

Check out our new White Paper that discusses how to make AI HSR determinations.


  1. Protection of Human Subjects, 39 Fed. Reg. 105, 18914-18920 (1974) (to be codified at 45 C.F.R. § 46).
  2. United States Department of Health and Human Services (DHHS). “Human Subject Regulations Decision Charts”. Accessed 30 December 2021.

Artificial Intelligence Human Subjects Research (AI HSR) IRB Reviewer Checklist (with AI HSR and Exempt Decision Tree)

IRBs tread lightly when it comes to the oversight of AI human subject research (AI HSR). This may be due to an insufficient understanding of when AI research involves human subjects. It may also stem from a fear of scope creep (whose role is it to ensure responsible and ethical AI in human subjects research?). Admirably, in response, some have proposed the establishment of commercial AI Ethics Committees, while others try to fit AI ethics review into an ancillary review process. Ancillary AI ethics committees either take on the look and feel of a scientific review committee or treat the process like an IBC or SCRO committee. I argue that IRBs can (and should) fit AI HSR within their current IRB framework in many significant and meaningful ways without committing scope creep.

Admittedly, the current framework has limitations, whether the research is AI HSR or any other type. However, moving AI HSR oversight to an ancillary committee is not an efficient solution for researchers, who will still have to navigate the IRB for their projects in addition to these extra bureaucratic hoops. Ancillary AI HSR committees only delay approval and disincentivize compliance. Rather than build a new AI HSR IRB or ancillary review committee, we need to provide and require AI HSR education and training for IRB administration and remind the IRB of its duty to ensure that relevant experts sit on the Board when reviewing specific research.

While it may be ideal for institutions with no IRB to outsource their reviews, for institutions with a home IRB there are multiple downsides to outsourcing AI HSR oversight. Below are a few that come to mind:

1)    Cost: The study team may need to plan for additional funding if the review isn’t free (i.e., when it isn’t done in-house). Additional reviews for modifications or annual renewals may be required, which would add to that cost.

2)    Duplication of Effort: An AI Research Review Committee (AIRC) typically acts as an ancillary review to IRB review. However, many if not all of the issues reviewed would parallel IRB review and cause duplication of effort, time and money.

3)    No binding regulatory power: If an AIRC (or any AI ancillary review) has recommended changes to the protocol, the committee likely won’t have any regulatory “teeth”. This means that the researchers will not be required or inclined to comply with its “suggestions”. Additionally, these suggestions may or may not make their way to the IRB unless there is infrastructure established that keeps the two committees “talking to each other”.

4)    Sustainability: A sustainable administrative process would need to be developed for the committee.

The key to AI HSR ethical review and research compliance oversight is the need to focus on the data. AI/ML depends largely on the model, but it depends even more on the data. Therefore, the IRB’s focus should be weighted more heavily toward the data used to train the model than toward the algorithm/model itself. IRBs are better suited to address data concerns than technology (though the technology may require additional risk assessment by the IT department). These issues can be addressed by using a quality AI HSR checklist, providing adequate board member training, and adding an AI and data expert to the review board. Ancillary and commercial AI HSR IRB committees are innovative and helpful in their own unique ways, but none of them addresses the fundamental issue at the forefront of AI HSR oversight, which is that we already have the tools and protections in place. We simply need to better understand and utilize them.

We have a lot of work to do! I’ve created an Artificial Intelligence Human Subjects Research (AI HSR) IRB Reviewer Checklist to get this dialogue started.

You can find this in the Creative Commons under an Attribution-NonCommercial-ShareAlike license. Please feel free to distribute, remix, adapt, and build upon the material for noncommercial purposes only (modified material must be under identical terms).

Artificial Intelligence Human Subjects Research IRB Reviewer Checklist (with AI HSR and Exempt Decision Tree) © 2021 by Tamiko Eto is licensed under CC BY-NC-SA 4.0. To view a copy of this license, visit

What is Artificial Intelligence Human Subject Research (AI HSR)? Defining “human subject” and “generalizable knowledge” in AI HSR Projects

The current regulatory challenge IRBs face when reviewing novel technologies, specifically AI, is identifying when the use of AI in research constitutes human subject research. Taking AI as we understand it and the federal definitions of “human subject” and “research” feels like handling a shape sorting toy, where instead of putting a square block into a square hole, we’re trying to shove a misshapen block into a toy that doesn’t even have the shape we’re holding. Before we jump to the conclusion that AI doesn’t fit the current regulatory framework, however, let’s take a look at how it does.

Defining Human Subject: First and foremost, when we think of AI, we might be thinking “complicated technology” or “algorithms”, but what we need to be thinking is simply “data”. Next, we must understand the difference between human-focused datasets and not human-focused datasets. Identifying these differences from the beginning of the project should help IRBs and researchers identify which projects fall under their oversight and which do not.

Human-focused datasets are just what they say: they are datasets used or created to understand humans, human behavior, or the human condition. Not human-focused datasets, on the other hand, might involve human data, but this type of AI research is not meant to help us understand humans, human behavior, or human conditions. Because such projects usually focus more on products and processes, they would not generally be considered AI HSR. This is in general alignment with the current framework, but the line isn’t always clear because, oftentimes, the datasets are intended to serve both purposes. In that case, the project should still be considered human-focused.

Take, for example, datasets collected on social media compared with patient healthcare datasets. Both could technically be human-focused or not human-focused depending on the intended purpose of the data and/or the AI’s role (i.e., what the AI is intended to accomplish). If the AI is used to help us understand human behavior or health conditions, then we would call it a human-focused dataset. If the focus or role of the AI is solely to improve a platform, product, or service, then the project is likely not human subject research.

Using the current definition provided in the Revised Common Rule, we then need to identify whether the project meets the federal definition of “human subjects”. In other words, does the research involve a living individual about whom the investigator obtains information through intervention or interaction, and uses, studies, or analyzes that information?

Often, IRBs are presented with applications that claim the study does not involve human subjects, or that the data is collected from humans but is not “about them”. Rather than take that claim at face value, we need to start with two questions:

1) Is it human-focused data?

2) Is the study intended to contribute generalizable knowledge?

Because the vast majority of AI studies are intended to learn and model human behavior, putting these questions at the forefront is key. If the AI is intended to model human behavior, these studies generally meet the first part of the federal definition of human subject.

Once we get that squared away, we want to remember the second part of the definition of human subject. As recently introduced through the Revised Common Rule, for a project to involve human subjects, the PI must either conduct an intervention or interact with the participant, or obtain, use, analyze, or generate identifiable private information. One might argue that if the data is neither private nor identifiable, the project does not involve human subjects. But what we are seeing now in many AI studies is that AI depends on large datasets and on linking datasets to other datasets (both private and public), which opens up the possibility of “generating” identifiable information. We also see the extensive use of biometric data such as recorded face or voice prints, ocular scans, and even gait, which are all considered identifiable information. Taking these things into consideration will help IRBs make HSR determinations.

The next question we must ask is whether the project meets the federal definition of research. We define research as:

“a systematic investigation, including research development, testing, and evaluation, designed to develop or contribute to generalizable knowledge.”

What most IRBs are challenged with these days is fitting algorithm development, validation, and evaluation, and their role in the larger study, within this definition. Here lies the most challenging aspect of making Human Subject Research determinations: it requires a common understanding of what constitutes “generalizable knowledge”.

For now, we as IRB professionals understand generalizable knowledge to be:

“information where the intended use of the research findings can be applied to situations and populations beyond the current project.”

With this definition, IRBs can determine, based on the study aims and the role of the AI in achieving those aims, whether the project is “research” per the federal definition. However, there is currently no federal definition of “generalizable knowledge”, so the determination is made inconsistently and subjectively.
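
The questions discussed above can be sketched as a small screening function. This is only an illustrative sketch under stated assumptions, not the checklist, the decision tree, or any official OHRP chart: the class `ProjectFacts`, its field names, the function names, and the wording of the returned labels are all hypothetical, and every yes/no answer still has to come from a reviewer’s judgment about the protocol.

```python
from dataclasses import dataclass


@dataclass
class ProjectFacts:
    """A reviewer's yes/no answers about one project (hypothetical field names)."""
    human_focused_data: bool                # data used to understand humans, behavior, or conditions
    about_living_individuals: bool          # the data concern living individuals
    intervention_or_interaction: bool       # the PI intervenes or interacts with participants
    identifiable_private_info: bool         # the PI obtains, uses, analyzes, or generates it
    systematic_investigation: bool          # development, testing, and evaluation are planned
    generalizable_knowledge_intended: bool  # findings are meant to apply beyond this project


def involves_human_subjects(p: ProjectFacts) -> bool:
    # First part: human-focused data about living individuals.
    first_part = p.human_focused_data and p.about_living_individuals
    # Second part: intervention/interaction OR identifiable private information.
    second_part = p.intervention_or_interaction or p.identifiable_private_info
    return first_part and second_part


def is_research(p: ProjectFacts) -> bool:
    # Systematic investigation designed to contribute to generalizable knowledge.
    return p.systematic_investigation and p.generalizable_knowledge_intended


def screen_ai_project(p: ProjectFacts) -> str:
    """Return a screening label; it is a prompt for review, not a determination."""
    if not involves_human_subjects(p):
        return "likely not human subjects research"
    if not is_research(p):
        return "human subjects involved, but likely not research"
    return "likely AI HSR: proceed to exemption assessment"


# Hypothetical example: a model trained on linked patient records to study
# health conditions, with no direct contact with the patients.
example = ProjectFacts(
    human_focused_data=True,
    about_living_individuals=True,
    intervention_or_interaction=False,
    identifiable_private_info=True,
    systematic_investigation=True,
    generalizable_knowledge_intended=True,
)
print(screen_ai_project(example))
```

Keeping the two federal definitions in separate functions mirrors the order of the questions in the text, and it makes the subjective step visible: the answer for `generalizable_knowledge_intended` is exactly where reviewers are most likely to disagree.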

So Now What?

In contrast to the current Common Rule guidelines, the FDA and other regulatory bodies have published quite a bit of guidance on where algorithms fit within their larger framework of Software as a Medical Device, and they have numerous resources available for IRBs, sponsor-investigators, and manufacturers.

So, until a definitive policy or guidance is set for AI HSR under the Common Rule, institutions may want to incorporate into their review processes some of the FDA considerations available now, even if the projects aren’t always FDA regulated. This encourages review consistency across projects and helps ensure that various requirements, such as the General Data Protection Regulation (GDPR) or 21 CFR Part 11, if applicable, are being met. Note: flexibility is encouraged, depending on the project, as the protocol may not call for some, or any, of these additional protections.

The current regulatory framework that IRBs use in the oversight of human subject research, including guidance from the FDA, has been in place for decades and is updated regularly as society and research evolve. Most recently, for example, the Revised Common Rule brought about several changes that were intended to streamline processes and reduce regulatory burdens. While AI as a technology is not new, its use in human subject research is expanding so quickly that we, as oversight bodies, can no longer take a “wait and see” approach. We are called to take action and are challenged with keeping up with this rapidly changing technology as study designs begin to implement it for investigational and non-investigational purposes. Just as we’ve always done, we are being called to look at what we have and at how to improve upon it to meet the changing field. I argue that if we start with shared definitions of “human subject” and “generalizable knowledge”, our mission will be much less challenging.