I. INTRODUCTION
Automated decision-making plays an increasingly large role in policing.1 Traditional methods of police investigation have been augmented by tools like facial recognition, predictive analytics, license plate readers, and robotics.2 These tools allow the police to sift through large amounts of information at a scale and speed not practicable with human skills alone. This reliance on artificial intelligence, however, has prompted numerous questions about how to balance criminal investigation needs with concerns about fairness, bias, transparency, and accountability. These concerns are not unique to policing. Similar calls for “algorithmic accountability” appear in healthcare, banking, credit scoring, public benefits, and employment.3
How should we evaluate the growth of automation in policing? There is no shortage of answers, but this Article starts from a simple premise: focus first on automation’s harms to persons. American policing is rife with reckless automation. The highly decentralized system of policing in the United States, with its more than 18,000 agencies,4 permits and encourages experimentation with new technologies. Innovation in policing can, of course, be positive. Crime control and public safety are complex and evolving social problems, and over time, the police change their tactics and tools to address them.
But new technologies that rely on artificial intelligence and the vast amounts of digital information now available have introduced new problems.5 Police departments have bought, licensed, adopted, and experimented with technologies that impact communities through increased but invisible surveillance, and with mistakes that impose real-life consequences in police-civilian interactions. And these technological experiments are often deployed in places or against communities that have already been overpoliced. Those who are disproportionately and frequently affected by these experiments are black, brown, low-income, and without significant political power. We should identify this development as reckless automation in policing.6
Reckless automation has tangible consequences: its mistakes lead to street stops, arrests, and traffic stops of individuals. Communities also experience the psychological costs of pervasive (and sometimes barely visible) automated surveillance. Sometimes these policing experiments are conducted without the knowledge of the communities involved.
If we accept the premise of reckless automation, the conversation about accountability, artificial intelligence, and policing might benefit from a seemingly unrelated policy framework: that of experimentation on human subjects. The comparison may seem far-fetched. Yet even the police may think of their new technologies in the same way that the medical community approaches experimentation. The Los Angeles Police Department, for instance, referred to the amount of police time spent at a place identified by a predictive policing program as “dosage.”7 Borrowing from that framework does not imply that reckless automation in policing is the literal equivalent of medical or psychological experiments on human subjects. Nor does such a comparison imply that the technical aspects of institutional review boards should apply directly to new policing strategies.8 But turning to a bioethical framework has value because it draws attention to the subjects of policing—the communities affected. To the extent that the ethical considerations applied in human subjects research provide useful insights into the many changes in policing, they open a new conversation. What if we think of new forms of automated decision-making in policing as experiments on communities that might impose harms with life-altering decisions?9
II. AUTOMATION AND ACCOUNTABILITY
We all know about the influence of artificial intelligence and automation in our lives, from the most mundane experiences, like picking favorite songs or movies, to more important decisions like who should receive job interviews or who should receive government benefits.10 That automation is also a part of policing. License plate readers today use algorithms to quickly identify individual plates. Facial recognition technology can quickly identify faces in fixed databases or in real-time scans. Software identifies high-risk persons and places. All of this automation falls under the umbrella of “artificial intelligence”: the use of machines to assume cognitive tasks usually performed by people.11 That is a very broad definition, and rightly so: artificial intelligence can involve everything from the application of straightforward algorithms12 to more complicated examples of machine learning that adapts to identify patterns in data.13 Some artificial intelligence simply provides more information for human decisionmakers (risk assessments in finance14), while other forms perform the analysis and the action (hiring and employment decisionmaking).15
But automation also poses questions about bias, secrecy, unaccountability, and mistakes that are hard to spot when the decision originates from a machine and not a person.16 While the United States lacks a comprehensive data protection regime, many regulatory approaches have been proposed and some have become law. These proposals can apply to automated decision-making generally, or to more specific subject matters like policing and criminal justice.
We can summarize some of the predominant approaches to ethics and accountability in artificial intelligence.
First, because machine learning involves the identification of patterns from enormous data sets, the constitution of that training data can be a problem.17 If a dataset of faces has many more white people than non-white people, then a facial recognition program instructed to identify faces can misidentify those who are not white at much higher rates than whites.18 One study of facial recognition algorithms found that while white men were correctly identified nearly all the time, black women were incorrectly identified up to a third of the time.19 Put simply, biased data will lead to biased results.20 This concern has prompted calls both for better training data and for bans or pauses on the use of facial recognition technology until these problems have been addressed.21
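To make the mechanics of that disparity concrete, the sketch below shows, under stated assumptions, how an audit can surface group-level differences in misidentification rates. The evaluation records and group labels are hypothetical illustrations, not data or methodology from the cited study.

```python
# A minimal sketch, assuming hypothetical evaluation data, of how an audit can
# measure misidentification rates separately by demographic group. It only
# illustrates the idea that error rates can be computed and compared per group.
from collections import defaultdict

# Each record: (demographic_group, predicted_identity, true_identity)
evaluation_results = [
    ("group_a", "person_1", "person_1"),
    ("group_a", "person_2", "person_2"),
    ("group_b", "person_3", "person_4"),  # a misidentification
    ("group_b", "person_5", "person_5"),
    # ... a real audit would use many thousands of labeled records
]

errors = defaultdict(int)
totals = defaultdict(int)
for group, predicted, actual in evaluation_results:
    totals[group] += 1
    if predicted != actual:
        errors[group] += 1

for group, total in totals.items():
    rate = errors[group] / total
    print(f"{group}: misidentification rate = {rate:.1%}")
```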
Relatedly, there can be bias in the algorithms themselves. People create algorithms, and their assumptions about the appropriate design and execution of the algorithm may create further biases. A recruitment algorithm, for instance, that uses men as the model for professional “fit” will disadvantage female applicants.22 Of course, bias is not a new idea. But in this context, bias—whether in training data or in the instructions themselves—can magnify inequalities by reproducing these effects on a very large scale. Some scholars have proposed public access to both source code and data sets as a solution.23
Another proposal in artificial intelligence policy is the call for explainability. With some complex uses of artificial intelligence, programmers may not be able to explain exactly why or how particular outcomes have been achieved. This black box problem means that a person may not be able to know why a particular prediction or decision was made.24 Thus, there have been calls for a “right to explanation” when, for example, a person receives an adverse employment decision through automation.25 And although calling for the explanation of why an automated process led to a person’s rejection for a loan, a job, or benefits has appeal, there is not yet widespread consensus on what form that explainability should take.26 The right to an explanation can be combined with other tools from a legal framework to provide individuals with “technological due process” rights in automated decision-making.27
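One way to picture what an explanation might look like, at least for a simple and interpretable scoring model, is to report each input’s contribution to the decision. The sketch below is a hypothetical illustration with invented feature names, weights, and threshold; it deliberately sidesteps the black box problem by assuming a transparent linear model, which many deployed systems are not.

```python
# A minimal sketch of one possible form an "explanation" could take for an
# automated adverse decision: list each input's contribution to the score of a
# simple linear model. All names and numbers here are hypothetical; truly
# black-box models resist this kind of direct readout.
weights = {"income": 0.8, "debt_ratio": -1.5, "years_employed": 0.4}
applicant = {"income": 0.2, "debt_ratio": 0.9, "years_employed": 0.1}
threshold = 0.0  # scores below this lead to denial

score = sum(weights[f] * applicant[f] for f in weights)
decision = "approved" if score >= threshold else "denied"

# Rank features by the magnitude of their contribution to this applicant's score.
contributions = sorted(
    ((f, weights[f] * applicant[f]) for f in weights),
    key=lambda item: abs(item[1]),
    reverse=True,
)

print(f"Decision: {decision} (score = {score:.2f})")
for feature, contribution in contributions:
    print(f"  {feature}: contributed {contribution:+.2f}")
```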
Another critique of opacity in automated decision-making arises because these tools are often developed within the private sector.28 Government entities, including law enforcement agencies, typically do not design or create these systems.29 Instead, public agencies usually stand in a customer-vendor relationship with private companies and then adopt the tools of algorithmic decision-making as a matter of purchase, lease, or contract.30 These relationships complicate accountability considerably. If a person receives an adverse decision for government benefits because of a prediction tool developed privately, the agency may be unable to provide an explanation for the reasoning because the vendor invokes its proprietary interests and refuses to provide information. Criminal defendants have encountered problems, for example, in trying to access the source code for the privately developed “probabilistic typing” software that has identified them as a suspect by analyzing DNA samples that are usually too difficult for traditional forensics labs to assess.31
These approaches to algorithmic accountability each identify important problems in the uses of automated decision-making. Each proposes solutions that can be implemented in new regulations and agency decision-making. And all have influenced efforts to regulate automated decision-making around the country. In policing, concerns about secrecy, for instance, have led some local governments to impose notice and reporting requirements on new uses of surveillance technologies by their police departments.32 But all of these share a similar premise: that algorithmic accountability addresses a technological process that requires new forms of regulation in order to be implemented fairly. These are solutions for fixing machines. What if we started somewhere else?
III. A DIFFERENT FRAMING: EXPERIMENTATION
The ethical review of research involving people today is standardized and grounded in historical experience. Biomedical and behavioral research involving human subjects is generally subject to institutional review boards that focus on the potential ethical consequences of that work. Federally funded research must abide by federal regulations regarding human subjects, including informed consent procedures.33 “Human subjects” refers to any “living individual about whom an investigator” then obtains “information or biospecimens through intervention or interaction with the individual and uses, studies, or analyzes the information,” or does the same for “identifiable private information.”34 Additionally, research is defined as “a systematic investigation, including research development, testing, and evaluation, designed to develop or contribute to generalizable knowledge.”35
The system for protecting human research subjects today is heavily influenced by a report written in 1978 by the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research.36 Known as the Belmont Report, it was prompted by notorious abuses of human research subjects, including the 1972 reporting on the infamous Tuskegee Study.37 In that experiment, the U.S. Public Health Service offered to treat 600 African American men for “bad blood” in exchange for meals, medical exams, and burial insurance.38 They were not informed that the actual purpose of the study was to examine the effects of untreated syphilis, and they were denied access to a cure. In writing the Belmont Report, the Commission took the view that “risk-laden, albeit promising research” might not be justified “merely on the strength of its potential social benefits.”39
The hallmarks of the Belmont Report are its three fundamental principles: respect for persons, beneficence, and justice. Respect for persons includes the assumption that “individuals should be treated as autonomous agents” whose “considered opinions and choices” are entitled to respect.40 Beneficence requires efforts to secure the “well-being” of research subjects, including maximizing possible benefits and minimizing possible harms to them.41 The third principle the Report emphasizes is justice: that “an injustice occurs when some benefit to which a person is entitled is denied without good reason or when some burden is imposed unduly.”42
For human subjects research, these three principles translate into practical steps: obtaining the informed consent of research subjects, conducting a risk/benefit assessment about whether to perform the research, and carefully selecting subjects. As to this final concern, the Belmont Report raises this note of caution:
Injustice may appear in the selection of subjects . . . . Thus injustice arises from social, racial, sexual and cultural biases institutionalized in society. Thus, . . . unjust social patterns may nevertheless appear in the overall distribution of the burdens and benefits of research.43
These three considerations—respect for persons, beneficence, and justice—are useful rubrics for thinking about the ethics of technology in policing. The technologies on which the police increasingly rely all promise advances in how investigations are conducted. But their costs, whether through mistaken arrests and stops or pervasive surveillance, have yet to find a meaningful ethical framework.
IV. AUTOMATION, BIOETHICS, POLICING
How can such a different conversation on law and policy regarding human subjects research inform the one about police automation? The algorithmic accountability movement has offered many proposals to address the problems of automation. But these conversations focus first on the technology: how to modify, reform, and monitor both the data and the design of automated decision-making. All of this remains important. But to return to the question posed at the beginning of this Article: what if we think of new forms of automated decision-making in policing as experiments on communities that might impose harms with life-altering decisions?44
First, we know that surveillance of all kinds is distributed unevenly in society. Receipt of public benefits can often require subjection to drug tests, fingerprinting, verification requirements, and privacy-intrusive questions.45 Non-white, low-income communities are more often subjected to heavy-handed police presence and surveillance than their wealthier and whiter counterparts.46 Online, low-income Americans face disadvantages because they usually buy less expensive digital devices with fewer privacy protections and possess fewer digital skills to keep their information private.47 Low-wage work is particularly subject to tracking of movements, productivity, drug tests, and other forms of surveillance.48
In addition, non-white, low-income communities are subjected not just to more surveillance, but also to more combinations of surveillance, than other groups.49 The potential harms from living with pervasive and inescapable surveillance are quite real.50 People in low-income communities can find it difficult to protect their privacy and to correct mistakes that lead to adverse decisions in housing, credit, and policing. This means some communities live with less autonomy and freedom from surveillance than other groups do.
In the case of policing, we might consider the decision to “test” a new automated decision-making tool on a community as a form of experimentation whose impacts might benefit from the ethical concerns of human subjects research. When a law enforcement agency decides to pilot or adopt automated decision-making today, it might do so without informing the community policed or receiving its consent or input; without explicit consideration of whether the experiment maximizes benefits while minimizing harms; or without considering whether a program unduly burdens that community. We would not condone a medical experiment on a community with potential psychological and physical impacts without ethical approvals. But none of the ethical principles fundamental to human subjects research are usually considered for new policing technologies.
The absence of such ethical considerations is striking when we know that police departments do in fact experiment with, and sometimes abandon, automated decision-making programs that have significantly impacted the communities affected. Consider two recent examples.
In 2012, the Chicago Police Department began to use risk models, popularly described as its “Heat Lists,”51 to identify those who were likely to become victims or perpetrators of gun violence within the next eighteen months.52 A research team from the Illinois Institute of Technology created the risk models and calculated the “scores” for individuals—everyone who had been arrested in the four years before the calculations were made.53 The higher the risk score, the higher the chance that a person would become a “Party to Violence,” either as a victim or perpetrator of gun violence.54 These scores were available to all Chicago Police Department personnel, as well as through its crime mapping software.55 By 2018, there were nearly 400,000 people with individualized risk scores under the program. The majority of black men in Chicago between the ages of 20 and 29 had a risk score under the program.56
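To illustrate the general mechanics of arrest-based risk scoring, and why such scores can attach to people who were never convicted, the sketch below uses invented weights, dates, and tier cutoffs. It is not the Chicago program’s model; it only shows how a score built from arrest records, rather than convictions, can be computed and binned into tiers.

```python
# A deliberately simplified, hypothetical sketch of arrest-based risk scoring.
# This is not the Chicago program's model; the weights, recency discount, and
# tier cutoffs are invented. It illustrates one point made in the text: a score
# built on arrest records can rank people who were never convicted of anything.
from datetime import date

def toy_risk_score(arrest_dates, reference_date=date(2018, 1, 1)):
    """Sum a weight for each prior arrest, discounted by how long ago it occurred."""
    score = 0.0
    for arrest_date in arrest_dates:
        years_ago = (reference_date - arrest_date).days / 365.25
        score += max(0.0, 10.0 - years_ago)  # more recent arrests count more
    return score

def toy_risk_tier(score):
    """Bin a numeric score into the kind of tier an officer might see."""
    if score >= 25:
        return "high"
    if score >= 10:
        return "medium"
    return "low"

# Hypothetical record: three arrests, none of which led to a conviction.
arrests = [date(2016, 3, 5), date(2017, 7, 20), date(2017, 11, 2)]
score = toy_risk_score(arrests)
print(f"score = {score:.1f}, tier = {toy_risk_tier(score)}")
```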
In its 2020 report, the City of Chicago’s Inspector General identified numerous “concern[s]” with the department’s predictive program.57 The individualized scores and risk tiers used in the program were found to be “unreliable.”58 In addition, the scores, premised on arrests that did not necessarily lead to conviction, may have been the basis of police interventions that “effectively punished individuals for criminal acts for which they had not been convicted.”59 Having a high risk score may have led to some people receiving harsher charging decisions in subsequent arrests, even if they had never been convicted for the prior arrest.60 The Department formally decommissioned the program in November 2019.61
The risk assessment program used in Chicago is an example of reckless automation. Armed with a new data-driven project and $3.8 million in federal funding, the police department experimented with a program that led to numerous “Custom Notification Program” interventions: visits to the homes of persons identified as high risk.62 These visits were formally described as opportunities to connect high-risk persons to social service programs, but in many cases may have been no more than “going door-to-door notifying potential criminals not to commit any violent crimes.”63
Consider another example: the mistaken arrest of Robert Julian-Borchak Williams.64 Detroit police had been investigating the theft of $3,800 worth of watches from a local store in 2018. An examiner for the Michigan State Police uploaded a still image from the store’s surveillance video to the state’s facial recognition database. The system would have searched for potential matches in a database of 49 million photos. The facial recognition technology used in this investigation was supplied by a private vendor that began as a mugshot management software business and later added facial recognition tools developed by subcontractors.65 But the vendor that contracts with the state does not measure these systems for accuracy or bias.
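The following sketch suggests, under stated assumptions, how a search like this typically produces a candidate list: compare an embedding of the probe image against a gallery of enrolled embeddings and return the most similar entries. The gallery, embeddings, and function names are hypothetical, not the vendor’s actual system, and the output is an investigative lead rather than an identification.

```python
# A minimal sketch, not the vendor's actual system, of how a facial recognition
# search typically generates a "report with potential matches": compute the
# similarity between a probe image's embedding and every enrolled embedding,
# then return the top-k candidates. Real galleries hold millions of photos and
# use specialized indexes; this toy version compares a handful of vectors.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def candidate_report(probe_embedding, gallery, k=5):
    """gallery maps photo IDs to embeddings; returns the k most similar IDs."""
    scored = [
        (photo_id, cosine_similarity(probe_embedding, embedding))
        for photo_id, embedding in gallery.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]  # a candidate list is a lead, not an identification

# Hypothetical three-dimensional embeddings; real systems use far more dimensions.
gallery = {"photo_001": [0.1, 0.9, 0.3], "photo_002": [0.8, 0.2, 0.5]}
probe = [0.2, 0.8, 0.4]
print(candidate_report(probe, gallery, k=2))
```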
In Williams’s case, the Detroit police would have received a report with potential matches generated by the software. Algorithms incorporated into the software used in the case were also found in a 2019 federal study to have significant inaccuracies in identifying African-American and Asian faces as compared to Caucasian ones.66 A store employee identified Williams from a photo lineup generated from that report. Confronted with the video still, Williams asked: “You think all black men look alike?”67 Detroit detectives eventually acknowledged that they arrested the wrong person. But that admission came after Williams had been handcuffed and arrested at home in front of his wife and daughter, had his mug shot, fingerprints, and DNA sample taken, and had been held overnight in jail in 2020.68 Williams subsequently sued the Detroit police for his wrongful arrest.69
The pivotal role of facial recognition in this widely publicized wrongful arrest can be discussed in the terms of algorithmic accountability, but it is also an example of reckless automation that offends the principles of respect for persons, beneficence, and justice. Should the police in Michigan have decided to use an automated system that had been shown to make racially disproportionate mistakes, and thus to impose racially disproportionate harms on people? Does such a decision respect the autonomy of the persons affected? Can we conclude that such a decision demonstrates attention to “fair procedures and outcomes” affecting the groups and individuals involved?70 Was there special attention to the fact that any potential harms and mistakes would affect racial minorities disproportionately?71
V. CONCLUSION
The growing calls for algorithmic accountability in the tools of artificial intelligence merit the attention they have received. These calls for attention to bias, secrecy, and inaccuracy have special importance in the use of artificial intelligence in policing, where mistakes and biases can impose life-altering consequences. But these approaches share a common premise: that we need to fix the problems of these technological tools. Improve the machines, and we improve automated decision-making. Until and unless these reforms happen, however, there are people subjected to and harmed by these flawed decisions.
Consider a different perspective: characterizing some adoptions of artificial intelligence in policing as reckless automation that might benefit from the conversations in human subjects research ethics. The long history of ethical lapses in human subjects research has prompted a robust framework that asks fundamental questions about individual consent, community impact, minimizing harms, and special attention to racial minorities, among other groups. The algorithmic accountability conversation brings an important focus on technological processes to policing. This Article urges that a bioethical framework can refocus attention on the human impacts of policing automation. Even in an age of automation, people remain most important.