In criminal justice circles, “big data” is the new buzzword: police departments are experimenting with the application of computer algorithms to vast amounts of digitized data to predict the future geographic location of crimes, to identify those people likely to become involved in gun violence, and to assess future criminality for the purpose of setting bond amounts and determining sentences. It turns out, though, that algorithms have problems. They can reflect the biases and choices of the humans who create them. They can also be plain wrong.
Besides algorithms, there is a more basic problem. The data itself can contain countless mistakes, inaccuracies, and discrepancies. While the wrong address, the invalid warrant, and the mistakenly recorded conviction don’t sound like particularly new problems (they aren’t), they represent an urgent but overlooked issue in our information-dependent world. This data determines how the government distinguishes between the dangerous and the low-risk, those who should be arrested and those who should be left alone. However, as Wayne Logan and Andrew Ferguson point out in their insightful and important article, Policing Criminal Justice Data, this “small data” is too often dead wrong. To make matters worse, there is little incentive for government agencies—at any level—to care. Their discussion is a must-read for anyone interested in the increasingly important role of information distribution and control in criminal justice.
Criminal justice information errors have enormous costs in the lives of ordinary people. Consider the problem of an erroneous arrest warrant, wrong perhaps because it is meant for a person whose name is close enough to, but not exactly like, yours. Should that mistake lead the police to arrest you, you may—indeed are likely to—become subject to a search of your person, and perhaps later a strip search in jail and a compulsory DNA sample. A night in an overcrowded and sometimes dangerous jail isn’t just a loss of liberty; it’s an exposure to some very real harms. The resulting arrest record may also harm your future chances of employment and much else.
Should we care? Absolutely. At the individual level, such mistakes can be not only demeaning but ruinous in a very practical sense. Every mistaken arrest hurts the victim and misdirects government attention to the wrong places. More abstractly, collecting and generating so much information about its citizens obliges a government to make reasonable efforts to guarantee data accuracy. Without that expectation of good faith, the government risks losing our trust in it.
The legal remedies for these mistakes are weak. Take, for example, the options of a person who has been the victim of an erroneous arrest because her name was similar to the one on an outstanding arrest warrant. Police have considerable latitude to make mistaken arrests. In addition, recent U.S. Supreme Court case law has narrowed the scope of the exclusionary rule in cases of “good faith” mistakes. Civil suits against the police are no better, since qualified immunity typically shields officers from liability for informational mistakes. Even where legal avenues are formally available to victims of the government’s data errors, few people have the time, resources, or expertise to challenge the black box of most government databases. The federal and state governments have little incentive to change the situation.
Logan and Ferguson, after explaining this abysmal state of affairs, offer thoughtful solutions aimed at much-needed institutional change. The federal government, in particular, can be an important driver of reform, since federal money has played such a large role in the growth of state criminal justice databases. Federal resolve to improve data quality might take the form of quality assurance measures, such as mandatory audits. States too can play a critical role in providing individual legal remedies for criminal justice data errors. An underlying theme here is the importance of cultural change: caring about data quality is perhaps even more important than the details of its practical implementation.
We live in an age of the algorithm, but we also live in the age of mass information. Nowhere is the cost of mistaken information more tangible than in criminal justice. As Logan and Ferguson so persuasively show in Policing Criminal Justice Data, those data errors are at the core of government trust and accountability.
Jane Bambauer, Hassle, 113 Mich. L. Rev.
Every Fourth Amendment scholar is familiar with the concept of “individualized suspicion.” The classic example comes from Terry v. Ohio, where Officer McFadden watched two men walk up and down in front of a storefront numerous times, consult with another individual, and then return to checking out the storefront. The Supreme Court held that, while McFadden did not have probable cause for arrest, he had a “particularized” belief that the three men were up to no good and thus could stop them and, when they gave unsatisfactory answers about their activity, frisk them as well.
That type of case is often contrasted with what are sometimes called “suspicionless” searches and seizures. The classic example of that type of police action is the license or sobriety checkpoint that stops individuals who drive up to it. The Court has indicated that such seizures are permissible despite the absence of suspicion that any particular driver seized has an expired license or is drunk, as long as the police stop everyone who comes to the checkpoint or rely on neutral criteria in deciding whom to stop (such as whether the car occupies a pre-selected position in line).
Although the distinction between the two situations strikes most observers as intuitive, it is blurrier than it might initially appear. Seizures at license checkpoints are based on suspicion in the sense that the department operating them believes that a certain percentage of drivers stopped will have expired licenses. Thus, while the suspicion with respect to any particular driver is very low, it is still the case that every car stopped at the checkpoint is associated with some degree of suspicion. At the same time, one could say the stop in Terry was based on the same type of “generalized suspicion” involved in the checkpoint scenario, in the sense that Officer McFadden was operating on preexisting stereotypes about the behaviors that are consistent with burglary.
As modern policing increasingly relies on algorithms and profiles, the connection between “suspicion-based” and “suspicionless” searches and seizures will become increasingly obvious. Facial-recognition technology, data-mining algorithms, hot-spot policing, and other predictive policing techniques allow police to scan large segments of the population for suspicious activity or individuals. Although these techniques function like checkpoints, they are based on calculations that the individuals identified are more likely to be involved in criminal activity than those who do not fit the profile.
Enter Jane Bambauer’s article Hassle. Bambauer begins by making clear why the word “individualized” in the phrase “individualized suspicion” obscures the fact that, in both Terry-type cases and checkpoint-type cases, police who conduct searches and seizures are acting with some quantum of suspicion about the person, entity, or item affected. The only difference is that in the situations usually thought of as individualized suspicion cases, courts specifically discuss whether that quantum is sufficient, whereas in “suspicionless” cases (often involving what the courts call “special needs”), they don’t.
Bambauer also debunks the idea that individualized suspicion is somehow more accurate or more desirable than generalized suspicion. Scholars have decried the use of profiles on the ground that they have significant error rates. But so do all searches and seizures. Some factors—such as race—should never appear in profiles, both because using such a factor is particularly repugnant and because race is not a very good predictor of crime. But, as the example with Officer McFadden illustrates, even cases we call “individualized” ultimately rest on profiles.
Others have made this point. As Bambauer notes, Fred Schauer has stated: “[O]nce we understand that most of the ordinary differences between general and particular decisionmaking are differences of degree and not differences in kind, we become properly skeptical of a widespread but mistaken view that the particular has some sort of natural epistemological or moral primacy over the general.” The more innovative part of Bambauer’s article—the “hassle” part—is the explication of how the individualization requirement has inadvertently acted as a brake on dragnet searches and seizures. As Bambauer defines it, hassle measures the chance that an innocent person will experience a search or seizure. When courts require the cop on the street to have “individualized,” as opposed to generalized, suspicion for a stop, they are not only requiring officers to have good justification for their actions but also implicitly prohibiting police from hassling large numbers of innocent people. As Bambauer puts it, “individualization has kept hassle low by entrenching old methods of investigation,” methods such as relying on tips and individual conduct rather than technologically oriented panvasive techniques.
One might react to this point by concluding that the courts’ take on individualization is a good thing. But not Bambauer. She points out that modern techniques can improve policing by reducing error rates, limiting reliance on vague suspicion factors such as “nervousness” or “bulges” (which can often be covers for race), and making policing more evidence-based. Bambauer also recognizes, however, that these techniques come with a cost—a potential for increased hassle. Thus, she argues that the Fourth Amendment requires attention not only to “hit rates” (the suspicion part of individualized suspicion) but also to hassle rates (the number of innocent people affected by a given police technique). She suggests that hassle can be limited through keeping profile programs small or through randomization that reduces the number of people affected by the search or seizure. Another possibility—most likely relevant when, as with checkpoints, significant hassle cannot be avoided—is to ensure hassle rates are explicitly contemplated and authorized by a legislative body representative of those people likely to be affected by the search or seizure.
Bambauer begins her article with a hypothetical. Suppose an officer comes to a judge seeking a warrant based on a methodologically sound study showing that 60% of Harvard dorm rooms contain drugs. The officer also provides the judge with an affidavit listing ten dorm rooms selected through a random number generator and stating that no other dorm rooms will be searched on the basis of the study. The first piece of information provides the hit rate (a high one). The second ensures that the hassle rate will be low. Bambauer thinks the warrant should issue. Whether or not you agree, her article points the way to interpreting the Fourth Amendment in a way that better regulates old techniques and provides a methodology for evaluating new ones.