Introduction to Responsible AI
Models in today’s world have a real, tangible, and sometimes life-changing impact on the lives of real people, bringing to light an important new side...
It seems like every day we are faced with yet another headline about AI, and usually the headline isn’t a positive one. Maybe you remember the chaos around using predicted standardized test scores as truth in the UK when the test was canceled due to COVID-19. Or maybe you’ve heard that facial recognition systems, like the one used to unlock some iPhones, don’t work as well for people of color. There are countless more headlines that we could point to, but all of these less-than-flattering headlines have something in common: the output of the model, in one way or another, put someone at a disadvantage in a way that impacted their life. Put simply, the models caused harm to someone.
In the context of Responsible AI, we use harm as a general term to cover all of the negative outcomes of models that disadvantage someone in one way or another, but the nuance of each occurrence of harm is often lost in the breadth of the term. In some of the scandals we hear about, such as the 2019 incident with the Apple Card automated credit limit assignments, the harm is caused by disadvantaging one group (in this case, women) by taking away or not allowing access to a resource (in this case, credit).
In other cases, the harm is caused by the fact that a given group of people was left out of the data, resulting in situations like the facial recognition problem above, where the model was unable to work on the faces of people of color. While each instance of harm can affect different groups in different ways, let’s explore two of the main categories of harms that AI outputs can cause: Harms of Allocation and Harms of Representation.
Harms of Allocation are the harms caused when resources or opportunities are withheld from a certain group or distributed unequally, particularly by a model. In these cases, models learn the relationships between groups of people and the unfair distribution of resources in the past, and then perpetuate those patterns rather than recognizing that in a just world they would no longer exist. Researchers in Responsible AI are working more visibly on these kinds of problems these days. Let’s look at two examples.
In both examples above, the output of a model drastically changed the direction of someone’s life. I think it’s unlikely that the creators of these models went in trying to perpetuate bias or had any other malicious intent. These cases, and countless others, underscore the importance of having a diverse team of people evaluate the use and output of models, considering how effective they are not only on a mathematical level but also on a social one. We’ll cover fairness in an upcoming post, but the takeaway here is that our data contains patterns we can’t always see, and sometimes the risk of a false assessment may be greater than the reward of a correct one.
Harms of Representation are those harms caused by systems that reinforce the subordination of some groups based on identity (like race, gender, etc.). Kate Crawford explains this concept in her 2017 NIPS keynote, “The Trouble with Bias”.
These systems represent society but don’t allocate resources. This includes many technologies, like social media personalization, search algorithms, and photo editing software, that influence the way we interact with the world, as well as the ways that society interacts with us. There are a number of ways that these harms can be classified. Below, we list just a few of the most common and most researched, but it’s important to note that this list is not all-encompassing. A model will learn from whatever data it is given, and there are many other ways that harms of representation could show up.
Denigration involves the use of culturally disparaging terms when classifying or describing a group of people. For example, in a 2015 scandal, Google Photos mistakenly labeled photos of people of color as “gorillas”. This harm is often caused by a lack of representation in the dataset. If Google Photos had included more photos of people of color in its training set, it is likely that this could have been avoided.
Stereotyping involves the use or perpetuation of a widely held and oversimplified image or idea of a particular type of person or thing. A famous 2016 paper by Bolukbasi et al. explains how word embeddings encode many gender stereotypes, completing analogies such as “Man is to Computer Programmer as Woman is to Homemaker”. Similarly, Google Translate will often change the gender inflection when translating job titles between languages. When systems continue to associate men with technical and academic jobs like computer programmer or doctor, and women with roles like homemaker and teacher, they reinforce these stereotypes.
Recognition occurs when a group is erased or made invisible by a system, such as when a facial recognition training set that didn’t include enough diversity failed to recognize the faces of Asian people. This is yet another instance where a change in training data may have avoided the problem.
Underrepresentation involves a group of people being underrepresented in the output of a system. This is frequently debated, as it can be caused by an unbalanced dataset that might be a reflection of the real world. For example, an image search for “CEO” used to show primarily pictures of white men. Google has since worked to address the problem, but the people on the first page are still predominantly white or white-passing individuals.
Exnomination occurs when the behavior and characteristics of the majority group are considered “normal” or “apolitical” while the same traits of minority groups are questioned and debated. This is well illustrated by the idea that the word “athlete” is usually taken to mean “male athlete”, because if the athlete were not male, the term “female athlete” would have been used. Similarly, a given sports team is often assumed to be a men’s team unless its name is preceded by “women’s” or “lady” or another identifying term.
Harms of Representation can largely be avoided by understanding what is and is not present in a given data set and accounting for that as you move through the stages of development. These models can be harmful immediately when the person using the technology sees the stereotype played out, but they can also reinforce the validity of these ideas in our society, making it easier for other people to perpetuate these ideas in the future.
Many of the harms we’ve discussed in this article point to deeper societal biases and histories that are present in our data. Just as fish don’t know they’re wet because they’ve only ever been in the water, it can be hard for us to see biases that may not impact us every day. This points to one of the most important things we can do to minimize the harms caused by the tools we create: involve a diverse group of people in the process of creating, understanding, and evaluating our models, our data, and their use cases. Everyone’s lived experience is different based on the many identities they hold, a phenomenon that legal scholar Kimberlé Crenshaw termed intersectionality. When we truly listen to the feedback from many different groups, we can more easily identify harms that might be caused to groups we are not a part of.
Alongside the involvement of a diverse team, there are tools that data scientists can use from the beginning to help alleviate some of the harms. The first step we should take is to really understand our data. What features are we using in our models? Are any of them proxies for sensitive attributes? When we are able to sit down and think about our features, we can better understand their potential impacts before we begin training, helping us to prevent some of the harms from occurring.
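One rough way to start on the proxy question is to ask how well a single feature lets you guess a sensitive attribute, compared with always guessing the most common group. The sketch below is only an illustration under stated assumptions: the function name `proxy_strength`, the column names (`zip_code`, `income_band`, `group`), and the toy rows are all hypothetical, and the check only handles categorical features one at a time.

```python
from collections import Counter, defaultdict


def proxy_strength(rows, feature, sensitive):
    """Return (baseline, accuracy) for guessing `sensitive` from `feature` alone.

    baseline: share of rows in the most common sensitive group.
    accuracy: share of rows guessed correctly when, for each feature value,
              we guess the sensitive group most common among rows with that value.
    """
    overall = Counter(row[sensitive] for row in rows)
    baseline = max(overall.values()) / len(rows)

    # Count sensitive groups separately for each value of the feature.
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[feature]][row[sensitive]] += 1

    correct = sum(max(counts.values()) for counts in by_value.values())
    return baseline, correct / len(rows)


# Hypothetical toy data, invented purely for illustration.
rows = [
    {"zip_code": "A", "income_band": "high", "group": "x"},
    {"zip_code": "A", "income_band": "low", "group": "x"},
    {"zip_code": "A", "income_band": "high", "group": "x"},
    {"zip_code": "B", "income_band": "low", "group": "y"},
    {"zip_code": "B", "income_band": "high", "group": "y"},
    {"zip_code": "B", "income_band": "low", "group": "x"},
]

for feature in ("zip_code", "income_band"):
    baseline, accuracy = proxy_strength(rows, feature, "group")
    print(f"{feature}: baseline={baseline:.2f}, guess accuracy={accuracy:.2f}")
```

A feature whose guess accuracy sits well above the baseline (here, the made-up `zip_code`) deserves a closer look as a possible proxy. A feature that stays at the baseline is not automatically safe, since combinations of features can still reveal a sensitive attribute.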
The next thing that we can do is to make sure that our data set is balanced across categories and data types. This is particularly important in preventing harms of representation because we can ensure that every group is represented before training a model. Another crucial step is to make sure you test your model on a variety of subsets of your data. Finding tangible ways to compare accuracy between groups gives us a better understanding of how our models are performing, and where they need to do better.
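Both checks described above, group balance and accuracy per group, can be sketched in a few lines. This is a minimal sketch rather than a full fairness audit: the helper names `group_shares` and `accuracy_by_group`, the group labels, and the toy labels and predictions are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict


def group_shares(groups):
    """Return the share of the data set that belongs to each group."""
    counts = Counter(groups)
    total = len(groups)
    return {group: count / total for group, count in counts.items()}


def accuracy_by_group(y_true, y_pred, groups):
    """Return the accuracy computed separately for each group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, prediction, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == prediction)
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical toy data: group "b" is both rarer and served worse.
groups = ["a", "a", "a", "a", "a", "a", "b", "b"]
y_true = [1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("shares:", group_shares(groups))
print("overall accuracy:", overall)
print("accuracy by group:", accuracy_by_group(y_true, y_pred, groups))
```

In this invented example the overall accuracy hides a large gap between the two groups, which is the kind of difference that only shows up when results are broken out by subset.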
These are by no means the only ways to help address the issue of harms created by AI. Other approaches, such as picking the right metrics and unpacking fairness, could take their own blog posts to explain. It can seem overwhelming to address all of the harms with all of the methodologies right away, and that’s okay. As with all modeling projects, the key to success is an iterative process that allows for open conversation and evaluation of previous attempts. When we all work together to understand our models and check their outputs, we can work towards building a more just world.