Introduction to Responsible AI
April 28, 2021
On Wednesday, April 21, the EU released a proposal to regulate artificial intelligence (AI). The document aims to limit what it classifies as "high-risk AI": systems that have tangible and potentially life-changing impacts on people's lives. Governmental bodies, by their nature, move slowly and can fail to grasp every edge of fast-moving fields such as artificial intelligence. So how can lawmakers the world over get their arms around something as hyper-evolutionary as AI?
From our perspective, there are a few options that can be employed to keep the machines and their creators free to innovate while keeping bad actors, or biased algorithms, from harming the world.
The first, and perhaps most obvious, way to regulate AI is to regulate the types of outcomes that are usable in our world. This includes setting accuracy thresholds for use in the public domain, as well as setting standards for evaluating how a model's performance changes over time and making sure that people can intervene as necessary. While the regulation does touch briefly on this idea, it isn't a huge part of the legislation, which makes sense. It should go without saying: models that perform badly should not be used to make decisions, period. This includes models that perform well on certain populations but not others.
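The idea that a model should clear a bar not just overall but for every population can be expressed as a simple pre-deployment gate. A minimal sketch, assuming accuracy as the metric; the threshold and group labels here are hypothetical:

```python
# Hypothetical deployment gate: require that accuracy clears a threshold
# within every subgroup, not just in aggregate.

def group_accuracy(y_true, y_pred, groups):
    """Return accuracy computed separately for each group label."""
    scores = {}
    for g in set(groups):
        pairs = [(t, p) for t, p, gg in zip(y_true, y_pred, groups) if gg == g]
        scores[g] = sum(1 for t, p in pairs if t == p) / len(pairs)
    return scores

def passes_deployment_gate(y_true, y_pred, groups, threshold=0.9):
    """True only if every subgroup meets the accuracy threshold."""
    scores = group_accuracy(y_true, y_pred, groups)
    return all(acc >= threshold for acc in scores.values())

# Toy data: the model does well on group "a" but lags on group "b".
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]
groups = ["a", "a", "a", "b", "b", "b"]
print(group_accuracy(y_true, y_pred, groups))
```

A model like the toy one above would fail the gate even though its overall accuracy looks acceptable, which is exactly the failure mode the regulation is concerned with.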
How you evaluate the success of a model depends on the project you're working on. In some cases, you might want to prioritize a model's precision: ensuring that whatever predictions it makes are likely to be correct. In other cases, it makes sense to prioritize identifying as many instances of the class you're looking for as possible, even when that means a lower percentage of correct positive predictions (the model guesses the class often so it doesn't miss any); this measure is called recall. When both are important, a metric called F1, the harmonic mean of precision and recall, can be used. Understanding the different ways to evaluate success is one of the first things to think about when building a model for production, so ensuring that these metrics meet some threshold (or a theoretical threshold) seems like a baseline expectation for deploying a model.
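These three metrics are straightforward to compute from a model's predictions. A minimal sketch for the binary case, with illustrative data:

```python
# Precision, recall, and F1 for a binary classifier, computed from scratch.
# Label 1 is the class of interest; label 0 is everything else.

def precision_recall_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# An "eager" model that guesses the class often trades precision for recall:
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
eager  = [1, 1, 1, 1, 1, 0, 0, 0]  # catches every positive, with false alarms
p, r, f1 = precision_recall_f1(y_true, eager)
print(p, r, f1)  # precision 0.6, recall 1.0, F1 0.75
```

Which of the three to set a threshold on is a project-level decision: a recidivism model and a spam filter should not be held to the same trade-off.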
Understanding the output of a model is important, but that alone isn’t enough to ensure it is not causing harm – understanding and regulating how these outputs are used is just as important.
The second way to regulate AI is to regulate where and how it can be used in society. This is where the bulk of the legislation focuses. Title II of the proposed EU regulation uses a risk-based approach to prohibit AI that creates an "unacceptable risk." AI that violates fundamental rights, manipulates a person's behavior, or exploits vulnerable groups (such as children or people with disabilities) is specifically mentioned. Interestingly, the regulation also prohibits any type of social scoring system, as well as the use of "'real time' biometric identification systems …for the purpose of law enforcement," except under limited exceptions. Other types of AI, while allowed, are expected to follow protocols around documentation and use as well.
While the regulation takes many important steps, some argue that it doesn’t go far enough. The biggest argument here is that it provides a sweeping exception for the military, which many people point out is one of the most high-risk applications that exists. Still, regulating the use of AI helps to prevent many of the harms to individuals and populations that are most worrisome to most citizens.
Currently, AI systems used to hire someone or predict criminal recidivism do not have outputs that meet the thresholds for deployment, nor do their users seem to benefit from them. These systems are heavily subject to human bias and intervention, but researchers are addressing these issues every day. It is possible that similar models will address the current concerns and be ready to deploy to society sooner rather than later. How can regulation evolve as the technology progresses?
Many of the processes outlined in the proposed EU regulation define the ways that those who build AI need to document their systems, add transparency to their outputs and processes, and monitor and assess their systems. This ranges from registering AI systems in an EU database to continually assessing and reassessing models for accuracy and drift. Regulating the creation of models is the bulk of the EU legislation and, at first, can seem worrisome to those of us who create AI systems: these regulations feel like they stifle the creativity and innovation that are crucial to technological advancement.
Luckily, the regulation takes this directly into account and in fact defines “measures in support of innovation.” This includes things like building sandboxes or controlled development areas that fit regulatory standards, in which creators can test their AI systems. There are also a number of measures that help to minimize the legal and financial burden that these regulations pose, particularly on startups and small and medium businesses. Balancing protecting society with allowing for the rapid changes that come with technological advancement is tricky, and there will almost certainly be a learning curve to this process. CallMiner’s Responsible AI framework attempts to strike a similar balance.
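The assess-and-reassess obligation described above can be sketched as a simple monitoring check. A hypothetical drift gate that flags a model for human review when its accuracy on recent labeled data slips below its baseline; the margin and data here are illustrative, not taken from the regulation:

```python
# Hypothetical drift monitor: flag a deployed model for review when its
# accuracy on recent labeled data falls a set margin below its baseline.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def needs_review(baseline_acc, recent_true, recent_pred, margin=0.05):
    """True if recent accuracy dropped more than `margin` below baseline."""
    return accuracy(recent_true, recent_pred) < baseline_acc - margin

baseline = 0.92  # accuracy measured at deployment time
recent_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
recent_pred = [1, 0, 0, 1, 1, 0, 0, 0, 1, 1]
print(needs_review(baseline, recent_true, recent_pred))  # True: accuracy slipped to 0.7
```

A production system would compare windows of scored traffic rather than tiny lists, but the principle is the same: monitoring is a standing process, not a one-time validation.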
To this point, there have been ways to regulate the machinery and its uses, but there has not been a way to regulate that which drives the decisions – the data itself. Data is the key to building models, but we know that raw data contains many biases in itself. Many of these biases reflect the world we live in and may not be easily fixed. How do we gain oversight over the data?
Perhaps the most difficult, though potentially most important, thing to regulate is the data used to train and evaluate AI systems. A key axiom in building AI systems is "garbage in, garbage out," shorthand for the idea that a model is only as good as the data on which it is trained. Given that a model is a reflection of its data, it makes sense that this is a key part of the new regulation, and it shows up in a number of ways. The first builds on GDPR, the sweeping data privacy regulation the EU introduced several years ago, and includes directives like informed consent for data collection and making sure that people's personal information can't be extracted from the model or the dataset in a way that would compromise the individual. Other parts of the regulation specify requirements for dataset documentation, human oversight, and transparency. This is very reminiscent of a 2018 paper by Dr. Gebru et al. titled "Datasheets for Datasets," which argues that not understanding exactly how, by whom, and for what purpose a dataset was created limits our ability to understand, evaluate, and develop our AI systems.
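In the spirit of "Datasheets for Datasets," such documentation can be as lightweight as a structured record attached to every dataset. A minimal, hypothetical sketch; the field names and example values are illustrative, not the paper's or the regulation's required schema:

```python
# A hypothetical "datasheet" record: capture who built a dataset, how, and
# why, so downstream users can judge its fitness for purpose.

from dataclasses import dataclass, asdict

@dataclass
class Datasheet:
    name: str
    created_by: str
    purpose: str            # why the dataset was created
    collection_method: str  # how instances were gathered
    consent_obtained: bool  # were subjects informed and consenting?
    contains_pii: bool      # does it hold personal information?
    known_limitations: str  # known gaps, skews, or biases

sheet = Datasheet(
    name="support-calls-2020",
    created_by="Example Corp data team",
    purpose="Train a call-topic classifier",
    collection_method="Recorded customer calls, transcribed automatically",
    consent_obtained=True,
    contains_pii=True,
    known_limitations="English-only; skews toward one product line",
)
print(asdict(sheet))  # serializable, so it can live alongside the data
```

Making fields like consent and PII explicit forces the conversation about whether a dataset should be used at all, before any model is trained on it.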
As with any large government regulation, it will likely be years before we begin to understand what this really means for people in industry. First, the regulation will need to be passed by the EU, and the AI regulation board will need to be set up. Then, the procedures will need to be implemented, and finally the regulation will be tested in court to set legal precedent for enforcement. The key ideas in this regulation, however, are things that we should all be taking into account already.
Generally, we see this proposed regulation as a wake-up call for the United States, which has taken a narrower view of data protection regulation. Unlike the EU's GDPR, which endeavors to create a single standard for data protection, and this proposed regulation on AI, the U.S. has traditionally opted for a sector-specific approach. With industry-by-industry U.S. regulations such as HIPAA, CFPB rules, FCRA, and GLB, it isn't clear how the U.S. will approach regulating AI. It seems as if the EU is proposing a set of rules that big tech will need to follow in Europe, even though the U.S. provides a significant amount of the technical innovation. Much like GDPR, the new AI regulation will apply to any system deployed in Europe or to the European market, meaning that companies worldwide will need to begin adhering to these standards regardless of the standards in their home country.
The biggest takeaway from this regulation is that it is everyone’s responsibility to ensure that they understand the models they are producing and how they might impact individuals and society. This regulation seems to negate the idea that ignorance can absolve legal responsibility if harms are caused.
Our approach to Responsible AI centers on documenting all of the decisions that are being made through the development cycle and trying to minimize harms before they can happen. Our goal, much like the goal of the EU regulation, is to create as much transparency as possible in the development, deployment, and use of the AI systems to protect everyone involved.