Using unethical data to build a more ethical world

How CallMiner handles imperfections in speech recognition

Abstract

Data scientists use data to train models, and those models estimate probabilities that capture patterns in the data. It is difficult to build ethical models when the available training data contains racism, sexism, or other stereotypes, and contact center data, including calls, chats, texts, and emails, is no exception. Instead of building a model that automates decision-making, we surface the unethical patterns the model learns as insights. We discuss debiasing options for removing racism from the model, but find that removing this bias also removes a crucial insight that analysts deserve to know about. By leaving the model with all of the biases it learned from the training data, we can provide better analytics: analysts can recommend solutions that begin to dismantle the systemic racism present in our society. Debiasing is not always appropriate; censoring the model makes it harder to identify what can be done to prevent racism in our procedures and in society.
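To make the "debiasing options" mentioned above concrete, one widely cited approach in the literature is to neutralize word embeddings along a learned bias direction (in the spirit of Bolukbasi et al., 2016). The sketch below is illustrative only and is not CallMiner's pipeline; the toy embeddings, word pairs, and function names are hypothetical.

```python
# Minimal sketch of embedding debiasing by projection, assuming we already
# have word vectors for our vocabulary. Hypothetical data, for illustration.
import numpy as np

def bias_direction(defining_pairs, embeddings):
    """Estimate a bias direction as the mean difference vector
    between paired terms (e.g. ("she", "he"))."""
    diffs = [embeddings[a] - embeddings[b] for a, b in defining_pairs]
    direction = np.mean(diffs, axis=0)
    return direction / np.linalg.norm(direction)

def neutralize(vector, direction):
    """Remove the component of a word vector that lies along the bias direction."""
    return vector - np.dot(vector, direction) * direction

# Toy embeddings purely for demonstration.
rng = np.random.default_rng(0)
vocab = ["she", "he", "agent", "customer"]
embeddings = {w: rng.normal(size=50) for w in vocab}

d = bias_direction([("she", "he")], embeddings)
debiased_agent = neutralize(embeddings["agent"], d)
```

As the abstract argues, applying a projection like this hides the learned bias from downstream analytics; whether that trade-off is acceptable depends on whether the model is making automated decisions or informing human analysts.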

Read the full paper here.