Introduction

From its inception, Artificial Intelligence (AI) has been plagued by bias. Early AI systems were trained on inherently limited datasets that often reflected existing societal inequalities. This resulted in algorithms that perpetuated stereotypes and discriminatory outcomes for certain groups. One famous example is a 1988 investigation that found a UK medical school's AI admissions software discriminated against women and applicants with non-European names.

Another famous example came much later. In 2016, Microsoft released Tay, a chatbot designed to learn from Twitter conversations. Within hours, Tay's language turned offensive and racist, reflecting the toxic content it had been exposed to online. The avalanche of racist statements and obscene narratives was a powerful example of how quickly AI can absorb harmful biases from its environment. For some time afterward, companies were wary of setting their chatbots loose. Everything changed on November 30, 2022, when the world was introduced to ChatGPT. Today we explore what we have learned about bias since Tay's failure. Our focus will be on:

  • What is bias, and why is it a big problem for foundation models/LLMs?

  • Does bias only come from data?

  • How to identify biases?

  • Countermeasures: debiasing techniques

  • Tools and libraries for bias detection and mitigation 

  • Actions for different stakeholders

  • Conclusion

  • Research papers

What is bias, and why is it a big problem for foundation models/LLMs?

Foundation models, including large language models (LLMs), represent a significant advancement in artificial intelligence (AI) systems capable of processing, generating, and understanding human-like language. These models have garnered immense popularity recently, driven by their ability to perform a wide array of tasks with remarkable accuracy. Beyond text classification, sentiment analysis, machine translation, and answer generation, foundation models extend their utility to fields such as image recognition, autonomous systems, and even creative arts, showcasing their versatility and broad impact.

At the heart of these models lies a question: where do the billions of parameters that guide their "understanding" and outputs derive their knowledge from? Foundation models are trained on extensive datasets encompassing text, images, and sometimes audio from the internet, books, articles, and other media. This vast, diverse corpus of human knowledge enables them to learn patterns, relationships, and contexts, forming the basis for their intelligence and capabilities.

However, the breadth and depth of knowledge these models can access also introduce challenges. The training data reflects the biases, inconsistencies, and varied quality of the source materials. These biases can become catastrophic when applied in generalized form to life-defining decisions about real people, each with their own unique story.

When such models are deployed in areas such as loan assessment, hiring, law enforcement, healthcare, customer service, and social media moderation, and as organizations move more and more towards AI-generated content and AI-driven employee pipelines, even a bias rate in the ballpark of ~0.1% can leave thousands or even millions of people with an undeserved loss of opportunities.
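
To make that scale concrete, here is a rough back-of-the-envelope sketch. The decision volumes are hypothetical assumptions chosen only for illustration, and `affected_people` is a made-up helper name; the only number taken from the text is the ~0.1% rate.

```python
# Rate of biased decisions, taken from the ~0.1% figure in the text.
BIASED_RATE = 0.001

def affected_people(decisions: int, biased_rate: float = BIASED_RATE) -> int:
    """Expected number of people receiving a biased decision."""
    return round(decisions * biased_rate)

# Hypothetical yearly decision volumes (assumptions, not measured data).
for volume in (1_000_000, 100_000_000, 1_000_000_000):
    people = affected_people(volume)
    print(f"{volume:>13,} decisions -> {people:>9,} people affected")
```

Even a rate that looks negligible as a percentage turns into a large absolute number of people once a system runs at internet scale.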

Does bias only come from data?

No. 

Biased data: As the example of Tay showed years ago, much of the bias originates in the massive amounts of text scraped from the internet, which often reflect real-world prejudices and stereotypes.

Unbiased data: Algorithms can also amplify existing biases through their design, even when the data is unbiased. For example, if a web search engine consistently ranks certain results higher, those results are interacted with more and are deemed more popular, regardless of their intrinsic value. This user behavior data influences the algorithm's future decisions, creating a feedback loop in which initially biased placements lead to increased visibility for the top results, perpetuating and magnifying the bias over time.

Image Credit: A Survey on Bias and Fairness in Machine Learning
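
The search-ranking feedback loop described above can be sketched as a small deterministic simulation. Everything here is an illustrative assumption: the three items, their identical quality, the attention shares per rank, and the `simulate` function itself are hypothetical, not a model of any real search engine.

```python
def simulate(rounds: int = 20, users_per_round: int = 100) -> dict:
    """Rank items by past clicks, then let rank position drive new clicks."""
    quality = {"A": 0.5, "B": 0.5, "C": 0.5}   # identical intrinsic value
    clicks = {"A": 1.0, "B": 0.0, "C": 0.0}    # A starts with a tiny head start
    attention = [0.6, 0.3, 0.1]                # share of users looking at rank 1, 2, 3

    for _ in range(rounds):
        # The "algorithm": order results by how often they were clicked before.
        ranking = sorted(clicks, key=clicks.get, reverse=True)
        for position, item in enumerate(ranking):
            # Expected clicks depend on rank position, not only on quality.
            clicks[item] += users_per_round * attention[position] * quality[item]
    return clicks

result = simulate()
total = sum(result.values())
for item, count in result.items():
    print(f"{item}: {count:.0f} clicks ({count / total:.0%} of all clicks)")
```

Although all three items have the same quality, the item that started on top keeps collecting most of the clicks, which in turn keeps it on top. No biased training data is needed for the gap to grow.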

How to identify biases?

Various techniques are used to identify biases in foundation models and LLMs during development and deployment. Several strategies warrant consideration and have been discussed in detail in recent EU proceedings on safer AI:

  • Analyzing and Auditing Data: Carefully review the data you use to train your model. Look for any signs of bias, such as the underrepresentation of certain groups or perspectives. Regularly audit your data to spot any patterns indicating bias.

  • Using Iterative and Diverse Training Data: Make sure the data for your model includes a wide range of views, cultures, and demographics. This approach helps lessen the biases that might come from datasets that don't fully represent the diversity of the real world.

  • Gathering Stakeholder Feedback: Reach out to people, especially those with expertise in the subject matter and individuals from groups that could be affected by the model's outcomes. Their insights can help you identify biases you might have missed.

  • Enhancing Transparency and Explainability: Strive to make your model's decision-making process open and understandable. Be clear about how your model arrives at its conclusions and be ready to explain any biases that come to light.
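
As one possible starting point for the data audit step above, the following sketch reports how well each group is represented in a dataset and how often it carries a positive label. The toy records, the `gender` and `hired` field names, and the `audit` helper are all hypothetical; a real audit would cover more attributes and their intersections.

```python
from collections import Counter, defaultdict

def audit(records, group_key, label_key):
    """Return {group: (share of dataset, positive-label rate)}."""
    counts = Counter(r[group_key] for r in records)
    positives = defaultdict(int)
    for r in records:
        positives[r[group_key]] += int(r[label_key] == 1)
    total = len(records)
    return {g: (n / total, positives[g] / n) for g, n in counts.items()}

# Hypothetical toy dataset of past hiring decisions.
records = (
    [{"gender": "male", "hired": 1}] * 60
    + [{"gender": "male", "hired": 0}] * 20
    + [{"gender": "female", "hired": 1}] * 5
    + [{"gender": "female", "hired": 0}] * 15
)

report = audit(records, "gender", "hired")
for group, (share, positive_rate) in report.items():
    print(f"{group:>6}: {share:.0%} of rows, {positive_rate:.0%} labeled positive")
```

Two warning signs show up in this toy data: one group makes up only a fifth of the rows, and its positive-label rate is much lower. Neither proves bias on its own, but both are patterns worth investigating before training.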

Countermeasures: Debiasing Techniques

It's important to remember that there is no silver bullet for eliminating bias. The best approach often involves a combination of technical, procedural, and governance strategies, along with constant awareness that bias can creep in. The appropriate debiasing technique depends on the specific use case and the type of bias encountered.

Let's explore a few:

  • Data Augmentation: One of the underlying causes of bias is unbalanced or unrepresentative datasets. Data augmentation increases the size and diversity of datasets by adding modified versions of existing data or synthetically generated data points. (Read Token 1.14: What is Synthetic Data and How to Work with it?)

    This helps ensure that the model learns from a wider range of examples, decreasing the chances of it overfitting to a particular group.

  • Counterfactual Fairness: This approach tackles bias by asking "what if?" It tests a model's decision by changing a sensitive attribute (e.g., gender, race) and observing if the output changes. If it does, it signifies bias exists within the decision-making process. Iteratively, counterfactual fairness can suggest adjustments to the model or dataset to achieve fairer outcomes.

  • Adversarial Debiasing: In this technique, two models are trained against each other. The first is the main predictive model, while the second tries to identify bias (like predicting race or gender) from the first model's output. This adversarial setup encourages the predictive model to learn representations that mask sensitive characteristics, leading to less discriminatory results.

  • Regularization: This method adds constraints to the optimization process. By penalizing complexity, regularization can help prevent models from overly relying on biased correlations within the data, leading to more generalizable and fair predictions.

  • Human Oversight: Keeping humans involved in critical decision processes, so that a person can review model outputs before they take effect.
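
The "what if?" test from the counterfactual fairness item can be sketched as a simple attribute flip. This is a minimal illustration under stated assumptions: `Applicant`, `biased_score`, `fair_score`, and `counterfactual_gap` are hypothetical names, and the two scoring functions are toy stand-ins for a trained model. The formal definition of counterfactual fairness in the literature relies on a causal model of how attributes influence each other; flipping a single field is only a simpler approximation of that idea.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Applicant:
    years_experience: int
    gender: str  # sensitive attribute

def biased_score(a: Applicant) -> float:
    """Toy stand-in for a trained model; deliberately leaks gender."""
    score = 0.1 * a.years_experience
    if a.gender == "female":
        score -= 0.2
    return score

def fair_score(a: Applicant) -> float:
    """Toy model that ignores the sensitive attribute."""
    return 0.1 * a.years_experience

def counterfactual_gap(model, applicant, attribute, alternative):
    """Change only the sensitive attribute and return the change in output."""
    counterfactual = replace(applicant, **{attribute: alternative})
    return model(counterfactual) - model(applicant)

applicant = Applicant(years_experience=5, gender="female")
for model in (biased_score, fair_score):
    gap = counterfactual_gap(model, applicant, "gender", "male")
    print(f"{model.__name__}: output changes by {gap:+.2f} when gender is flipped")
```

A non-zero gap signals that the output depends on the sensitive attribute. A zero gap is weaker evidence than it looks, because a model can still pick up the attribute through correlated features (proxies) that the flip leaves untouched.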

Tools and libraries for bias detection and mitigation 

  • AI Fairness 360 (AIF360) by IBM: An extensible toolkit that provides algorithms and metrics to help researchers and developers detect, understand, and mitigate unwanted algorithmic biases in machine learning models → their GitHub

  • Fairlearn: A library that seeks to enable developers to assess and improve the fairness of their machine learning models. It provides mechanisms for mitigating unfairness in binary classification and regression → their GitHub

  • What-If Tool: An interactive visual interface designed by Google for probing the behavior of machine learning models. It's useful for investigating model performance on a dataset and can be used for bias detection → their GitHub

  • FAT Forensics: A Python toolbox that offers functionalities to evaluate the fairness, accountability, and transparency (FAT) of AI systems, including tools for data and model inspection → their page

  • Themis-ml: A library for fairness-aware machine learning, providing implementations of algorithmic fairness metrics and mitigation methods. It focuses on discriminatory impact as measured by disparity in error rates → their GitHub

  • FairTest: A tool for discovering unwarranted associations between an algorithm's outputs and the inputs it was trained on. FairTest enables developers to test their models for bias and discrimination → their GitHub

  • TensorFlow Fairness Indicators: An extension of TensorFlow Model Analysis that provides metrics and plots to evaluate model fairness. It helps in evaluating and improving model performance for fairness criteria → their page

    Best of all, though, is Continuous Monitoring (read Token 1.18: How to Monitor LLMs?). Fairness in AI is not a one-time fix. Models need to be monitored for bias throughout their lifecycle, as data distributions or societal dynamics might change.
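
To give a feel for the kind of group metrics such toolkits report, here is a minimal pure-Python sketch. It does not call any of the libraries listed above; `group_rates` and `selection_rate_gap` are hypothetical helper names, and the labels, predictions, and groups are made-up toy data.

```python
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Per-group selection rate and false negative rate."""
    stats = defaultdict(lambda: {"n": 0, "selected": 0, "pos": 0, "missed": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1
        s["selected"] += p                      # predicted positive
        s["pos"] += t                           # actually positive
        s["missed"] += int(t == 1 and p == 0)   # positive but rejected
    return {
        g: {
            "selection_rate": s["selected"] / s["n"],
            "false_negative_rate": s["missed"] / s["pos"] if s["pos"] else 0.0,
        }
        for g, s in stats.items()
    }

def selection_rate_gap(rates):
    """Largest difference in selection rate between any two groups."""
    values = [r["selection_rate"] for r in rates.values()]
    return max(values) - min(values)

# Hypothetical toy labels and predictions for two groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a"] * 4 + ["b"] * 4

rates = group_rates(y_true, y_pred, groups)
for g, r in rates.items():
    print(g, r)
print("selection rate gap:", selection_rate_gap(rates))
```

Recomputing metrics like these on fresh production data at regular intervals is one simple way to put the continuous monitoring advice into practice: a gap that widens over time suggests that the data or the population has shifted.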

Actions for Different Stakeholders (quick reminder)

  • For Developers: regularly audit datasets, employ bias mitigation techniques during training and build explainability into your models.

  • For Organizations: establish clear AI governance policies, conduct regular bias audits of deployed models, and invest in responsible AI training for employees.

  • For Individuals: be critical consumers of AI-generated content, and of all content in general. Exercise your critical thinking every time you read anything. Choose trustworthy resources.

Conclusion – a continuous effort

Bias in foundation models stems from the data they learn from and the algorithms that guide them. It's a complex problem requiring constant attention. The fight against bias involves careful data analysis, seeking diverse input, promoting transparency, and employing debiasing techniques. Organizations and individuals alike must play a role in ensuring AI works fairly for everyone.

While a perfectly unbiased system might be impossible, the ongoing effort to mitigate bias is essential. By prioritizing ethical considerations and transparency, and by using tools to detect and address bias, we can strive for AI systems that benefit all of humanity fairly.

Combating bias in AI is an ongoing process. Stay informed, advocate for fairness, and support the development of responsible AI for a more just future.

Research papers:

Resources from Turing Post

Thank you for reading. Please feel free to share with your friends and colleagues. In the next couple of weeks, we are announcing our referral program 🤍
