
Token 1.20: Explainable AI techniques and tools for LLMs

Techniques and tools that will help you decipher these models and their predictions

Introduction

Machine learning (ML) explainability is a hot topic these days, particularly in the era of foundation models (FMs)/large language models (LLMs). The increasing emphasis on ML explainability reflects a broader recognition of the need to align AI technologies with human values, ethical principles, and societal norms. As AI systems become more embedded in everyday life, ensuring they operate transparently and justifiably becomes more of a societal imperative. This trend is likely to continue, with explainability playing a key role in the evolution of AI technologies and their integration into diverse facets of human activity.

In today’s Token, we will cover:

  • Why should I bother with Explainability? Trust. Debugging. Ethical Considerations and Bias. Interdisciplinary Collaboration. Regulations and Compliance.

  • How can I explain my models? Surrogate Models. Attention Visualization. Natural Language Explanation. Prompt-Based Techniques.

  • How can I implement these techniques? BertViz. Captum. LIME. SHAP.

  • Conclusion.

In this article, you will learn why you should focus on model explainability, followed by useful techniques and tools that will help you decipher these models and their predictions.

So, why should I bother with ML explainability?

You could be investing this time in improving your model’s performance metrics, optimizing it to serve predictions faster, or just talking about your favorite Marvel character with a colleague. But why on Earth should you be reading about explainability when your ML model is already, let’s say, 98% accurate? Actually, there are a few strong reasons to focus on explainability. Let us go through them.

Trust

Machine learning models, especially LLMs, are extremely convoluted and hence opaque to users. Unlike simpler models, where decisions can be traced back to specific features or rules, LLMs work through intricate networks of neurons and weights, making their decision-making process a black box*. But AI and ML models are becoming ubiquitous in our lives, and for them to be fully embraced by businesses and the public, people need to trust them. Explainability bridges the gap between human understanding and machine logic, making these systems more approachable and trustworthy. When users understand how a model arrives at its conclusions, they’re more likely to trust its decisions and, by extension, the organizations that deploy it. Explainable AI helps users understand how a model works and the logic behind its decisions and conclusions, which encourages adoption.

*A black box in AI refers to a system whose inner workings are opaque, making its decision-making process unclear and unexplained.

Debugging

LLMs are complex and can sometimes produce unexpected or erroneous outputs. Debugging them is challenging, especially when you have no idea how they arrived at a particular conclusion or output. Once explainable AI tools show you why the model reached a particular conclusion, it becomes easier to tweak the dataset or the training procedure, or to fine-tune the model so that its outputs are closer to the desired outcome. Furthermore, explainable AI tools can help you trace the sources of bias in the model. Once identified, you can take steps to eliminate those biases.

Ethical Considerations and Bias

As ML models, especially LLMs, are deployed in more critical applications – from healthcare to legal and financial services – the potential for biased outcomes or ethical mishaps increases. There's a growing recognition that these models can inadvertently perpetuate or even exacerbate biases present in their training data. Explainability allows stakeholders to identify and mitigate these biases, ensuring that AI systems act in an ethical and fair manner.

Interdisciplinary Collaboration

Explainability isn't just about mitigating risks; it's also about unlocking new potential. The push for explainability fosters collaboration across fields – combining the expertise of data scientists, ethicists, legal experts, and domain specialists. This interdisciplinary approach enriches the development and deployment of AI systems, ensuring they meet a wider range of needs and considerations and encouraging innovative approaches.

Regulations and Compliance

Lastly, the power of LLMs has made them central to automation systems in critical industries such as healthcare and medicine, journalism, banking, and insurance. Regulations in certain parts of the world (such as GDPR in Europe) already require, or may soon require, explainability to establish transparency, fairness, and accountability.

Thus, the explainability of ML models, especially LLMs, is crucial for the wide adoption and smooth functioning of these models in production.

How can I explain my models?

While explainable AI is a fairly old topic, it is still an area of active research, and with the rise of LLMs and the ethical concerns associated with them, its popularity has skyrocketed. Conventional models like linear regression and tree-based models are far easier to explain than LLMs. For instance, in a linear model, the coefficient of each feature indicates its significance and helps explain the model output.

[Figure: explaining a linear model via its feature coefficients. Image credit: Zelros AI]
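To make this concrete, here is a minimal sketch of coefficient-based explanation, using scikit-learn on synthetic data; the feature names are purely illustrative, not from any real dataset:

```python
# A minimal sketch: explaining a linear model through its coefficients.
# The data is synthetic and the feature names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Ground truth: the target depends strongly on feature 0, weakly on feature 2.
y = 3.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
for name, coef in zip(["age", "income", "tenure"], model.coef_):
    print(f"{name}: {coef:+.2f}")
```

Each coefficient tells you how much the prediction changes per unit change in that feature, holding the others fixed, which is a direct, global explanation of the model.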

For tree-based models like decision trees, random forests, XGBoost, etc., you can compute feature importance, which tells you how much the trained model relies on each feature. A common way to compute it is to measure how much splits on that feature reduce the error/loss across the trees.
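Here is the same kind of sketch for a tree ensemble, again on illustrative synthetic data, using scikit-learn's built-in impurity-based importances:

```python
# A minimal sketch: impurity-based feature importance from a random forest.
# Same illustrative synthetic setup as the linear example above.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
for name, importance in zip(["age", "income", "tenure"], forest.feature_importances_):
    print(f"{name}: {importance:.2f}")
```

The importances sum to 1; each one aggregates how much splits on that feature reduced the loss across all the trees. (Permutation importance, which measures how much the error rises after shuffling a feature, is a common alternative.)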

Meanwhile, for LLMs, explaining the model is not so straightforward. Below, we will discuss a few explainability techniques that can be employed for LLMs. Worth knowing!
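Before the deep dive, here is a taste of two of the techniques and tools listed above. First, attention visualization with BertViz; this is a minimal sketch that assumes a Jupyter notebook with the bertviz and transformers packages installed, and the input sentence is just an example:

```python
# A minimal sketch: visualizing attention weights with BertViz.
# Run in a Jupyter notebook; requires `pip install bertviz transformers`.
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer.encode("The cat sat on the mat", return_tensors="pt")
outputs = model(inputs)

tokens = tokenizer.convert_ids_to_tokens(inputs[0])
# Renders an interactive view of every layer's and head's attention weights.
head_view(outputs.attentions, tokens)
```

Second, LIME fits a simple local surrogate model around one prediction; the tiny sentiment classifier below is purely illustrative, only there to make the sketch runnable:

```python
# A minimal sketch: LIME as a local surrogate for a text classifier.
# Requires `pip install lime scikit-learn`.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great movie", "loved it", "terrible film", "awful plot"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative
classifier = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance(
    "a great film with an awful ending",
    classifier.predict_proba,  # LIME perturbs the text and queries this
    num_features=4,
)
print(explanation.as_list())  # per-word weights in the local surrogate
```

The weights show which words pushed this particular prediction toward positive or negative: a local, model-agnostic explanation.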

The rest of this article, loaded with useful details, is available to our Premium users only.

Thank you for reading! Please feel free to share this with your friends and colleagues. In the next couple of weeks, we will be announcing our referral program 🤍


