To set the right tone for 2024, we decided to start with hallucinations 😉

With the rise of interest in foundation models (FMs) that are extraordinarily impressive at producing data across various modalities – text, images, video, and audio – another phenomenon has appeared: Hallucinations. Though the term anthropomorphizes the algorithms, it has become widely accepted in both the business and academic sectors.

In this article, we want to understand what causes them and how to deal with them, as well as provide you with some ideas on how hallucinations can be beneficial. Let's immerse ourselves in our first deep dive of this interesting year!

  • What exactly are hallucinations in the context of foundation models?

  • Why do hallucinations occur?

  • Strategies and methods for identifying when a model is hallucinating.

  • Why hallucinations are generally problematic, but can be beneficial.

  • Ways to reduce or possibly eliminate hallucinations.

  • Bonus Resources: A curated list of datasets, libraries, and tools for dealing with hallucinations.

As most research and literature on hallucinations currently focus on text-based models – Large Language Models aka LLMs – we also center our article around this subset of foundation models.

What are AI hallucinations in foundation models?

In LLMs, "hallucination" refers to instances where the model generates content that isn't grounded in real or accurate information. This happens when a model produces text with details, facts, or claims that are fictional, misleading, or completely made up, instead of giving reliable and truthful information.

Hallucination cases in LLMs can be broken down into the following categories:

  • Input-Conflicting Hallucination: This occurs when what the model produces doesn't match what was put in. Imagine you give the model some facts, but it changes them in its output. That's a classic case of this type of hallucination.

  • Context-Conflicting Hallucination: Here, the model gets its wires crossed during longer conversations or multiple exchanges. It might lose track of what's being discussed or contradict something it said earlier.

  • Fact-Conflicting Hallucination: This is when the model says something that clashes with well-known facts or general knowledge.

The latest survey on hallucinations in LLMs, published in November 2023, suggests a simplified classification: it unites input-conflicting and context-conflicting hallucinations under the name faithfulness hallucination and renames fact-conflicting hallucination to factuality hallucination. We adopt this simplified terminology for clarity.

Most of the time, when people talk about "hallucinations" in LLMs, they're thinking of the factuality hallucination. The focus in recent research has been on this type because it brings up trickier issues in these models, like not having a reliable source to check facts against. Plus, these kinds of hallucinations can undermine how well LLMs work in real life and how trustworthy they are.

When talking about vision-language models, you may also encounter the term “object hallucination.” It's mainly used to describe cases where a model creates text that makes sense on its own but doesn't match the actual objects in an image. This term is discussed in detail in the paper “Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding.”

Which broader topic do hallucinations fall into?

Hallucinations in LLMs fit within the broader scope of AI ethics and safety. Alongside hallucinations, there are various other challenges commonly faced in LLMs, which are outlined in the table below:

Image Source: Siren’s Song in the AI Ocean

Why do AI hallucinations occur?

When we think about open-ended generation tasks for a model, it inherently requires the model to create, or in a way, 'hallucinate' the output. However, these hallucinations often lack factual correctness due to how the model learns relationships from its data.

Internet data

To give a bit of context on the training of these models: they are fed massive amounts of unlabeled data, typically sourced from the Internet. Considering how much of that data is misinformation (fabricated, outdated, or biased content) or simply imaginative writing, it's clear why these models might struggle to differentiate fact from fiction. This brings us to the challenge of teaching models to understand the real world and separate fact from fiction while retaining the richness of imaginative language and imagery found in stories.

The goal of the model

The main goal of LLMs is to use probability distributions over possible next tokens to generate meaningful responses to prompts. It's important to understand that these models aren't built to verify their outputs against real-world facts. Instead, they aim to weave together the information they've been exposed to, creating a coherent narrative. In doing so, there are instances where they might extend beyond actual real-world knowledge to fulfill their purpose.

Yann LeCun, Chief AI Scientist at Meta, wrote: “Hallucinations in LLM are due to the Auto-Regressive prediction.”
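To make the idea of generating from probability distributions, one token at a time, more concrete, here is a minimal sketch. The tiny lookup table `TOY_LOGITS` and the function names are our own hypothetical stand-ins for a real model, which would condition on the whole preceding text rather than on the last token only. The point it illustrates is that nothing in the loop checks facts: "Lyon" is simply a less likely continuation than "Paris", not a forbidden one.

```python
import math
import random

# Hypothetical toy "model": maps the previous token to unnormalized scores
# (logits) for the next token. Invented for illustration only.
TOY_LOGITS = {
    "<s>": {"The": 2.0, "A": 1.0},
    "The": {"capital": 2.5, "city": 1.0},
    "A": {"capital": 1.0, "city": 2.0},
    "capital": {"is": 3.0},
    "city": {"is": 3.0},
    "is": {"Paris": 2.0, "Lyon": 1.2, "</s>": 0.1},
    "Paris": {"</s>": 3.0},
    "Lyon": {"</s>": 3.0},
}


def softmax(logits, temperature=1.0):
    """Turn logits into a probability distribution over next tokens."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - top) for tok, score in scaled.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def generate(rng, temperature=1.0, max_tokens=10):
    """Sample one token at a time; each choice depends on earlier choices."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        probs = softmax(TOY_LOGITS[tokens[-1]], temperature)
        choices, weights = zip(*probs.items())
        next_token = rng.choices(choices, weights=weights, k=1)[0]
        if next_token == "</s>":
            break
        tokens.append(next_token)
    return " ".join(tokens[1:])


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(generate(rng))
```

Lowering `temperature` concentrates probability on the highest-scoring token, which makes outputs more repeatable but does not make them more truthful.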

Problems with identifying relevant information and long context

Faithfulness hallucinations arise from the model's struggle to navigate and prioritize information within a lengthy input sequence. This challenge has been the focus of several studies. For instance, in the paper Large Language Models Can Be Easily Distracted by Irrelevant Context, researchers injected irrelevant details into the input. They discovered that all the prompting techniques they tested were vulnerable to this extraneous information. Specifically, when irrelevant details were added to the problem description, the LLM's ability to solve the original problems dropped dramatically, indicating that even a small amount of irrelevant information could distract the model and lead to inconsistent predictions.

Image Source: LLMs Can Be Easily Distracted by Irrelevant Context

Another study, Lost in the Middle: How Language Models Use Long Contexts, researched how LLMs manage to identify pertinent information amidst lengthy inputs. The findings revealed a significant drop in performance when the position of relevant details was altered. This suggests that current models don't consistently handle information in extended contexts. They also noticed that performance often peaks when the crucial information is placed at the beginning or end of the input. However, it deteriorates markedly when models need to access important details buried in the middle of long contexts, even when the models are specifically designed for long-context tasks.

Image Source: Lost in the Middle
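The kind of experiment described above can be sketched in a few lines: place one relevant fact at every possible position among distractor documents and record whether the model still answers correctly. This is only an illustration of the idea, not the paper's actual setup. The names `build_context`, `position_sweep`, and `accuracy_by_position` are our own, and `ask_model` is a hypothetical callable standing in for whatever LLM client you use.

```python
def build_context(distractors, key_fact, position):
    """Return a context string with key_fact inserted at index `position`."""
    if not 0 <= position <= len(distractors):
        raise ValueError("position out of range")
    docs = list(distractors)
    docs.insert(position, key_fact)
    return "\n".join(f"Document {i + 1}: {doc}" for i, doc in enumerate(docs))


def position_sweep(distractors, key_fact):
    """Yield (position, context) for every possible placement of key_fact."""
    for position in range(len(distractors) + 1):
        yield position, build_context(distractors, key_fact, position)


def accuracy_by_position(ask_model, distractors, key_fact, question, expected):
    """Map each position to True/False: did the answer contain `expected`?

    ask_model is a hypothetical callable: (context, question) -> answer text.
    """
    results = {}
    for position, context in position_sweep(distractors, key_fact):
        answer = ask_model(context, question)
        results[position] = expected.lower() in answer.lower()
    return results
```

Plotting the resulting dictionary against position is one way to see whether accuracy dips when the key fact sits in the middle of the context.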

Bad prompt engineering

A related case is task instruction hallucination, where a user requests one task but the model completes another. Often, this is attributed to 'bad prompt engineering', meaning the prompt is unclear or contains errors. Essentially, this issue arises when the model's instructions aren't crafted carefully.

Lack of objective alignment

LLMs are known for their versatility across different tasks, domains, and languages, which often implies that they possess a broad but surface-level understanding. However, when faced with specialized queries within a specific domain, these models might not have the depth of relevant information or expertise required. Consequently, in their attempt to respond to user queries, they may resort to hallucinating.

These are just a few general causes of hallucinations. The reasons behind hallucinations in LLMs are manifold, with potential sources scattered across different stages of the LLM lifecycle. To explore this topic in greater depth, we recommend the thorough classification offered in the recent publication, A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions. For a visual understanding of the complexity involved, a schematic from the survey is included below, illustrating the multifaceted aspects of this issue.

How to detect when a model is hallucinating

Detecting hallucinations in LLMs varies by type. For factuality hallucinations, one common method involves cross-referencing the LLM's outputs with a database of fact-checked information. This requires building and maintaining a reliable fact database to compare against the model's statements. Another approach is uncertainty estimation, which doesn't rely on external data. Instead, it quantifies how confident the model is in its predictions, offering a probabilistic measure or confidence score to assess the reliability of the LLM's outputs. Here are some examples:

  • The SelfCheckGPT method operates in a zero-resource, black-box manner, meaning it needs neither an external knowledge source nor access to the model's internals to make its assessments. SelfCheckGPT leverages the simple idea that if an LLM knows a given concept, sampled responses are likely to be similar and contain consistent facts. However, for hallucinated facts, stochastically sampled responses are likely to diverge and contradict one another.

  • The Self-Contradictory Hallucinations approach detects hallucinations by finding places where the LLM generates text that contradicts itself.
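The sampling-consistency idea can be illustrated with a deliberately crude sketch: ask the same question several times, then measure how much the answers agree. Here agreement is plain word overlap (Jaccard similarity), which is our own simplification and not the scoring used by SelfCheckGPT. The function names and the 0.5 threshold are assumptions chosen for illustration.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased set of alphanumeric words in a response."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two responses, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def consistency_score(samples):
    """Mean pairwise similarity across stochastically sampled responses."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    pairs = list(combinations(samples, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def looks_hallucinated(samples, threshold=0.5):
    """Flag a prompt whose sampled answers diverge (threshold is arbitrary)."""
    return consistency_score(samples) < threshold
```

Word overlap cannot tell a paraphrase from a contradiction, so a practical system would swap `jaccard` for a stronger comparison, such as an entailment model.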

For faithfulness hallucination, the methods include:

  • Fact-based Metrics assess the overlap of key facts between the generated content and the source.

  • Classifier-based Metrics use classifiers trained on a mix of task-specific hallucinated and faithful content, including related task data or synthetically generated data.

  • Question-Answering-based Metrics select target answers from the LLM output and generate questions to create source answers. Faithfulness is measured by comparing matching scores between source and target answers.

  • Uncertainty Estimation links hallucinations to high model uncertainty. It involves Bayesian deep learning methods to characterize prediction uncertainty.

  • Prompting-based Metrics provide models with guidelines and both model-generated and source content for assessment. The evaluation output can be binary or a k-point Likert scale of faithfulness.
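As a rough illustration of the uncertainty-estimation idea in the list above, the sketch below scores a generated answer by how improbable its tokens were according to the model. This is a simple likelihood-based proxy, not a Bayesian deep learning method. It assumes you can obtain per-token probabilities from your model, and the function names and the threshold of 1.0 are hypothetical.

```python
import math


def mean_negative_log_likelihood(token_probs):
    """Average of -log(p) over the probabilities of the generated tokens."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must be in (0, 1]")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def entropy(distribution):
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in distribution if p > 0.0)


def flag_uncertain(token_probs, threshold=1.0):
    """Flag an output whose average token surprise exceeds the threshold."""
    return mean_negative_log_likelihood(token_probs) > threshold
```

High uncertainty is a signal worth inspecting rather than proof of a hallucination: a model can be confidently wrong, and it can be unsure about a correct answer.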

We will list all the latest research surveys that link to concrete frameworks, metrics, and benchmarks in the bonus section.

The Dual Nature of Hallucinations: Why hallucinations are generally problematic, but can be beneficial

AI hallucination can lead to significant consequences in real-world applications. For instance, a healthcare AI model might mistakenly classify a benign skin lesion as malignant, prompting unnecessary medical procedures. Additionally, AI hallucinations can contribute to the spread of misinformation.

Sebastian Berns, a Ph.D. researcher at Queen Mary University of London, suggests that hallucinating models might fuel creativity by serving as a “co-creative partner.” This creative use of hallucination could lead to unexpected results or novel idea combinations that might not naturally occur to most people.

Echoing this sentiment, Bindu Reddy, CEO of Abacus AI, remarked in late December 2023 that this characteristic of LLMs might prove to be more beneficial than not. As she puts it, “Dreams, ours or the LLM’s, are generally a good thing!”

Mitigating Hallucination Effects: Ways to reduce or possibly eliminate hallucinations

Addressing hallucinations in LLMs is crucial, especially in fields where factual accuracy is paramount, such as journalism, healthcare, and legal sectors. This discussion offers a brief overview of general recommendations and methods to mitigate hallucinations, providing a foundation for readers to explore more specific techniques in the bonus section below.

  • Use High-Quality Training Data: The quality and relevance of training datasets are vital for generative AI models. To reduce hallucinations, it's important to train AI models on diverse, balanced, and well-structured data. Doing so helps minimize output bias and enhances the model's understanding and effectiveness.

  • Extend the Model's Knowledge Limits: LLMs have inherent knowledge boundaries based on their training data. Retrieval-augmented generation (RAG) is a widely used method for expanding these limits by supplementing model responses with additional information from external sources.

  • Domain-Specific Fine-Tuning: Fine-tuning teaches a model new knowledge while retaining its existing skills, which is particularly useful in natural language processing tasks. Re-training the model with domain-specific data makes it less prone to generating plausible but inaccurate responses. For a practitioner's perspective on how fine-tuning eliminates hallucinations in production, see our interview with Sharon Zhou from Lamini.

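  • Reducing Sycophancy: Sycophancy is the model's tendency to tell users what they want to hear rather than what is accurate. Implement methods that reduce this behavior during alignment stages, such as those performed using Reinforcement Learning from Human Feedback (RLHF).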

  • Improving Context and Logical Consistency: Decoding strategies that emphasize context consistency enhance the faithfulness of LLMs to both user instructions and provided context. Ensuring logical consistency is crucial for maintaining consistent responses and preventing hallucinations during multi-step reasoning.

  • Limit Possible Outcomes: Regularization techniques can be employed to limit the range of a model's predictions, helping to avoid overfitting and incorrect predictions.

  • Direct Guidance to AI Models: Provide clear feedback to AI models about preferred and undesirable outputs. This helps the model learn and align with specific requirements.

  • One-Shot and Few-Shot Prompting: These methods steer the model’s output by providing demonstrations in the prompt, which can also constrain the form and length of the response. One-shot prompting supplies a single example of the desired response, while few-shot prompting provides several examples for the model to follow.

  • Chain-of-Thought Prompting: Reasoning also helps LLMs avoid hallucinations. This technique extends few-shot prompting: each example in the prompt includes not only a problem and its solution but also a detailed description of the intermediate reasoning steps that lead to the solution.
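To illustrate the retrieval-augmented generation item from the list above, here is a minimal sketch that picks the passages most related to a question and places them in the prompt. Retrieval here is naive word overlap, an assumption made to keep the example self-contained; real systems typically use embeddings and a vector index. The function names and the instruction wording are hypothetical.

```python
import re


def tokenize(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Ground the question in retrieved passages and allow 'I don't know'."""
    passages = retrieve(query, documents, k)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the passages below. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )
```

The resulting string would then be sent to the model. Telling the model it may decline to answer is a common complement to retrieval, since retrieved passages do not always contain the answer.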

These strategies might significantly improve the reliability and accuracy of LLMs in various applications!
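As one illustration of the prompting strategies listed above, the sketch below assembles a few-shot prompt in which every demonstration spells out its reasoning before the answer. The `Q:`/`A:` layout and the function name `build_cot_prompt` are our own assumptions; any consistent format can serve the same purpose.

```python
def build_cot_prompt(examples, question):
    """Build a few-shot prompt whose examples show intermediate reasoning.

    examples: list of (question, reasoning, answer) tuples.
    """
    parts = []
    for example_question, reasoning, answer in examples:
        parts.append(
            f"Q: {example_question}\nA: {reasoning} The answer is {answer}."
        )
    # Leave the final answer open so the model continues in the same style.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    demo = [
        (
            "A shelf holds 3 boxes with 4 books each. How many books?",
            "Each box has 4 books and there are 3 boxes, so 3 * 4 = 12.",
            "12",
        )
    ]
    print(build_cot_prompt(demo, "A crate holds 5 bags with 6 apples each. How many apples?"))
```

Passing a single tuple gives a one-shot prompt, and passing several gives a few-shot prompt.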

Bonus Resources: A curated list of surveys with datasets, libraries, and tools for dealing with hallucinations

Conclusion

Hallucinations in foundation models (FMs) represent both a critical challenge and an opportunity within AI. While they can undermine trust and accuracy, particularly in text-based models, they also offer a unique avenue for creativity. Addressing them involves a combination of detection strategies, from fact-checking against databases to advanced uncertainty estimation, and mitigation techniques like high-quality training data and domain-specific fine-tuning. Understanding and managing these hallucinations is essential for the responsible and effective deployment of AI in real-world scenarios, balancing their innovative potential with the imperative for factual integrity.
