Last year, “hallucinations” produced by generative artificial intelligence (GenAI) were in the spotlight in the courtroom and all over the news. Bloomberg News reported that “Goldman Sachs Group Inc., Citigroup Inc., JPMorgan Chase & Co. and other Wall Street firms are warning investors about new risks from the increasing use of artificial intelligence, including software hallucinations, employee-morale issues, use by cybercriminals and the impact of changing laws globally.”
GenAI hallucinations are indeed problematic. For example, researchers at Stanford University last year found that general-purpose GenAI tools like ChatGPT have an error rate as high as 82% when used for legal queries. GenAI tools purpose-built for legal applications fare better, producing hallucinations 17% of the time, according to a different Stanford study.
Regardless of the hallucination rate, the problem is exacerbated, in any industry, by the human consuming the GenAI output: they may not notice the hallucination, or may not bother to validate the output and instead act on it directly.
Why Do GenAI Models Hallucinate?
Factors that can lead to GenAI hallucinations include gaps and biases in the training data, the probabilistic nature of next-token prediction, and prompts that fall outside what the model was trained on.
Detecting hallucinations is difficult because LLMs are not interpretable and offer no visibility into how they arrive at their responses. Even when a retrieval-augmented generation (RAG) context is supposedly referenced in the response, you may find that was not actually the case. Without knowing the right answer, relying haphazardly on the bad or biased statistics embedded in an LLM to get a plausible-sounding answer is high risk.
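One simple, admittedly crude, starting point is to check how much of an answer is actually supported by the retrieved context. The sketch below is a minimal illustration that uses plain lexical overlap as the signal; the function names and threshold are hypothetical, and production systems would typically rely on entailment or embedding-based checks instead.

```python
# Naive groundedness check for a RAG response: flag answer sentences that
# share few tokens with the retrieved context. Purely illustrative; real
# systems tend to use entailment or embedding-based scoring instead.
import re

def token_set(text: str) -> set[str]:
    """Lowercased word tokens, ignoring very short words."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) > 3}

def unsupported_sentences(answer: str, context: str, min_overlap: float = 0.3) -> list[str]:
    """Return answer sentences with low lexical overlap with the context."""
    ctx_tokens = token_set(context)
    flagged = []
    for sent in re.split(r"(?<=[.!?])\s+", answer):
        toks = token_set(sent)
        if toks and len(toks & ctx_tokens) / len(toks) < min_overlap:
            flagged.append(sent)
    return flagged

# Any flagged sentence is a candidate hallucination that merits human review.
```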
How Can You Reduce GenAI Hallucinations?
Many organizations are already trying to customize pretrained LLMs to their purposes using fine-tuning techniques such as Low-Rank Adaptation (LoRA). To reduce hallucinations, the domain and task data used to build the language model must be chosen deliberately: models trained on data relevant to the use case produce fewer hallucinations.
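As a minimal sketch of what that looks like in practice, the example below applies LoRA to a pretrained model with the Hugging Face transformers and peft libraries; the small GPT-2 base model, target modules, and hyperparameters are illustrative assumptions, not a recommendation.

```python
# Minimal LoRA fine-tuning sketch using Hugging Face transformers + peft.
# "gpt2" and the hyperparameters below are illustrative stand-ins only.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA inserts small low-rank adapter matrices and trains only those,
# leaving the pretrained weights frozen.
lora_cfg = LoraConfig(
    r=8,                        # rank of the adapters
    lora_alpha=16,              # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of all weights

# The training set would hold domain- and task-specific examples (e.g.
# tokenized risk-management Q&A pairs); a standard transformers Trainer
# run then updates only the adapter weights.
```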
There is also a need for additional models to monitor and minimize the harm created by hallucinations. Enterprise policy should prioritize the process for how the output of these tools will be used in a business context, and apply a risk-based strategy to decide when and when not to use outputs, setting risk tolerance according to the use case. Specially designed GenAI trust scores reflect the probability that the prompts and answers align with sanctioned answers: a high trust score means a low risk of hallucination, and a low trust score means a high risk. With a trust score you can set your risk tolerance and control the amount of hallucination and harm to your business while still benefiting from the power of generative AI techniques.
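To make the idea concrete, here is a minimal sketch of trust-score gating; the GenAIResponse structure, thresholds, and routing labels are hypothetical, and the trust score itself is assumed to come from a separately built scoring model.

```python
# Hedged sketch of trust-score gating for GenAI output. The trust_score is
# assumed to come from a separate scoring model that compares prompts and
# answers against sanctioned answers; thresholds and labels are illustrative.
from dataclasses import dataclass

@dataclass
class GenAIResponse:
    prompt: str
    answer: str
    trust_score: float  # 0.0 = likely hallucination, 1.0 = aligns with sanctioned answers

def route_response(resp: GenAIResponse, risk_tolerance: float = 0.8) -> str:
    """Decide how a business process consumes the model output."""
    if resp.trust_score >= risk_tolerance:
        return "auto_use"      # low hallucination risk: act on the output
    if resp.trust_score >= risk_tolerance - 0.2:
        return "human_review"  # borderline: a person validates before use
    return "reject"            # high hallucination risk: do not act on it

# A higher-stakes use case simply sets a stricter risk tolerance.
resp = GenAIResponse("What is the firm's credit limit policy?", "...", trust_score=0.65)
print(route_response(resp, risk_tolerance=0.9))  # -> reject
```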
Using Focused Language Models to Fight Hallucinations
The best approach to using GenAI responsibly in financial services starts with the concept of focused language models (FLMs). FLMs are small language models (SLMs) built on an expertly designed training data set at both the domain and the task level, meaning data from the context in which the final model will be used, such as risk management decisions in financial services. This yields superior accuracy, greater trust in the output, and efficiency in production, since smaller models mean lower inference latency and cost.
The FLM is a new concept that puts data science back into GenAI in a way that meets responsible AI principles. A fine level of specificity ensures that high-quality, highly relevant data is chosen; the model can then be further trained ('task tuning') to ensure it is correctly focused on the specific business objective at hand and that its outputs are operationalized in a business process.
The FLM approach is distinctly different from commercially available LLMs and SLMs, which offer no control over the data used to build the model. For enterprises, control over the pretraining and task-training data is crucial for preventing hallucinations and harm; complete control of this training data is a necessary first step toward responsible AI use of these transformer models.
A focused language model enables GenAI to be used responsibly because the training data is fully controlled and relevant to the domain and task, hallucinations are far less likely, and the smaller model is cheaper and faster to run in production.
Curious to learn more? Join me at the 7th AI in Financial Services conference in London on the 9th of September, where I will be discussing this topic in my presentation.