📝 Guest Post: Stop Hallucinations From Hurting Your LLM-Powered Apps*
Large language model (LLM) hallucinations pose a serious threat to the successful adoption of the new wave of LLM apps. In this post, the Galileo team dives into how to prevent hallucinations from creeping in, along with metrics developed by Galileo's researchers to quantify potential LLM hallucinations. They also introduce free access to the Galileo LLM Studio, powered by research-backed mechanisms to combat LLM hallucinations.

Leveraging large language models to build useful, interactive products has never been easier. This has sparked a new wave of 'LLM developers' who combine LLMs (open source or closed source), frameworks such as LangChain, LlamaIndex, and LangFlow, and perhaps a vector database to quickly spin up an LLM-powered product. A large number of startups, mid-market businesses, and large enterprises have been racing in this direction.

Given this, it becomes increasingly important to be mindful of a huge bottleneck in LLM adoption: hallucinations, i.e., models being overconfidently incorrect in their predictions. We have already seen this crop up in Google's Bard launch, Microsoft's launch of the Tay chatbot a few years ago, and numerous similar examples since. This poses not just a reputational risk to the business, but also a societal risk, with disinformation and biased viewpoints proliferating.

What are LLM hallucinations?

LLMs hallucinate when their predictions are based on insufficient or inaccurate training data. For instance, GPT-4 was trained on data up to September 2021. When prompted with questions about events after that date, it has a high likelihood of producing a hallucinated response. A research paper from the Center for Artificial Intelligence Research (CAiRE) defines a hallucination from an LLM as "when the generated content is nonsensical or unfaithful to the provided source content".
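Since hallucinations often stem from knowledge the model simply does not have, a common mitigation (and one reason the vector databases mentioned above are popular) is to ground the model in retrieved context and instruct it to answer only from that context. The sketch below is a minimal, illustrative version: the function names and the keyword-overlap retrieval are our own toy assumptions, not any specific framework's API.

```python
# Toy sketch of retrieval-grounded prompting. All names here are
# illustrative; real systems use embedding similarity, not word overlap.

def score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that appear in the doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / max(len(q_words), 1)

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k most relevant documents for the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_grounded_prompt(query: str, docs: list[str]) -> str:
    """Instruct the model to answer only from the retrieved context."""
    context = "\n".join(retrieve(query, docs))
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "Silicon Valley Bank collapsed in March 2023 and was taken over by the FDIC.",
    "LangChain is a framework for building LLM applications.",
]
prompt = build_grounded_prompt("Who owns Silicon Valley Bank?", docs)
```

With fresh context supplied at query time, the model no longer has to rely on stale training data, which directly addresses cutoff-related hallucinations like the example that follows.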
A popular example is GPT-4 being asked "Who owns Silicon Valley Bank?". The response: "Silicon Valley Bank (SVB) is not owned by a single individual or entity. It is a subsidiary of SVB Financial Group, a publicly traded company (NASDAQ: SIVB). The ownership of SVB Financial Group is distributed among individual and institutional shareholders who own its stocks. As a publicly traded company, its ownership can change frequently as shares are bought and sold on the stock market." In this case, GPT-4 has no idea about the recent SVB collapse. To mitigate disinformation from this 'hallucinated' response, OpenAI recently added the 'As of my knowledge cutoff in September 2021,' prefix ahead of such responses.

Why LLM hallucinations occur

LLMs are, at the end of the day, large neural networks that predict the next token in a sequence: the next character, sub-word, or word. In mathematical terms, given a sequence of tokens T_1, T_2, …, T_N, the LLM learns the probability distribution of the next token T_{N+1} conditioned on the previous tokens: P(T_{N+1} | T_1, T_2, …, T_N). Two factors strongly influence LLM hallucination:
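The conditional distribution P(T_{N+1} | T_1, …, T_N) is typically produced by a softmax over the network's raw scores (logits). A minimal sketch with a toy vocabulary, to make the idea concrete:

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores into the next-token distribution
    P(T_{N+1} | T_1, ..., T_N)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary and logits a model might assign after "The sky is"
vocab = ["blue", "green", "loud"]
logits = [4.0, 1.5, 0.2]
probs = softmax(logits)

# The model samples (or greedily picks) the next token from this
# distribution; a confidently wrong distribution is exactly what an
# overconfident hallucination looks like at the token level.
next_token = vocab[probs.index(max(probs))]
```

Note that the model always produces *some* distribution, even over topics it knows nothing about, which is why it can be fluently and confidently wrong.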
Quantifying LLM Hallucinations

The best ways to reduce LLM hallucinations include:
To take this a step further, the researchers at Galileo have developed promising metrics for quantifying hallucination.
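Galileo's metrics themselves are not detailed in this post, but one family of uncertainty-based proxies they build on can be sketched simply: if the model's per-token distributions are high-entropy (spread out), the response is a stronger candidate for hallucination review. The function names and thresholds below are our own illustrative assumptions.

```python
import math

def token_entropy(probs: list[float]) -> float:
    """Shannon entropy (bits) of one next-token distribution.
    Higher entropy means the model was less certain at this step."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def uncertainty_score(step_distributions: list[list[float]]) -> float:
    """Average per-token entropy across a generated response.
    A high score flags the response for hallucination review."""
    return sum(token_entropy(p) for p in step_distributions) / len(step_distributions)

# Two toy responses: one generated confidently, one generated uncertainly
confident = [[0.97, 0.02, 0.01], [0.95, 0.03, 0.02]]
uncertain = [[0.40, 0.35, 0.25], [0.34, 0.33, 0.33]]

confident_score = uncertainty_score(confident)
uncertain_score = uncertainty_score(uncertain)
```

In practice such scores are computed from the model's log-probabilities and combined with other signals; raw entropy alone will not catch confidently wrong answers like the SVB example above.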
Introducing the Galileo LLM Studio

Building high-performing LLM-powered apps requires careful debugging of prompts and training data. The Galileo LLM Studio provides powerful tools to do just that, powered by research-backed mechanisms to combat LLM hallucinations, and it's 100% free for the community to use.
Conclusion

If you are interested in trying the Galileo LLM Studio, join the waitlist along with thousands of developers building exciting LLM-powered apps. The problem of model hallucinations poses a dire threat to adopting LLMs in applications at scale for everyday use. By focusing on ways to quantify the problem and baking in safeguards, we can build safer, more useful products for the world and truly unleash the power of LLMs.

References & Acknowledgments

The calibration and building blocks of Galileo's LLM hallucination metric are the outcome of numerous techniques and experiments, with references to (but not limited to) the following papers and artifacts:
*This post was written by the Galileo team. We thank Galileo for their support of TheSequence.