📝 Guest Post: Stop Hallucinations From Hurting Your LLM-Powered Apps*
Large language model (LLM) hallucinations pose a major threat to the successful adoption of the new wave of LLM apps. In this post, the Galileo team dives into how to prevent hallucinations from creeping in, along with metrics developed by Galileo's researchers to quantify potential LLM hallucinations. They also introduce free access to the Galileo LLM Studio, powered by research-backed mechanisms to combat LLM hallucinations.

Leveraging large language models to build useful and interactive products has never been easier. This has sparked a new wave of 'LLM developers' who quickly combine LLMs (open source or closed source), frameworks such as LangChain, LlamaIndex, and LangFlow, and perhaps a vector database to spin up an LLM-powered product. A large number of startups, mid-market businesses, and large enterprises have been racing in this direction. Given this, it becomes increasingly important to be mindful of a huge bottleneck in LLM adoption: hallucinations, i.e., models being overconfidently incorrect in their predictions. We have already seen this crop up in Google's Bard launch, in Microsoft's launch of the Tay chatbot a few years ago, and in numerous similar examples recently. This poses not just a reputational risk to the business, but also a societal risk, with disinformation and biased viewpoints proliferating.

What are LLM hallucinations?

LLMs hallucinate when their predictions are based on insufficient or inaccurate training data. For instance, GPT-4 was trained on data only up to September 2021. When prompted with questions about events after that date, it has a high likelihood of producing a hallucinated response. A research paper from the Center for Artificial Intelligence Research (CAiRE) defines a hallucination from an LLM as occurring "when the generated content is nonsensical or unfaithful to the provided source content."

A popular example is GPT-4 being asked, "Who owns Silicon Valley Bank?" The response: "Silicon Valley Bank (SVB) is not owned by a single individual or entity. It is a subsidiary of SVB Financial Group, a publicly traded company (NASDAQ: SIVB). The ownership of SVB Financial Group is distributed among individual and institutional shareholders who own its stocks. As a publicly traded company, its ownership can change frequently as shares are bought and sold on the stock market." In this case, GPT-4 has no knowledge of the recent SVB collapse. To mitigate disinformation from such 'hallucinated' responses, OpenAI recently began prefixing them with "As of my knowledge cutoff in September 2021, ...".

Why LLM hallucinations occur

LLMs are, at the end of the day, large neural networks that predict the next token in a sequence; this could be the next character, sub-word, or word. In mathematical terms, given a sequence of tokens T_1, T_2, …, T_N, the LLM learns the probability distribution of the next token T_{N+1} conditioned on the previous tokens:

P(T_{N+1} | T_1, T_2, …, T_N)

There are two factors that can strongly influence LLM hallucination:
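To make this next-token view concrete, here is a minimal sketch of inspecting that conditional distribution directly. It assumes the Hugging Face transformers library and the public gpt2 checkpoint, neither of which the post prescribes; any causal LM would work the same way:

```python
# Minimal sketch: inspecting P(T_{N+1} | T_1, ..., T_N) for a causal LM.
# Assumes the Hugging Face `transformers` library and the public `gpt2`
# checkpoint; the post itself does not prescribe a model or library.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Silicon Valley Bank is owned by"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# The logits at the last position define the distribution over the next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Show the top-5 candidate continuations and their probabilities.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id)!r:>12}  p = {prob.item():.3f}")
```

The key point: the model always produces some distribution and samples from it, whether or not its training data actually covered the question, which is exactly where hallucinations come from.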
Quantifying LLM Hallucinations

The best ways to reduce LLM hallucinations are by
To take this a step further, the researchers at Galileo have developed promising metrics for quantifying hallucination.
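The post does not spell out those metrics, but one common research-backed proxy for hallucination risk is the model's own uncertainty over the tokens it generates. Below is a minimal sketch of such a heuristic, assuming the same transformers library and gpt2 checkpoint as above. This is an illustrative uncertainty score, not Galileo's actual metric:

```python
# Illustrative uncertainty heuristic, NOT Galileo's proprietary metric:
# score an answer by the mean log-probability the model assigns to each
# of its tokens. Lower scores indicate less-certain generations, which
# the literature links to higher hallucination risk.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def mean_token_logprob(prompt: str, answer: str) -> float:
    """Average log P(token | preceding tokens) over the answer tokens only.

    Note: tokenizing prompt and prompt+answer separately is an
    approximation; token boundaries can shift slightly at the join.
    """
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids

    with torch.no_grad():
        logits = model(full_ids).logits  # (1, seq_len, vocab_size)

    log_probs = torch.log_softmax(logits, dim=-1)
    n_prompt = prompt_ids.shape[1]
    total = 0.0
    # The token at position i is predicted by the logits at position i - 1.
    for i in range(n_prompt, full_ids.shape[1]):
        total += log_probs[0, i - 1, full_ids[0, i]].item()
    return total / (full_ids.shape[1] - n_prompt)

print(mean_token_logprob("Who owns Silicon Valley Bank? ", "It is publicly traded."))
```

A score like this can flag low-confidence answers for review or retrieval augmentation, though it only measures the model's confidence, not factual correctness.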
Introducing the Galileo LLM Studio

Building high-performing LLM-powered apps requires careful debugging of prompts and training data. The Galileo LLM Studio provides powerful tools to do just that, powered by research-backed mechanisms to combat LLM hallucinations, and it is 100% free for the community to use.
Conclusion

If you are interested in trying the Galileo LLM Studio, join the waitlist along with thousands of developers building exciting LLM-powered apps. The problem of model hallucinations poses a dire threat to adopting LLMs in applications at scale for everyday use. By focusing on ways to quantify the problem, as well as baking in safeguards, we can build safer, more useful products for the world and truly unleash the power of LLMs.

References & Acknowledgments

The calibration and building blocks of Galileo's LLM hallucination metric are the outcome of numerous techniques and experiments, with references to (but not limited to) the following papers and artifacts:
*This post was written by the Galileo team. We thank Galileo for their support of TheSequence.