📝 Guest Post: How to Enhance the Usefulness of Large Language Models*
In this guest post, Filip Haltmayer, a Software Engineer at Zilliz, explains how LangChain and Milvus can enhance the usefulness of Large Language Models (LLMs) by allowing for the storage and retrieval of relevant documents. By integrating Milvus, a vector database, with LangChain, LLM applications can work over far more text than fits in a single prompt and improve their conversational abilities.

## 1. What is LangChain

Large Language Models (LLMs) are the new tech wave. The giants, Google and Microsoft, are stepping into the Colosseum to fight the AI battle and take the search crown. This colossal fight is just the beginning of the LLM breakthrough, and LangChain is here to help push it even further. On their own, LLMs offer good conversational abilities but come with limited memory and a tendency to hallucinate in their responses. An example of this hallucination: if you ask ChatGPT, "What are the difficulty levels in Red Dead Redemption?", it will respond that there are no difficulty settings. This answer is incorrect, as the game has three set difficulty levels. Because the LLM generates answers purely from its weights, it cannot verify information or provide sources.

This is where LangChain comes in. LangChain is a framework that lets you chain together different computations and external knowledge with LLMs to push their usefulness even further. With LangChain, you can create domain-specific chatbots, action agents for specific computations, and more. (TheSequence explained LangChain in more detail here.)
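To make the idea of chaining concrete, here is a minimal, illustrative sketch of a single prompt-plus-LLM chain, using LangChain's API as of the time of writing. The model choice and prompt are assumptions for demonstration, not part of the Milvus integration described later:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# A reusable prompt template; {game} is filled in when the chain runs.
prompt = PromptTemplate(
    input_variables=["game"],
    template="What are the difficulty levels in {game}? Answer briefly.",
)

# Wire the prompt to an LLM (assumes OPENAI_API_KEY is set in the environment).
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)

print(chain.run(game="Red Dead Redemption"))
```

A chain like this is just one link; the value comes from composing such links with external knowledge sources, which is where the vector database below comes in.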
## 2. What role Milvus plays

So where does Milvus come into play? LLMs can only process a limited number of tokens (a token is roughly four characters of text) at a time, meaning they cannot analyze a large document, let alone a collection of documents, in one pass. To get around this, we can store all our documents in a database, search for only the documents relevant to the input question, and feed those documents into the LLM for answer generation. Milvus is perfect for this, as we can take advantage of semantic search to retrieve the most relevant documents quickly. We begin by taking all the documents we want to analyze and converting their text into embeddings. A significant benefit is that we can use the same LLM to generate the embeddings, ultimately keeping the LLM's "thought process" in our search. With all the data embedded, we can store each embedding and its original text, alongside any metadata, within Milvus. Then, when a query comes in, we embed the query text using the same model, search for the relevant texts, and feed them into our LLM to generate an answer. Although this pipeline sounds simple, building it can take a lot of work. Luckily, LangChain makes such a pipeline easy and offers the VectorStore wrapper for vector databases.

## 3. Integration

Here is the integration that I worked on between Milvus and LangChain.
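Before getting into the implementation details, here is a minimal sketch of what the finished pipeline looks like from the user's side. It is illustrative only: it assumes a Milvus instance running locally on the default port, an OPENAI_API_KEY in the environment, and a made-up two-document corpus; the imports reflect LangChain's layout at the time of writing:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import Milvus
from langchain.chains.question_answering import load_qa_chain

# Hypothetical document chunks to index; in practice these would come
# from your own corpus.
texts = [
    "Milvus is a vector database built for semantic similarity search.",
    "LangChain is a framework for composing computations around LLMs.",
]

# Embed the texts and store each embedding plus its original text in Milvus.
vector_store = Milvus.from_texts(
    texts=texts,
    embedding=OpenAIEmbeddings(),
    connection_args={"host": "localhost", "port": "19530"},
)

query = "What is Milvus used for?"

# Retrieve only the documents relevant to the question. The MMR variant
# discussed below is available as max_marginal_relevance_search(query, k=4).
docs = vector_store.similarity_search(query, k=4)

# Feed the retrieved documents to the LLM to generate a grounded answer.
chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
print(chain.run(input_documents=docs, question=query))
```

Note how everything Milvus-specific hides behind the VectorStore interface; pointing the same code at a different vector database would only change the `from_texts` call.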
As previously mentioned, integrating Milvus into LangChain involves extending the VectorStore class, which required implementing a few key functions. The last stage of this pipeline is performing searches for relevant data. When a user sends in a question, the question text and any metadata filters are sent to the Milvus VectorStore. Using the same embedding function as before, Milvus embeds the question and performs a similarity search across its data. As a VectorStore, Milvus offers two types of searches: a default one that returns results in their original similarity order, and one that applies max marginal relevance (MMR) ordering. MMR works by finding the examples whose embeddings have the greatest cosine similarity to the input and then iteratively adding them while penalizing each candidate for its closeness to already-selected examples.

Integrating Milvus into LangChain did come with some hiccups, the most prominent being that Milvus cannot handle JSON. The lack of JSON support made things difficult due to the way configurations are generated for Milvus. Currently, there are only two routes for using the Milvus VectorStore: creating one on top of an already existing Milvus collection, or creating one whose schema is derived from the first document passed in. If the collection already exists, its schema is set in stone. All data that gets added must follow the schema to a T; if any field is missing or malformed, the system will ignore the entire entry. Similarly, if any extra metadata is added to a Document after the collection is created, that extra metadata will be ignored. This doesn't lend itself to a very adaptable system, and it results in quite a bit of extra work cleaning up inputs and creating new collections. Luckily, in version 2.3, Milvus will add the ability to store JSON, simplifying this integration and any future ones. Milvus 2.3 is currently in beta, so feel free to check it out and provide us with any feedback!

## 4. Conclusion

The result is a working memory and knowledge base for LLMs. Since the integration was merged, LangChain has changed a bit, introducing the idea of Retrievers. Retrievers are a way to connect to external storage and are the route LangChain will go down. Due to the fast-moving nature of this project, the code in the Milvus VectorStore could be cleaner, and the next steps will be to clean it up and add this Retriever functionality. As previously mentioned, Milvus cannot yet handle a dynamic schema, making it very difficult to deal with premade collections and inserts that are missing data. Once Milvus supports JSON metadata, it will be time to return and redo this integration. Overall, working on this project has been a great experience. The community is friendly, and Harrison is very active and helpful. I am looking forward to seeing the heights that the LangChain project reaches.

*This post was written by Filip Haltmayer, a Software Engineer at Zilliz. We thank Zilliz for their ongoing support of TheSequence.