📝 Guest Post: Do We Still Need Vector Databases for RAG with OpenAI's Built-In Retrieval?
Was this email forwarded to you? Sign up here In this guest post, Jael Gu, an algorithm engineer at Zilliz, will delve into the constraints of OpenAI's built-in retrieval and walk you through creating a customized retriever using Milvus, an open-source vector database. OpenAI once again made headlines with a slew of releases during its DevDay, unveiling the GPT-4 Turbo model, the new Assistants API, and a range of enhancements. The Assistants API emerges as a powerful tool helping developers craft bespoke AI applications catering to specific needs. It also allows them to tap into additional knowledge, longer prompt length, and tools for various tasks. While OpenAI Assistants come with an integrated retrieval feature, it's not perfect — think restrictions on data scale and lack of capability in customization. This is precisely where a custom retriever steps in to help. By harnessing OpenAI's function calling capabilities, developers can seamlessly integrate a customized retriever, elevating the scale of the additional knowledge and better fitting diverse use cases. Let’s dive in! Limitations of OpenAI’s retrieval in Assistants and the role of custom retrieval solutionsOpenAI's built-in Retrieval feature represents a leap beyond the model's inherent knowledge, enabling users to augment it with extra data such as proprietary product information or user-provided documents. However, it grapples with notable limitations. Scalability constraintOpenAI Retrieval imposes file and total storage constraints that might fall short for extensive document repositories:
For organizations with extensive data repositories, these limitations pose challenges. A scalable solution that grows seamlessly without hitting storage ceilings becomes imperative. Integrating a custom retriever powered by a vector database like Milvus or Zilliz Cloud (the managed Milvus) offers a workaround for the file limitations inherent in OpenAI's built-in Retrieval. Lack of customizationWhile OpenAI's Retrieval offers a convenient out-of-the-box solution, it cannot consistently align with every application's specific needs, especially regarding latency and search algorithm customization. Utilizing a third-party vector database grants developers the flexibility to optimize and configure the retrieval process, catering to production needs and enhancing overall efficiency. Lack of multi-tenancyRetrieval is a built-in feature in OpenAI Assistants that only supports individual user usage. However, if you are a developer aiming to serve millions of users with both shared documents and users' private information, the built-in retrieval feature cannot help. Replicating shared documents to each user's Assistant escalates storage costs, while having all users share the same Assistant poses challenges in supporting user-specific private documents. The following graph shows that storing documents in OpenAI Assistants is expensive ($6 per GB per month; for reference, AWS S3 charges $0.023), making storing duplicate documents on OpenAI incredibly wasteful. For organizations harboring extensive datasets, a scalable, efficient, and cost-effective retriever that aligns with specific operational demands is imperative. Fortunately, with OpenAI's flexible function calling capability, developers can seamlessly integrate a custom retriever with OpenAI Assistants. This solution ensures businesses can harness the best AI capabilities powered by OpenAI while maintaining scalability and flexibility for their unique needs. Leveraging Milvus for customized OpenAI retrievalMilvus is an open-source vector database that can store and retrieve billions of vectors within milliseconds. It is also highly scalable to meet users’ rapidly growing business needs. With rapid scaling and ultra-low latency, the Milvus vector database is among the top choices for building a highly scalable and more efficient retriever for your OpenAI assistant. Building a custom retriever with OpenAI function calling and Milvus vector databaseLet’s start building the custom retriever and integrate it with OpenAI by following the step-by-step guide.
|
Older messages
Gemini and Mistral MoE: Both Impactul Altough Very Different Releases
Sunday, December 10, 2023
Next Week in The Sequence: Edge 351: Presents a detailed summary of our series about fine-tuning in foundation models. Edge 352: Will dive into LinkedIn's embedding architecure that power its
📝 Guest Post: How to Maximize LLM Performance*
Friday, December 8, 2023
In this post, Jordan Burgess, co-founder and Chief Product Officer at Humanloop, discusses the techniques for going from an initial demo to a robust production-ready application and explain how tools
Meet Zephyr: How Hugging Face's Instruction Fine Tuned LLM Outperformed Models 10 Times Its Size
Thursday, December 7, 2023
A fine-tuned version of Mistral, Zephyr applied some very clever techniques that led it to outperform LLaMA 70B and other much larger models.
Edge 349: Reinforcement Learning with AI Feedback
Tuesday, December 5, 2023
One of the most promising techniques that uses feedback from AI agents to fine tune foundation models.
📹 [Webinar] Building a Real-Time Fraud Detection System at Signifyd
Monday, December 4, 2023
Fraudsters are always evolving their tactics, such as using AI and LLMs, to bypass detection. To combat fraud, Signifyd, an e-commerce fraud detection platform, uses ML to make instantaneous decisions
You Might Also Like
Daily Coding Problem: Problem #1617 [Easy]
Saturday, November 23, 2024
Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Microsoft. You are given an string representing the initial conditions of some dominoes.
Ranked | The Tallest and Shortest Countries, by Average Height 📏
Saturday, November 23, 2024
These two maps compare the world's tallest countries, and the world's shortest countries, by average height. View Online | Subscribe | Download Our App TIME IS RUNNING OUT There's just 3
⚙️ Your own Personal AI Agent, for Everything
Saturday, November 23, 2024
November 23, 2024 | Read Online Subscribe | Advertise Good Morning. Welcome to this special edition of The Deep View, brought to you in collaboration with Convergence. Imagine if you had a digital
Educational Byte: Are Privacy Coins Like Monero and Zcash Legal?
Saturday, November 23, 2024
Top Tech Content sent at Noon! How the world collects web data Read this email in your browser How are you, @newsletterest1? 🪐 What's happening in tech today, November 23, 2024? The HackerNoon
🐍 New Python tutorials on Real Python
Saturday, November 23, 2024
Hey there, There's always something going on over at Real Python as far as Python tutorials go. Here's what you may have missed this past week: Black Friday Giveaway @ Real Python This Black
Re: Hackers may have stolen everyone's SSN!
Saturday, November 23, 2024
I wanted to make sure you saw Incogni's Black Friday deal, which is exclusively available for iPhone Life readers. Use coupon code IPHONELIFE to save 58%. Here's why we recommend Incogni for
North Korean Hackers Steal $10M with AI-Driven Scams and Malware on LinkedIn
Saturday, November 23, 2024
THN Daily Updates Newsletter cover Generative AI For Dummies ($18.00 Value) FREE for a Limited Time Generate a personal assistant with generative AI Download Now Sponsored LATEST NEWS Nov 23, 2024
📧 Building Async APIs in ASP.NET Core - The Right Way
Saturday, November 23, 2024
Building Async APIs in ASP .NET Core - The Right Way Read on: my website / Read time: 5 minutes The .NET Weekly is brought to you by: Even the smartest AI in the world won't save you from a
WebAIM November 2024 Newsletter
Friday, November 22, 2024
WebAIM November 2024 Newsletter Read this newsletter online at https://webaim.org/newsletter/2024/november Features Using Severity Ratings to Prioritize Web Accessibility Remediation When it comes to
➡️ Why Your Phone Doesn't Want You to Sideload Apps — Setting the Default Gateway in Linux
Friday, November 22, 2024
Also: Hey Apple, It's Time to Upgrade the Macs Storage, and More! How-To Geek Logo November 22, 2024 Did You Know Fantasy author JRR Tolkien is credited with inventing the main concept of orcs and