December 05, 2024

Guest post: When Search Meets AI: The Hybrid Revolution*

Inside Milvus Vector Database's Intelligent Search Solution

As AI-powered search adoption increases, many organizations find that neither semantic search nor traditional full-text search alone can meet every requirement. Semantic search excels at understanding context and intent, while full-text search provides precise, predictable results with term-matching and rule-based scoring. To leverage the strengths of both, hybrid search has emerged as the ideal solution – combining the flexibility of semantic relevance with the precision of lexical matching. However, this approach often introduces operational complexity, particularly when relying on separate systems for each method.

The Hybrid Search Challenge: Managing Complexity with Multiple Systems

While highly effective, hybrid search often requires a complex setup involving two distinct systems. A vector database like Milvus is typically used for semantic search due to its scalability and efficiency, while traditional search engines such as Elasticsearch or OpenSearch handle full-text search. Each system is only responsible for its domain: the vector database provides semantic understanding through vector embeddings, while the other delivers scored term-matching results.

However, this dual-stack architecture introduces significant operational challenges. Managing two separate infrastructures involves maintaining duplicate data pipelines, configuring two distinct APIs, and replicating access control policies to both systems. This increases administrative overhead, raises the risk of data inconsistencies, and complicates policy update workflows. It also increases storage overhead, for example by storing duplicate labels or tags on both systems so that each can apply metadata filtering during search. Separate query paths can also increase search latency and the complexity of client-side code, which has to fetch, merge, and rerank results across systems. This fragmented architecture not only hampers agility but also makes scaling costly.
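To make the client-side merge step concrete, here is a minimal sketch of what that code might look like. It uses reciprocal rank fusion, which is one common way to combine ranked lists and is an assumption here, not something the architectures above prescribe. The document IDs and result lists are hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists: each hit adds 1 / (k + rank) to its document's score."""
    scores: dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from the two systems, best hit first.
semantic_hits = ["doc3", "doc1", "doc7"]  # from the vector database
fulltext_hits = ["doc1", "doc9", "doc3"]  # from the full-text search engine

merged = reciprocal_rank_fusion([semantic_hits, fulltext_hits])
```

In a dual-stack setup, the application owns this logic, along with the two network round trips that produce its inputs.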

Why Is Building a Unified Table Difficult?

A single database that supports both semantic and full-text search would solve the hybrid search challenge described above. Ideally, all data – such as raw text, labels, tags, and indexes for both full-text and vector-based semantic search – would be stored in a unified table. However, achieving this is a complex challenge, whether for vector databases or traditional search engines like Elasticsearch, due to the dynamic nature of search data and the fundamental differences in how these search methods process information.

Vector databases have laid a solid foundation for hybrid search with their ability to efficiently store and retrieve vectors. To add full-text search to a vector database, the most natural approach is to represent documents and queries as sparse vectors, where each dimension corresponds to the score of a token. Since a document typically contains only a small subset of the tokens in the vocabulary, most dimensions have a value of zero, hence the term “sparse.”

Here is an example of using a simple sparse vector approach to conduct full-text search.

Example:

Consider a vocabulary with only five words, where each sparse vector dimension represents one word:

0: hello, 1: world, 2: zilliz, 3: vector, etc.

Insert document:

• Document: "Hello world, world!"

• Tokenization: "hello", "world"

• Scoring (with TF and stats): [{"hello": 1.1}, {"world": 2.1}]

• Mapping tokens to sparse vector dimensions: [0: 1.1, 1: 2.1]

Search query:

• Query: "Hello, zilliz."

• Tokenization: "hello", "zilliz"

• Scoring (with IDF and stats): [{"hello": 0.2}, {"zilliz": 0.1}]

• Mapping tokens to sparse vector dimensions: [0: 0.2, 2: 0.1]

• Scoring documents: dot_product(query_vector, doc_vector)

Where:

• TF: Term Frequency

• IDF: Inverse Document Frequency

• Stats: Corpus statistics, such as the number of documents containing a token.
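The final scoring step in the example can be sketched in a few lines of Python. This is a minimal illustration that stores each sparse vector as a dictionary from dimension to weight; the function name `sparse_dot` is ours, and the weights are the ones from the example.

```python
def sparse_dot(query: dict[int, float], doc: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the smaller vector; only shared dimensions contribute.
    if len(query) > len(doc):
        query, doc = doc, query
    return sum(w * doc[dim] for dim, w in query.items() if dim in doc)

# Weights copied from the example above.
doc_vector = {0: 1.1, 1: 2.1}    # "hello", "world"
query_vector = {0: 0.2, 2: 0.1}  # "hello", "zilliz"

score = sparse_dot(query_vector, doc_vector)
# Only dimension 0 ("hello") is shared, so the score is 1.1 * 0.2.
```

Because "zilliz" does not appear in the document and "world" does not appear in the query, neither contributes to the score.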

Most vector databases support BM25-based full-text search using this approach. However, a basic implementation has a critical flaw.

The Challenge of a Changing Corpus

As this example shows, the scoring of documents and queries relies on corpus statistics. In reality, the corpus is always changing as users add, update, and delete documents. Statistics used for scoring – such as the average document length, token frequencies, and document counts – are constantly in flux.

This creates a critical issue: either the corpus must remain static (no additions or deletions allowed), which is an unrealistic assumption, or the scores will become inaccurate over time.
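A small sketch shows how quickly a precomputed weight can go stale. It uses a commonly used smoothed form of the BM25 IDF term; the corpus sizes and document frequencies are made-up numbers chosen only to illustrate the drift.

```python
import math

def bm25_idf(num_docs: int, doc_freq: int) -> float:
    """BM25 IDF in a commonly used smoothed form."""
    return math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))

# Hypothetical statistics at the time a weight was computed:
# the token appears in 10 of 1,000 documents.
idf_at_insert = bm25_idf(num_docs=1_000, doc_freq=10)

# The same token later, after the corpus has grown and the term became common:
# it now appears in 50,000 of 100,000 documents.
idf_later = bm25_idf(num_docs=100_000, doc_freq=50_000)

# Any weight frozen with idf_at_insert now overstates the token's importance.
drift = idf_at_insert - idf_later
```

The rare token starts with a high weight, but once it appears in half the corpus its IDF falls to ln 2, so scores baked in earlier no longer reflect the live data.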

Limitations of Pre-Trained Vocabularies

Additionally, some implementations use pre-trained vocabularies from public datasets, such as MS MARCO. However, this introduces another problem: words appearing in your documents, such as special terms or names, may not exist in the pre-trained vocabulary. This prevents the sparse vector representation from recognizing these terms, undermining one of the key strengths of full-text search – its ability to handle unique or domain-specific terms effectively.
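The effect is easy to see in a toy example. The vocabulary below is hypothetical and stands in for one built from a public dataset; the helper name `to_sparse_dims` is ours.

```python
# Hypothetical fixed vocabulary, standing in for one built from a public dataset.
PRETRAINED_VOCAB = {"hello": 0, "world": 1, "vector": 2, "search": 3}

def to_sparse_dims(tokens: list[str]) -> tuple[list[int], list[str]]:
    """Map tokens to vocabulary dimensions; return (dimensions, dropped tokens)."""
    dims, dropped = [], []
    for token in tokens:
        if token in PRETRAINED_VOCAB:
            dims.append(PRETRAINED_VOCAB[token])
        else:
            dropped.append(token)  # no dimension exists, so the term is lost
    return dims, dropped

dims, dropped = to_sparse_dims(["hello", "zilliz"])
# "zilliz" has no dimension, so a query for it cannot match any document.
```

A product name like "zilliz" is exactly the kind of term users expect full-text search to match exactly, and it is the first to be dropped.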

What Makes Milvus Stand Out?

Simply relying on sparse vectors does not fully address the full-text search problem, necessitating a more innovative approach.

Milvus is the first vector database to tackle these challenges with a unified design, seamlessly integrating both semantic and full-text search capabilities into a single system. Milvus achieves hybrid search capabilities through its cutting-edge Sparse-BM25 implementation, which integrates full-text search into the vector database architecture. By representing term frequencies as sparse vectors instead of traditional inverted indexes, Sparse-BM25 enables advanced optimizations, such as graph indexing, product quantization (PQ), and scalar quantization (SQ). These optimizations minimize memory usage and accelerate search performance. Similar to the inverted index approach, Milvus supports taking raw text as input and automatically generating sparse vectors internally. This enables it to work with any tokenizer and handle any word that appears in the dynamically changing corpus.

Additionally, Milvus applies heuristic-based pruning to identify and remove low-value sparse vectors that contribute minimally to search quality, thereby enhancing system efficiency without compromising accuracy. Unlike the basic approach discussed earlier, Milvus’ implementation maintains BM25 scoring accuracy even as the corpus grows and evolves over time.
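To illustrate the general idea of keeping scores accurate in a changing corpus, here is a toy sketch. It is not Milvus' actual implementation, which this post does not detail. It assumes one possible design: store only raw term frequencies per document, keep corpus statistics up to date on every insert and delete, and apply the BM25 formula at query time. The class name `LiveBM25Index` and the parameter defaults (k1 = 1.2, b = 0.75) are our own choices.

```python
import math
from collections import Counter

class LiveBM25Index:
    """Toy index: stores raw term frequencies, scores with current corpus stats."""

    def __init__(self, k1: float = 1.2, b: float = 0.75):
        self.k1, self.b = k1, b
        self.docs: dict[str, Counter] = {}  # doc_id -> term frequencies
        self.doc_freq: Counter = Counter()  # token -> number of docs containing it
        self.total_len = 0                  # sum of all document lengths

    def add(self, doc_id: str, tokens: list[str]) -> None:
        tf = Counter(tokens)
        self.docs[doc_id] = tf
        self.doc_freq.update(tf.keys())
        self.total_len += len(tokens)

    def delete(self, doc_id: str) -> None:
        tf = self.docs.pop(doc_id)
        self.doc_freq.subtract(tf.keys())
        self.total_len -= sum(tf.values())

    def score(self, query_tokens: list[str], doc_id: str) -> float:
        n, tf = len(self.docs), self.docs[doc_id]
        avgdl = self.total_len / n
        # Length normalization uses the average document length as of now.
        norm = self.k1 * (1 - self.b + self.b * sum(tf.values()) / avgdl)
        total = 0.0
        for token in set(query_tokens):
            if token not in tf:
                continue
            df = self.doc_freq[token]
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            total += idf * tf[token] * (self.k1 + 1) / (tf[token] + norm)
        return total

index = LiveBM25Index()
index.add("a", ["hello", "world", "world"])
before = index.score(["hello"], "a")
index.add("b", ["hello", "zilliz"])
after = index.score(["hello"], "a")  # same document, updated statistics
```

Because nothing corpus-dependent is frozen into the stored vectors, the score for document "a" changes as soon as document "b" arrives, and returns to its earlier value if "b" is deleted.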

Built in C++ for optimal memory management and equipped with support for memory mapping (MMap) to scale large datasets, Milvus sets a new standard for hybrid search systems. It unifies semantic and full-text search within a single platform, redefining what is possible in the field of hybrid search.

This unique approach delivers full-text search quality comparable to established solutions like Elasticsearch and OpenSearch. At the same time, Milvus outperforms Elasticsearch in semantic search, offering over 3x faster search performance and more than 10x faster index build times, as demonstrated in tests conducted on Zilliz Cloud (fully-managed Milvus) and Elastic Cloud (fully-managed Elasticsearch).

Conclusion: Redefining Hybrid Search with Milvus

The challenges of hybrid search – balancing semantic relevance with precise keyword matching – are compounded by the complexities of managing dual systems. Milvus solves this problem with an innovative, unified approach that integrates semantic and full-text search capabilities into a single, high-performance platform. Its Sparse-BM25 implementation combines the strengths of traditional and vector-based search and fundamentally reimagines how these technologies can work together to ensure optimal accuracy, efficiency, and scalability.

Milvus transforms the hybrid search landscape by eliminating the need for separate infrastructures and consolidating data pipelines. Organizations can now significantly reduce operational overhead, streamline development workflows, and achieve superior search performance while maintaining the highest standard for search quality. This unified approach delivers tangible benefits: dramatically reduced infrastructure costs, simplified maintenance, and enhanced data consistency.

As AI continues to reshape search technology, Milvus stands at the forefront of innovation, making sophisticated hybrid search accessible and practical for organizations of all sizes. Its ability to handle dynamic corpora, adapt to domain-specific vocabularies, and scale efficiently positions it as more than just a solution to current challenges – it's a blueprint for the future of search technology. In a world where search capabilities can make or break user experiences, Milvus offers a clear path forward, proving that the power of semantic and full-text search can be elegantly unified in a single, robust system.

*This post was written by Jiang Chen, the Head of AI Platform and Ecosystem at Zilliz, specially for Turing Post. We thank Zilliz for their insights and ongoing support of Turing Post.