                                                                                                                                          Announcing Hybrid Search General Availability in Mosaic AI Vector Search

                                                                                                                                          Published: August 26, 2024

Product · 4 min read

                                                                                                                                          by Sergei Tsarev and Erik Lindgren

                                                                                                                                          We're excited to announce the general availability of hybrid search in Mosaic AI Vector Search. Hybrid search is a powerful feature that combines the strengths of pre-trained embedding models with the flexibility of keyword search. In this blog post, we'll explain why hybrid search is important, how it works, and how you can use it to improve your search results.

                                                                                                                                          Why Hybrid Search?

Pre-trained embedding models are a powerful way to represent unstructured data, capturing semantic meaning in a compressed and easily searchable format. However, these models are trained on external data and have no explicit knowledge of your data. Hybrid search adds a learned keyword search index on top of your vector search index. The keyword search index is trained on your data, and thus has knowledge of the names, product keys, and other identifiers that matter for your retrieval use case.

                                                                                                                                          When to Choose Hybrid Search

                                                                                                                                          Hybrid search can perform better when there are critical keywords in your dataset that would not be present in publicly available embedding model training datasets. For example, if the question refers to specific product codes or other terms that you want to match exactly, hybrid search may be the better choice. We encourage you to try both options to see what works best for your problem set.

                                                                                                                                          Using Hybrid Search in Mosaic AI Vector Search

                                                                                                                                          It is easy to get started with hybrid search. All indices have access to hybrid search now with no additional setup required.

The keyword index is trained on all text fields in your corpus, so it automatically has access to both the text chunk and all text metadata fields.

For fully-managed Delta Sync indices, you can simply add `query_type="hybrid"` to your similarity search queries. This also works for Direct Vector Access indices with a model serving endpoint attached.

For self-managed Delta Sync indices and Direct Vector Access indices without a model serving endpoint attached, you will need to make sure both `query_vector` and `query_text` are specified.

                                                                                                                                          Quality Improvements

In Retrieval-Augmented Generation (RAG) applications, one critical metric is recall: the fraction of input queries for which the chunk containing the answer appears in the top `num_results` retrieved chunks. We see that hybrid search is able to improve recall, and thus reduce the number of chunks the LLM needs to process to answer the user's question.
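To make the recall metric concrete, here is a minimal sketch of how it could be computed offline. The function name `recall_at_k` and the toy chunk ids are illustrative assumptions, not part of Mosaic AI Vector Search; it also assumes each query has exactly one chunk that contains its answer.

```python
from typing import Sequence


def recall_at_k(
    retrieved: Sequence[Sequence[str]],
    relevant: Sequence[str],
    k: int,
) -> float:
    """Fraction of queries whose answer chunk is among the top-k retrieved ids.

    retrieved[i] is the ranked list of chunk ids returned for query i.
    relevant[i] is the id of the chunk that contains the answer to query i.
    """
    if len(retrieved) != len(relevant):
        raise ValueError("retrieved and relevant must have one entry per query")
    if not relevant:
        return 0.0
    hits = sum(
        1 for ids, answer in zip(retrieved, relevant) if answer in ids[:k]
    )
    return hits / len(relevant)


# Hypothetical toy data: three queries, three ranked chunk ids each.
retrieved = [["c1", "c7", "c3"], ["c9", "c2", "c4"], ["c5", "c6", "c8"]]
relevant = ["c3", "c2", "c0"]

for k in (1, 2, 3):
    print(k, recall_at_k(retrieved, relevant, k))
```

Sweeping `k` this way for both a dense-only and a hybrid retriever yields the kind of recall curve discussed in this section.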

                                                                                                                                          On an internal dataset designed to represent the types of datasets we see from our customers, we see significant improvements in recall. In particular, the number of documents needed to achieve a recall of 0.9 is 50 for pure dense retrieval and 40 for hybrid search, a 20% improvement. This reduces the latency and processing cost for RAG applications.

We include a plot below of recall at various values of the number of results retrieved. Hybrid search performs as well as or better than pure dense retrieval for every choice of the number of retrieved results.

[Figure: recall as a function of the number of results retrieved.]

                                                                                                                                          Method Used

Our implementation of hybrid search is based on Reciprocal Rank Fusion (RRF) of the vector search and keyword search results. The parameters of RRF are tuned to values that should return high-quality results for most datasets.

                                                                                                                                          Scores are normalized so the highest score possible is 1.0. This makes it easy to identify when documents are believed to be high value by both the vector searcher and keyword searcher. Scores close to 1.0 mean that both retrievers found the document to be of high relevance. Scores close to 0.5 and below mean one or both of the retrievers believe the document has low relevance.
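As an illustration of the idea, the sketch below implements textbook Reciprocal Rank Fusion and rescales the result so that a document ranked first by every retriever scores 1.0. It is not the production implementation: the function name `normalized_rrf`, the constant `k = 60` (a commonly used RRF default) and the normalization by the maximum attainable score are all assumptions chosen to match the score behavior described above.

```python
from collections import defaultdict
from typing import Dict, List


def normalized_rrf(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse ranked id lists with RRF, scaled so the best possible score is 1.0.

    Each retriever contributes 1 / (k + rank) for a document at 1-based rank.
    The maximum attainable sum (rank 1 in every list) is len(rankings) / (k + 1),
    so dividing by it maps that case to exactly 1.0.
    """
    max_score = len(rankings) / (k + 1)
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {doc_id: score / max_score for doc_id, score in ordered}


# Hypothetical ranked results from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_a", "doc_d", "doc_b"]

for doc_id, score in normalized_rrf([vector_hits, keyword_hits]).items():
    print(doc_id, round(score, 3))
```

With two retrievers, a document found by only one of them can score at most 0.5 under this scheme, which is consistent with reading scores near or below 0.5 as a sign that at least one retriever considered the document weak.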

                                                                                                                                          Next Steps

                                                                                                                                          Get started today with hybrid search! For fully-managed Delta Sync (DSYNC) indices and direct vector access indices with a model serving endpoint:
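A minimal sketch of such a query, assuming the `databricks-vectorsearch` Python client and an index handle obtained from it. The helper name `hybrid_search_by_text` and the endpoint, index and column names in the comments are hypothetical placeholders.

```python
from typing import Any, Dict, List


def hybrid_search_by_text(
    index: Any,
    query_text: str,
    columns: List[str],
    num_results: int = 5,
) -> Dict[str, Any]:
    """Run a hybrid query on an index that computes the query embedding itself."""
    return index.similarity_search(
        query_text=query_text,   # used for both the embedding and the keyword match
        columns=columns,         # columns to return with each hit
        num_results=num_results,
        query_type="hybrid",     # switch from pure vector search to hybrid search
    )


# Usage sketch (needs a Databricks workspace; all names below are placeholders):
# from databricks.vector_search.client import VectorSearchClient
# index = VectorSearchClient().get_index(
#     endpoint_name="my_endpoint", index_name="main.docs.chunks_index"
# )
# results = hybrid_search_by_text(index, "error code E-4012", ["id", "text"])
```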

                                                                                                                                          For self-managed DSYNC indices and direct vector access indices without a model serving endpoint:
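A minimal sketch for this case, again assuming the `databricks-vectorsearch` Python client. The helper name `hybrid_search_by_vector` is hypothetical, and the query embedding is assumed to come from the same model that produced the vectors stored in the index.

```python
from typing import Any, Dict, List, Sequence


def hybrid_search_by_vector(
    index: Any,
    query_vector: Sequence[float],
    query_text: str,
    columns: List[str],
    num_results: int = 5,
) -> Dict[str, Any]:
    """Run a hybrid query when the caller supplies the query embedding."""
    if not query_vector or not query_text:
        # Hybrid search needs the vector for dense retrieval and the text
        # for the keyword index, so both must be present.
        raise ValueError("both query_vector and query_text are required")
    return index.similarity_search(
        query_vector=list(query_vector),
        query_text=query_text,
        columns=columns,
        num_results=num_results,
        query_type="hybrid",
    )


# Usage sketch (placeholders; embed_query stands in for your own embedding call):
# vector = embed_query("error code E-4012")
# results = hybrid_search_by_vector(index, vector, "error code E-4012", ["id", "text"])
```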

                                                                                                                                          Note that the keyword index automatically uses all text fields in your index, so these need to be provided when constructing the index.

                                                                                                                                          For more information, see our documentation on Hybrid Search:

                                                                                                                                          • Similarity Search calculation details with hybrid search
                                                                                                                                          • Python SDK for similarity_search

                                                                                                                                          Solutions
                                                                                                                                          Databricks For Industries
                                                                                                                                          • Communications
                                                                                                                                          • Financial Services
                                                                                                                                          • Healthcare and Life Sciences
                                                                                                                                          • Manufacturing
                                                                                                                                          • Media and Entertainment
                                                                                                                                          • Public Sector
                                                                                                                                          • Retail
                                                                                                                                          • View All
                                                                                                                                          Cross Industry Solutions
                                                                                                                                          • Cybersecurity
                                                                                                                                          • Marketing
                                                                                                                                          Data Migration
                                                                                                                                          Professional Services
                                                                                                                                          Solution Accelerators
                                                                                                                                          Resources
                                                                                                                                          Documentation
                                                                                                                                          Customer Support
                                                                                                                                          Community
                                                                                                                                          Training and Certification
                                                                                                                                          • Learning Overview
                                                                                                                                          • Training Overview
                                                                                                                                          • Certification
                                                                                                                                          • University Alliance
                                                                                                                                          • Databricks Academy Login
                                                                                                                                          Events
                                                                                                                                          • Data + AI Summit
                                                                                                                                          • Data + AI World Tour
                                                                                                                                          • Data Intelligence Days
                                                                                                                                          • Event Calendar
                                                                                                                                          Blog and Podcasts
                                                                                                                                          • Databricks Blog
                                                                                                                                          • Databricks Mosaic Research Blog
                                                                                                                                          • Data Brew Podcast
                                                                                                                                          • Champions of Data & AI Podcast
                                                                                                                                          About
                                                                                                                                          Company
                                                                                                                                          • Who We Are
                                                                                                                                          • Our Team
                                                                                                                                          • Databricks Ventures
                                                                                                                                          • Contact Us
                                                                                                                                          Careers
                                                                                                                                          • Open Jobs
                                                                                                                                          • Working at Databricks
                                                                                                                                          Press
                                                                                                                                          • Awards and Recognition
                                                                                                                                          • Newsroom
                                                                                                                                          Security and Trust

                                                                                                                                          Databricks Inc.
                                                                                                                                          160 Spear Street, 15th Floor
                                                                                                                                          San Francisco, CA 94105
                                                                                                                                          1-866-330-0121

                                                                                                                                          See Careers at Databricks

                                                                                                                                          © Databricks 2025. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the Apache Software Foundation.

                                                                                                                                          • Privacy Notice
                                                                                                                                          • Terms of Use
                                                                                                                                          • Modern Slavery Statement
                                                                                                                                          • California Privacy
                                                                                                                                          • Your Privacy Choices