Elasticsearch with Azure OpenAI: A Comprehensive Guide

The Tech Platform
Apr 23, 2024

Azure OpenAI is an AI service offered by Microsoft that gives developers the tools they need to build intelligent applications. It uses advanced AI models, such as GPT-4, to understand and generate human-like text, making it a valuable tool for applications ranging from chatbots to content generation.

One of the key features of Azure OpenAI is its integration with Elasticsearch. Elasticsearch is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze large volumes of data quickly and in near real time. By integrating Elasticsearch with Azure OpenAI, developers can combine the strengths of both technologies to create more efficient and effective AI applications.

Understanding Elasticsearch

Elasticsearch is a distributed, RESTful search and analytics engine, designed to handle different data types and use cases. It is built on Apache Lucene, a powerful open-source search library, and provides a simple, coherent API for storing, searching, and analyzing data.

In the context of Azure OpenAI, Elasticsearch is used as a vector database. A vector database stores data as high-dimensional vectors (embeddings) and is optimized for searching that vector space, which many machine learning techniques depend on. This allows Azure OpenAI to run similarity queries over the data, enhancing the capabilities of the AI models.
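To make the idea of searching a vector space concrete, here is a minimal Python sketch of similarity search over stored vectors. It is an illustration only: the three-dimensional vectors, the document IDs, and the helper names are made up for this example, and a real vector database such as Elasticsearch uses far larger vectors and approximate indexes rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, stored, k=2):
    """Return the k stored items most similar to the query vector."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings; real ones come from an embedding model.
stored_vectors = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "office-hours": [0.0, 0.2, 0.9],
}

top = nearest([0.8, 0.2, 0.0], stored_vectors, k=2)
for doc_id, score in top:
    print(doc_id, round(score, 3))
```

The query vector points in nearly the same direction as the "refund-policy" vector, so that document ranks first.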

Elasticsearch in Azure OpenAI

Elasticsearch is used in Azure OpenAI as a vector database for the “On Your Data” feature. This feature lets you run OpenAI models, such as GPT-4, directly on your own data using the Retrieval Augmented Generation (RAG) pattern, with enterprise-grade security on Azure.

Elasticsearch integration with Azure OpenAI enables developers to enrich Large Language Models (LLMs) with precise, contextual information from their business data, ensuring natural, informed, and accurate conversation.

How Elasticsearch works with Azure OpenAI

The image below depicts how Elasticsearch works with Azure OpenAI in a workflow for an AI-powered search experience that uses both technologies.

Working of Elasticsearch with Azure OpenAI — Elastic Search Labs

1. User initiates search: The process begins with a user submitting a search query. This query can be phrased as a question concerning a particular domain's private content.

2. Data retrieval: Elasticsearch, a search engine known for its speed and scalability, facilitates the retrieval of relevant data based on the user's query. This data is likely secured in a private data repository.

3. Processing with Azure OpenAI: The retrieved data is combined with the original user query and fed into a context window. This context window gives Azure OpenAI, a large language model service, the information it needs to understand the user's intent. Azure OpenAI then generates a response that is grounded in the retrieved data and addresses the user's question.

4. Delivering the response: The application delivers the generated response to the user.
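The four steps above can be sketched in a few lines of Python. This is a toy illustration, not the actual integration: the retrieve function stands in for Elasticsearch with simple word overlap, the generate function stands in for the Azure OpenAI call and only returns a placeholder string, and all function names and sample documents are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents, top_n=2):
    """Step 2 stand-in: rank documents by how many query words they share."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]

def build_context_window(query, passages):
    """Step 3a: combine the retrieved passages with the user's question."""
    sources = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {query}"

def generate(prompt):
    """Step 3b stand-in: a real application would send the prompt to the model."""
    return f"[model response to a {len(prompt)}-character prompt]"

def answer(query, documents):
    """Steps 1 to 4: take a query, retrieve, build the prompt, respond."""
    passages = retrieve(query, documents)
    prompt = build_context_window(query, passages)
    return prompt, generate(prompt)

private_docs = [
    "Remote work requires manager approval.",
    "The cafeteria opens at 8 am.",
    "Approval for remote work is renewed yearly.",
]
prompt, response = answer("remote work approval", private_docs)
print(prompt)
print(response)
```

Only the two passages about remote work reach the prompt, which is the point of the retrieval step: the model sees relevant private content rather than the whole repository.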

Benefits of Elasticsearch with Azure OpenAI

There are several benefits of using Elasticsearch with Azure OpenAI:

Integration: You can integrate any Elasticsearch index into your conversational AI, enabling rapid deployment of chat experiences across a wide range of applications.
Data Source: You can bring your existing Elasticsearch indexes to “On Your Data”, whether those indexes live on Azure or on-premises.
Search Capabilities: The integration brings the precision of BM25 (text) search, the semantic understanding of vector search, and the best of both worlds with hybrid search.
Security: Document and field-level security are provided, so users can only access the information they’re entitled to based on their permissions.

These benefits make Elasticsearch a powerful tool when used in conjunction with Azure OpenAI, enhancing the capabilities of the AI models and providing a more efficient and effective AI application.
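As a rough sketch of the three search styles mentioned above, the following Python builds Elasticsearch search request bodies as plain dictionaries: a BM25 text query, a kNN vector query, and a hybrid request that carries both. The field names "content" and "content_vector" are hypothetical, and the exact request shape depends on your Elasticsearch version, so treat this as an assumption to check against the documentation for your cluster.

```python
import json

def bm25_query(text, field="content"):
    """Text search: a match query is scored with BM25 by default."""
    return {"query": {"match": {field: text}}}

def knn_query(vector, field="content_vector", k=5, num_candidates=50):
    """Vector search: approximate nearest neighbors on a dense_vector field."""
    return {
        "knn": {
            "field": field,
            "query_vector": vector,
            "k": k,
            "num_candidates": num_candidates,
        }
    }

def hybrid_query(text, vector):
    """Hybrid search: send the text query and the kNN clause in one request."""
    body = {}
    body.update(bm25_query(text))
    body.update(knn_query(vector))
    return body

# The query vector would normally come from an embedding model.
request_body = hybrid_query("password reset policy", [0.12, 0.48, 0.33])
print(json.dumps(request_body, indent=2))
```

The resulting dictionary is what you would pass as the body of a search request against your index.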

Setting up Elasticsearch in Azure OpenAI

Before setting up Elasticsearch in Azure OpenAI, make sure that your Azure account has owner access to the subscription in which you want to deploy Elasticsearch.

STEP 1: Sign in to your Azure account. Search for "Elasticsearch" in the search bar and click the "Elastic Cloud (Elasticsearch) - An Azure Native ISV Service" option in the list.

STEP 2: Click "+ Create" to create a new Elasticsearch resource.

STEP 3: In the Basics section, enter the following information:

Subscription: Choose the Azure subscription you want to use for the Elasticsearch resource.
Resource Group: Select an existing resource group or create a new one.
Resource Name: Enter a unique name for your Elasticsearch resource.
Region: Select a region in which the service is available.

You can also choose the plan and select the billing option.

Click "Next: Logs & metrics"

STEP 4: In the Logs & metrics section, enter the logs and metric details:

Logs

Check the box labeled “Send subscription activity logs”. This will enable Azure to send logs of all activities occurring in your Azure subscription to Elasticsearch.
Check the box labeled “Send Azure resource logs for all defined resources”. This will enable Azure to send logs of all activities related to the resources defined in your subscription to Elasticsearch.
Limit Logs Collection: You can limit the logs collection by adding actions, names, and values to include or exclude Azure resources with specific tags. This can be useful if you only want to log activities related to certain resources.
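The include and exclude tag rules can be pictured with a small Python sketch. The rule semantics here are assumptions made for illustration, not a statement of how the portal evaluates rules: a resource is excluded if any Exclude rule matches its tags, and when Include rules exist, at least one of them must match. The tag names and values are hypothetical.

```python
def should_send_logs(resource_tags, rules):
    """Decide whether a resource's logs are collected (assumed semantics)."""
    def matches(rule):
        return resource_tags.get(rule["name"]) == rule["value"]

    include_rules = [r for r in rules if r["action"] == "Include"]
    exclude_rules = [r for r in rules if r["action"] == "Exclude"]

    # Assumption: exclusion takes priority over inclusion.
    if any(matches(r) for r in exclude_rules):
        return False
    # Assumption: with Include rules present, one of them must match.
    if include_rules:
        return any(matches(r) for r in include_rules)
    # Assumption: with no Include rules, everything not excluded is sent.
    return True

# Each rule has an action, a tag name, and a tag value.
tag_rules = [
    {"action": "Include", "name": "env", "value": "production"},
    {"action": "Exclude", "name": "team", "value": "sandbox"},
]

print(should_send_logs({"env": "production"}, tag_rules))
print(should_send_logs({"env": "production", "team": "sandbox"}, tag_rules))
print(should_send_logs({"env": "staging"}, tag_rules))
```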

Metrics

You can configure Azure to send metrics of Azure services to Elasticsearch. Metrics provide insights into the performance and health of your resources, and sending them to Elasticsearch allows you to analyze these metrics in real-time.

Click "Next: Tags".

STEP 5: In the “Tags” tab, you can add tags to your resource. Tags are name/value pairs that enable you to categorize resources and view consolidated billing by applying the same tag to multiple resources.

Click "Next: Review + create".

STEP 6: Review all the information you’ve entered. Make sure everything is correct. Once you’ve confirmed all the details, click the “Create” button.

Your Elasticsearch resource will now be created. It may take a few minutes for the resource to be deployed. Once the deployment is complete, you’ll see the resource you created displayed on your Azure dashboard.

Integrating Elasticsearch with Azure OpenAI Service

STEP 1: Open Azure OpenAI and navigate to Azure OpenAI Studio.

STEP 2: On the left panel of Azure OpenAI Studio, click on the "Chat" option.

STEP 3: The chat configuration panel appears. In the right panel, navigate to the "Add your data" section and click the "+ Add your data" option.

STEP 4: Under the "Select data source" option, select "Elasticsearch (preview)" from the dropdown menu.

Enter the following credentials:

Elasticsearch Endpoints: This is the URL of your Elasticsearch cluster. It’s the address that Azure OpenAI will use to connect to your Elasticsearch instance. You should enter the full URL, including the http:// or https:// prefix.
Encoded API Key: This is the key that Azure OpenAI uses to authenticate with your Elasticsearch instance. Enter the encoded API key exactly as Elasticsearch provided it. Keep this key secure, because anyone who has it could access your Elasticsearch data.
Elasticsearch Index: This is the name of the index in your Elasticsearch instance that you want Azure OpenAI to use. An index in Elasticsearch is like a database in a traditional relational database system. You should enter the name of the index exactly as it appears in Elasticsearch.
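Before pasting these values into the form, it can help to check them locally. The sketch below assumes the usual Elasticsearch convention that an encoded API key is the Base64 encoding of the key ID and the key secret joined by a colon, sent in an "Authorization: ApiKey" header. The endpoint, key ID, and secret shown are placeholders, and the helper names are hypothetical.

```python
import base64
from urllib.parse import urlparse

def is_full_url(endpoint):
    """Check that the endpoint includes an http:// or https:// prefix and a host."""
    parsed = urlparse(endpoint)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

def encode_api_key(key_id, api_key):
    """Build the encoded form: Base64 of 'id:api_key' (assumed convention)."""
    raw = f"{key_id}:{api_key}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")

def auth_header(encoded_key):
    """HTTP header that a client would send when authenticating with the key."""
    return {"Authorization": f"ApiKey {encoded_key}"}

# Placeholder values for illustration only; never hard-code real keys.
endpoint = "https://my-deployment.example.com:9243"
encoded = encode_api_key("my-key-id", "my-key-secret")

print(is_full_url(endpoint))
print(auth_header(encoded))
```

In practice, Elasticsearch returns the encoded value when the key is created, so the encoding step is only needed if you kept the ID and secret separately.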

Click "Next"

STEP 5: In the Data Management section, configure how your data is managed and used by Azure OpenAI. Click "Next"

STEP 6: Review all the details. Make sure that everything is correct as per your requirements. Click "Create".

Once you’ve entered these details, you’re all set! Azure OpenAI can now use your Elasticsearch instance as a data source.

Elasticsearch Parameters

Here are some of the parameters used for Elasticsearch in Azure OpenAI:

endpoint: The absolute endpoint path for the Elasticsearch resource to use.
index_name: The name of the index to use in the referenced Elasticsearch.
authentication: The authentication method to use when accessing the defined data source.
embedding_dependency: The embedding dependency for vector search. Required when query_type is vector.
fields_mapping: Customized field mapping behavior when interacting with the search index.
in_scope: Whether queries should be restricted to the use of indexed data. The default is True.
query_type: The query type to use with Elasticsearch. The default is simple.
role_information: Give the model instructions about how it should behave and any context it should reference when generating a response.
strictness: The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. The default is 3.
top_n_documents: The configured top number of documents to feature for the configured query. The default is 5.
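The parameters above can be assembled into a data source definition. The Python sketch below builds one as a plain dictionary, using the defaults listed above and enforcing the one rule the list states (embedding_dependency is required when query_type is vector). The outer wrapper with "type": "elasticsearch" and the shape of the authentication object are assumptions; check them against the Azure OpenAI On Your Data reference before relying on them. All values are placeholders.

```python
import json

def elasticsearch_data_source(endpoint, index_name, encoded_api_key, *,
                              query_type="simple", embedding_dependency=None,
                              in_scope=True, strictness=3, top_n_documents=5,
                              role_information=None, fields_mapping=None):
    """Build a data source definition from the parameters described above."""
    if query_type == "vector" and embedding_dependency is None:
        raise ValueError("embedding_dependency is required when query_type is 'vector'")

    parameters = {
        "endpoint": endpoint,
        "index_name": index_name,
        # Assumed shape of the authentication object.
        "authentication": {"type": "encoded_api_key",
                           "encoded_api_key": encoded_api_key},
        "query_type": query_type,
        "in_scope": in_scope,
        "strictness": strictness,
        "top_n_documents": top_n_documents,
    }
    # Optional parameters are only included when supplied.
    if embedding_dependency is not None:
        parameters["embedding_dependency"] = embedding_dependency
    if role_information is not None:
        parameters["role_information"] = role_information
    if fields_mapping is not None:
        parameters["fields_mapping"] = fields_mapping

    # Assumed wrapper: a type tag plus the parameters object.
    return {"type": "elasticsearch", "parameters": parameters}

source = elasticsearch_data_source(
    "https://my-deployment.example.com:9243",  # placeholder endpoint
    "company-handbook",                        # placeholder index name
    "PLACEHOLDER_ENCODED_KEY",
    role_information="You answer questions about the company handbook.",
)
print(json.dumps(source, indent=2))
```

Raising strictness or lowering top_n_documents narrows what reaches the model, which trades recall for precision as described above.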

Elasticsearch Use Cases in Azure OpenAI

Elasticsearch offers several advantages when combined with Azure OpenAI:

Enhanced Search Capabilities:

Internal Knowledge Search: Improve employee efficiency by making it easier to find relevant information within the organization.
User Self-Service: Empower customers to find answers to questions independently through self-service portals powered by Elasticsearch.
AI-Powered Search Applications: Developers can leverage Elasticsearch and Azure OpenAI to create intelligent search experiences that understand user intent and deliver highly relevant results.

Text Processing and Vectorization:

Chatbots: Develop chatbots that can handle common workflows using Elasticsearch for data retrieval and Azure OpenAI for natural language processing (NLP).
Transforming Text into Vectors: Elasticsearch integrates seamlessly with OpenAI's APIs to convert textual data into vectors, a format ideal for machine learning tasks like similarity search.
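Once text has been turned into vectors, the index needs a field that can hold them. The following Python sketch shows what such an index mapping might look like, using the Elasticsearch dense_vector field type, along with a small check that a document's vector has the expected length. The field names are hypothetical, and the dimension of 1536 is an assumption that matches some OpenAI embedding models; use the dimension of whichever model you deploy.

```python
import json

EMBEDDING_DIMS = 1536  # assumed; must equal your embedding model's output size

index_mapping = {
    "mappings": {
        "properties": {
            # Original text, searchable with ordinary full-text queries.
            "content": {"type": "text"},
            # Embedding of the text, searchable with kNN queries.
            "content_vector": {
                "type": "dense_vector",
                "dims": EMBEDDING_DIMS,
                "index": True,
                "similarity": "cosine",
            },
        }
    }
}

def vector_fits_mapping(document, mapping, field="content_vector"):
    """Check that a document's vector length matches the mapped dimension."""
    expected = mapping["mappings"]["properties"][field]["dims"]
    return len(document.get(field, [])) == expected

doc = {"content": "How to reset a password", "content_vector": [0.0] * EMBEDDING_DIMS}
print(json.dumps(index_mapping, indent=2))
print(vector_fits_mapping(doc, index_mapping))
```

A vector of the wrong length would be rejected at indexing time, so a check like this is a cheap way to catch a mismatched embedding model early.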

Case Studies of Organizations Using Elasticsearch with Azure OpenAI

Young Williams: This organization utilized an AI assistant named Priya, powered by Elasticsearch and Azure OpenAI, to streamline SNAP assistance. Priya provides timely and accurate information, enhancing both customer experience and operational efficiency.
UK Ministry of Defence (DE&S): DE&S leveraged this combination to empower 260,000 employees with easy access to ever-changing cybersecurity protocols and policy information. Additionally, the security team benefits from efficient tools to answer employee questions.
Relativity: This eDiscovery leader utilizes Elasticsearch and Azure OpenAI to build innovative search experiences for legal and discovery purposes.

Conclusion

Elasticsearch in Azure OpenAI opens up a world of possibilities for developers and organizations alike. It provides a powerful, scalable, and efficient solution for storing, searching, and analyzing large volumes of data in near real-time.

From powering chatbots and user self-service portals to enabling internal knowledge search and building AI-powered search applications, the use cases of Elasticsearch in Azure OpenAI are vast and varied. Organizations across various sectors have already started harnessing the power of this integration to revolutionize their operations and enhance customer experience.