top of page

Introduction to Azure OpenAI Assistants

Updated: Feb 21

In the rapidly evolving world of technology, artificial intelligence (AI) has become a crucial player, transforming numerous industries and reshaping the way we interact with the digital world. One such innovation that stands at the forefront of this revolution is the Azure OpenAI Assistants.

In this article, we will learn about Azure OpenAI Assistants, exploring its key features, and capabilities, and how it’s paving the way for a new era of AI-powered interactions.

Azure OpenAI Assistants
Azure OpenAI Assistants

Table of Contents:

Azure OpenAI Assistants

Azure OpenAI Assistants are a feature of Azure OpenAI Service that allows you to create AI assistants tailored to your specific needs. These assistants can be customized through instructions and are augmented by advanced tools like a code interpreter and custom functions. They provide sophisticated, copilot-like experiences that can sift through data, suggest solutions, and automate tasks.

Azure OpenAI Assistants components




Custom AI that uses Azure OpenAI models in conjunction with tools.


A conversation session between an Assistant and a user. Threads store Messages and automatically handle truncation to fit content into a model’s context.


A message created by an Assistant or a user. Messages can include text, images, and other files. Messages are stored as a list on the Thread.


Activation of an Assistant to begin running based on the Thread's contents. The Assistant uses its configuration and the Thread’s Messages to perform tasks by calling models and tools. As part of a Run, the Assistant appends Messages to the Thread.

Run Step

A detailed list of steps the Assistant took as part of a Run. An Assistant can call tools or create Messages during it’s run. Examining Run Steps lets you understand how the Assistant is getting to its final results.

Persistent and Infinitely Long Threads:

Azure OpenAI Assistants support persistent and infinitely long threads. This means you can have ongoing conversations with the Assistant without worrying about losing context or history.

This feature is particularly useful for complex tasks that require multiple steps or for interactions that span over a long period.

For example, you could start a conversation with the Assistant, leave it midway, and then come back to it after some time. The Assistant would still remember the context and be able to continue the conversation from where you left off.

Supported Models and File Types

Azure OpenAI Assistants support a variety of models, each with different capabilities. The models include

GPT-4 and GPT-3.5 models can understand and generate natural language and code. The most capable and cost-effective model in the GPT-3.5 family is GPT-3.5 Turbo, which has been optimized for chat and works well for traditional completion tasks.

The models page contains the most up-to-date information on regions/models where Assistants are currently supported. For Assistants, you need a combination of a supported model and a supported region.

Azure OpenAI Assistants is currently available in the following regions:


gpt-35-turbo (0613)

gpt-35-turbo (1106)

gpt-4 (0613)

gpt-4 (1106)

Australia East





East US 2





Sweden Central





Supported File types for Code Interpreter

File Format





















































application/xml or text/xml

Files can be uploaded via Studio, or programmatically. The file_ids parameter is required to give tools like code_interpreter access to files. When using the File upload endpoint, you must have the purpose set to assistants to be used with the Assistants AP.

Tools and Custom Functions in Azure OpenAI Assistants

An OpenAI assistant can access up to 128 tools. Azure OpenAI Assistants provide a variety of tools to enhance their capabilities:

  1. Code Interpreter: This is an OpenAI-hosted tool that allows the assistant to write and run Python code in a sandboxed execution environment. With Code Interpreter enabled, your assistant can run code iteratively to solve more challenging code, math, and data analysis problems. It’s important to note that Code Interpreter has additional charges beyond the token-based fees for Azure OpenAI usage.

  2. Retrieval: This tool is used to retrieve information from a database or a set of documents. However, as of now, there seems to be an issue with the retrieval tool as it is not supported.

  3. Third-Party Tools: In addition to OpenAI-hosted tools, Azure OpenAI Assistants can also call third-party tools via a function. This allows you to extend the capabilities of the assistant beyond the provided tools

Custom Functions:

Custom functions in Azure OpenAI Assistants are a powerful feature that allows you to extend the capabilities of your assistant.

You can define your custom tools via functions. These custom function definitions allow the models to formulate API calls and structure data outputs based on your specifications.

This means you can create functions that perform specific tasks, such as retrieving data from an API, performing calculations, or interacting with a database.

The latest versions of gpt-35-turbo and gpt-4 are fine-tuned to work with functions and can both determine when and how a function should be called.

If one or more functions are included in your request, the model determines if any of the functions should be called based on the context of the prompt.

When the model determines that a function should be called, it responds with a JSON object including the arguments for the function.

Executing Functions:

While the models can generate these calls, it’s up to you to execute them, ensuring you remain in control. This process can be broken down into three steps:

  1. Call the chat completions API with your functions and the user’s input.

  2. Use the model’s response to call your API or function.

  3. Call the chat completions API again, including the response from your function to get a final response.

Parallel Function Calling:

Parallel function calls are supported, allowing you to perform multiple function calls together, allowing for parallel execution and retrieval of results. This reduces the number of calls to the API that need to be made and can improve overall performance.

For example, for a simple weather app, you may want to retrieve the weather in multiple locations at the same time. This will result in a chat completion message with three function calls in the tool_calls array, each with a unique id.

If you wanted to respond to these function calls, you would add 3 new messages to the conversation, each containing the result of one function call, with a tool_call_id referencing the id from tools_calls.

Supported Models

Parallel function calling is supported on the following models:

  1. gpt-4-turbo-preview

  2. gpt-4-0125-preview

  3. gpt-4-1106-preview

  4. gpt-3.5-turbo-0125

  5. gpt-3.5-turbo-1106.


Azure OpenAI Assistants represents a significant leap forward in the field of artificial intelligence. Its ability to provide custom instructions, utilize advanced tools, define custom functions, and support persistent and infinitely long threads makes it a versatile and powerful tool for developers and businesses alike.

Remember, Azure OpenAI Assistants are currently in preview and the capabilities and features may change over time.


bottom of page