Function Calling with Voice Agents

Learn some tips and strategies for using Function Calling with your Voice Agent.

What is Function Calling?

Function calling refers to the ability of large language models (LLMs) to invoke external functions or APIs in response to user queries, rather than simply generating text from their training data. This capability lets an LLM integrate with other systems, services, or databases to provide real-time data, perform specific tasks, or trigger actions.
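
As a concrete illustration, here is a minimal sketch of what a function definition might look like. The exact schema varies by provider; the `get_weather` name and parameter fields below are illustrative assumptions, not any specific vendor's API.

```python
# A hypothetical function definition, expressed as a plain dict.
# Most providers accept something similar: a name, a description the
# model uses to decide when to call it, and a JSON Schema for parameters.
get_weather_function = {
    "name": "get_weather",
    "description": "Get the current weather for a given city.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name, e.g. 'New York'",
            },
        },
        "required": ["location"],
    },
}
```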

How Function Calling Works

  • User Query: A user asks the LLM something that requires external data or specific action (e.g., "Check the weather in New York" or "Book an appointment").
  • Function Identification: The LLM identifies that the query requires a specific function to be called. For instance, if the user asks for the weather, the model recognizes that it needs to call a weather API rather than generate a general response.
  • Parameter Extraction: The LLM analyzes the user's query to extract the required parameters (e.g., location, date, or other variables). For example, in the weather query, "New York" would be extracted as the location parameter (see the sketch after this list).
  • Call the Function: The LLM emits a function call with the appropriate parameters, and your application executes it. This could involve fetching live data, performing a task (e.g., making a booking), or retrieving information that is outside the LLM's static knowledge.
  • Return the Result: The function returns the result (such as the current weather data), which the LLM incorporates into its response back to the user.
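
To make parameter extraction concrete, here is a small Python sketch of a hypothetical function call emitted by a model. Many providers serialize the extracted arguments as a JSON string, so the application parses them before use; the field names here are assumptions, not a specific API.

```python
import json

# A hypothetical function call emitted by the model: "New York" has
# been extracted from the user's query as the location parameter.
function_call = {"name": "get_weather",
                 "arguments": '{"location": "New York"}'}

# Arguments often arrive as a JSON string, so parse them before use.
args = json.loads(function_call["arguments"])
print(args["location"])  # New York
```
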
Function Calling Flow Diagram

A process flow of the Voice Agent API with function calling.


Here's a step-by-step breakdown of what the diagram illustrates:

1. Your application calls the API

Your application sends a prompt to the LLM, along with definitions of the available functions that the LLM can call. These functions can represent specific operations or tasks that the LLM might trigger.
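
A minimal sketch of what such a request might look like, assuming a generic chat-style API. The field names (`messages`, `functions`) and the `get_weather` definition are illustrative, not a specific provider's schema.

```python
# A hypothetical request body: the user's prompt plus the definitions
# of the functions the model is allowed to call. Field names here are
# illustrative; consult your provider's docs for the exact schema.
get_weather_function = {
    "name": "get_weather",
    "description": "Get the current weather for a given city.",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

request_body = {
    "messages": [{"role": "user", "content": "Check the weather in New York"}],
    "functions": [get_weather_function],
}
```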

2. The model makes a decision

The LLM analyzes the input and determines whether to respond directly to the user or call one or more functions. This decision depends on the content of the prompt and whether the LLM deems a function execution necessary to fulfill the user's request.
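
The sketch below shows how an application might branch on that decision, assuming a hypothetical response shape in which a direct answer carries a `content` field and a function request carries a `function_call` field.

```python
# Hypothetical response shapes; real field names vary by provider.
def route(response: dict) -> str:
    # The model either answered directly or asked for a function call.
    if "function_call" in response:
        call = response["function_call"]
        return f"execute {call['name']} with arguments {call['arguments']}"
    return response["content"]

print(route({"content": "Paris is the capital of France."}))
print(route({"function_call": {"name": "get_weather",
                               "arguments": '{"location": "New York"}'}}))
```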

3. API response with function specification

The API responds to your application by specifying which function (or functions) should be called and what arguments should be passed. This is the step where the LLM returns a clear action for your application to take.
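
For example, a hypothetical response payload might look like the following. Real payloads differ by provider, but most include a function name plus serialized arguments.

```python
# A hypothetical response specifying the function to call and its
# arguments. content is empty because the model is deferring its
# user-facing answer until the function result comes back.
response = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_weather",
        "arguments": '{"location": "New York"}',
    },
}

call = response["function_call"]
print(call["name"], call["arguments"])  # get_weather {"location": "New York"}
```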

4. Your application executes the function

After receiving the function and arguments from the API, your application executes the function with the specified arguments, completing the operation.
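
A common pattern here is a dispatch table mapping the function names the model may emit to local implementations. The sketch below continues the hypothetical `get_weather` example, with a stubbed-out implementation.

```python
import json

# Hypothetical dispatch table: map function names the model may emit
# to local Python implementations. get_weather is stubbed out here.
def get_weather(location: str) -> dict:
    # A real application would call an actual weather service here.
    return {"location": location, "temperature_f": 68, "conditions": "sunny"}

HANDLERS = {"get_weather": get_weather}

def execute(call: dict) -> dict:
    # Look up the handler by name and invoke it with the parsed arguments.
    handler = HANDLERS[call["name"]]
    return handler(**json.loads(call["arguments"]))

result = execute({"name": "get_weather",
                  "arguments": '{"location": "New York"}'})
print(result)  # {'location': 'New York', 'temperature_f': 68, ...}
```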

5. Your application sends the result back to the API

Once the function has been executed, your application sends the result (along with the original prompt) back to the API. The model uses this result to generate the final output, which can either be returned to the user or used for further operations.
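
Continuing the hypothetical example, the final round trip might append the function's result to the conversation and request a user-facing answer. The message roles and field names below are assumptions, not a specific provider's format.

```python
# Hypothetical final round trip: the conversation so far, now including
# the executed function's result. Roles and field names are illustrative.
messages = [
    {"role": "user", "content": "Check the weather in New York"},
    {"role": "assistant",
     "function_call": {"name": "get_weather",
                       "arguments": '{"location": "New York"}'}},
    {"role": "function", "name": "get_weather",
     "content": '{"location": "New York", "temperature_f": 68, "conditions": "sunny"}'},
]

# A second request carrying these messages lets the model turn the raw
# data into natural language, e.g. "It's 68 degrees and sunny in New York."
```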

Code Samples

📘 For more ideas and code samples for using Function Calling with your Agent, check out this repository.