Toolkits
Toolkits are collections of related tools that an agent can use to perform actions. They are the primary way to extend an agent's capabilities.
AsyncBaseToolkit
All toolkits inherit from the AsyncBaseToolkit abstract base class. This class provides a standardized interface for creating and managing tools. The core requirement for any toolkit is to implement the get_tools_map() method, which returns a dictionary mapping tool names to their corresponding Python functions.
The base class automatically handles the conversion of these functions into FunctionTool objects that the agent runner can understand and execute.
All available toolkits are registered in the TOOLKIT_MAP dictionary within utu/tools/__init__.py.
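The pattern described above can be sketched with a small stand-in. This is not the framework's code: SketchBaseToolkit, MathToolkit, SKETCH_TOOLKIT_MAP, and call_tool are hypothetical names, and treating get_tools_map() as an async method is an assumption. The real AsyncBaseToolkit may take configuration arguments and differ in its signatures, and the conversion to FunctionTool objects is left out entirely.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Awaitable, Callable, Dict


# Hypothetical stand-in for AsyncBaseToolkit; the real base class in the
# framework may have a different constructor and method signatures.
class SketchBaseToolkit(ABC):
    @abstractmethod
    async def get_tools_map(self) -> Dict[str, Callable[..., Awaitable[str]]]:
        """Return a mapping of tool name -> async callable."""


class MathToolkit(SketchBaseToolkit):
    """Illustrative toolkit that exposes a single tool."""

    async def add(self, a: float, b: float) -> str:
        """Add two numbers and return the result as text."""
        return str(a + b)

    async def get_tools_map(self) -> Dict[str, Callable[..., Awaitable[str]]]:
        # Tool names are the keys; bound methods are the values.
        return {"add": self.add}


# Hypothetical registry in the spirit of TOOLKIT_MAP: name -> toolkit class.
SKETCH_TOOLKIT_MAP = {"math": MathToolkit}


async def call_tool(toolkit_name: str, tool_name: str, **kwargs) -> str:
    """Look up a toolkit by name, fetch its tools map, and call one tool."""
    toolkit = SKETCH_TOOLKIT_MAP[toolkit_name]()
    tools = await toolkit.get_tools_map()
    return await tools[tool_name](**kwargs)


if __name__ == "__main__":
    print(asyncio.run(call_tool("math", "add", a=2, b=3)))
```

Keeping the name-to-function mapping in one method means a toolkit author only writes plain functions with docstrings, and the base class can wrap them uniformly for the agent runner.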
Summary of Core Toolkits
Here is a summary of some key toolkits available in the framework:
Toolkit Class | Provided Tools (Functions) | Core Functionality & Mechanism |
---|---|---|
SearchToolkit | search_google_api, web_qa | Performs web searches using the Serper API and reads webpage content using the Jina API. It can use an LLM to answer questions based on page content. |
DocumentToolkit | document_qa | Processes local or remote documents (PDF, DOCX, etc.). It uses the chunkr.ai service to parse the document and an LLM to answer questions or provide a summary. |
PythonExecutorToolkit | execute_python_code | Executes Python code snippets in an isolated environment using IPython.core.interactiveshell. It runs in a separate thread to prevent blocking and can capture outputs, errors, and even matplotlib plots. |
BashToolkit | run_bash | Provides a persistent local shell session using the pexpect library. This allows the agent to run a series of commands that maintain state (e.g., current directory). |
ImageToolkit | image_qa | Answers questions about an image or provides a detailed description. It uses a vision-capable LLM to analyze the image content. |
AudioToolkit | audio_qa | Transcribes audio files using an audio model and then uses an LLM to answer questions based on the transcription. |
CodesnipToolkit | run_code | Executes code in various languages (Python, C++, JS, etc.) by sending it to a remote sandbox service (like SandboxFusion) and returning the result. |
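The "separate thread to prevent blocking" mechanism noted for PythonExecutorToolkit can be illustrated with the standard library alone. This is a rough sketch, not the toolkit's implementation: it substitutes the built-in exec for IPython.core.interactiveshell, provides no isolation, and does not capture plots. The names _run_snippet and execute_python_code_sketch, and the timeout parameter, are hypothetical.

```python
import asyncio
import contextlib
import io
import traceback


def _run_snippet(code: str) -> dict:
    """Blocking helper: run a snippet and capture its stdout and any error."""
    stdout = io.StringIO()
    error = ""
    namespace: dict = {}  # fresh globals for each snippet
    try:
        # redirect_stdout swaps sys.stdout process-wide, so this sketch is
        # only safe when a single snippet runs at a time.
        with contextlib.redirect_stdout(stdout):
            exec(code, namespace)  # assumption: stands in for an IPython shell
    except Exception:
        error = traceback.format_exc()
    return {"stdout": stdout.getvalue(), "error": error}


async def execute_python_code_sketch(code: str, timeout: float = 10.0) -> dict:
    """Run the blocking helper in a worker thread so the event loop stays free."""
    # A timeout stops the wait, but the worker thread itself keeps running.
    return await asyncio.wait_for(asyncio.to_thread(_run_snippet, code), timeout)


if __name__ == "__main__":
    print(asyncio.run(execute_python_code_sketch("print(1 + 1)")))
```

Returning captured output and the traceback as data, rather than raising, lets the agent read the error text and retry with corrected code.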