How to build AI Agents with aiXplain

Learn how to build AI agents with this step-by-step guide. The article walks through designing and building an agent that can search Wikipedia, extract relevant information, and answer in both text and audio formats. It is written for beginners and experienced developers alike. Thiago Castro Ferreira, an applied scientist at aiXplain, demonstrates each step on the aiXplain platform.

Check this Google Colab

Introduction to the Demonstration

Thiago’s video shows how to create AI agents with the aiXplain SDK. The aiXplain platform offers a marketplace of models from various providers, which users can integrate and combine to build AI solutions.

Step-by-Step Rundown

1. Installing the aiXplain SDK

To begin, Thiago installed the aiXplain Python SDK, which is needed to create and manage agents and to use the models available on the aiXplain platform. In a notebook such as Colab, the leading “!” runs the command in the shell:

!pip install aixplain
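
After installing, it can help to confirm which version of the SDK the notebook actually picked up. The following is a minimal sketch using only the Python standard library. The helper name installed_version is illustrative and not part of the aiXplain SDK, and the sketch assumes the distribution is named aixplain, as in the pip command above:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("aixplain")
if version is None:
    print("aixplain is not installed; run the pip command above first.")
else:
    print(f"aixplain {version} is installed.")
```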

2. Setting Up Your API Key

To access the models and tools available on the aiXplain platform, it’s crucial to set up the team API key as an environment variable. This key allows users to authenticate their requests and interact with the platform’s resources. Here’s how to find your API key:

Navigate to the aiXplain Studio
Click on <your name> in the top right corner
Go to Account → General settings → Integrations
Copy your team API key or create a new one if needed

Once you have your API key, set it as an environment variable in your Colab notebook.

import os
os.environ["TEAM_API_KEY"] = "your_team_api_key_here"
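
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. One possible alternative, sketched below, is to ask for the key interactively only when TEAM_API_KEY is not already set. The helper name ensure_team_api_key and its parameters are hypothetical and not part of the aiXplain SDK; only the variable name TEAM_API_KEY comes from the step above:

```python
import os
from getpass import getpass


def ensure_team_api_key(env=os.environ, prompt=getpass):
    """Return the key stored in env["TEAM_API_KEY"], asking for it only if unset."""
    key = env.get("TEAM_API_KEY", "").strip()
    if not key:
        # getpass hides the typed value, so the key is not echoed in the notebook
        key = prompt("Team API key: ").strip()
        if not key:
            raise ValueError("A team API key is required.")
        env["TEAM_API_KEY"] = key
    return key
```

Calling ensure_team_api_key() in place of the hard-coded assignment leaves an existing value untouched and otherwise stores whatever is typed at the prompt.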

3. Creating Model Tools

Using aiXplain, users can create agents equipped with various tools. Thiago created tools for speech synthesis, translation, and named entity recognition:

from aixplain.factories import AgentFactory
from aixplain.modules.agent import ModelTool
from aixplain.enums import Function, Supplier

speech_synthesis_tool = ModelTool(
    function=Function.SPEECH_SYNTHESIS,
    supplier=Supplier.GOOGLE
)

translation_tool = ModelTool(
    function=Function.TRANSLATION,
    supplier=Supplier.MICROSOFT
)

ner_tool = ModelTool(
    function=Function.NAMED_ENTITY_RECOGNITION,
    supplier=Supplier.MICROSOFT
)

4. Creating Pipeline Tools

He also created a pipeline tool to extract personal information from Wikipedia:

from aixplain.modules.agent import PipelineTool

pipeline_tool = PipelineTool(
    description="Personal Information Extractor given a figure person name",
    pipeline="66b1157b2af845462188be53"
)

5. Creating an AI Agent

After configuring the tools, Thiago created an agent named “Wiki-Agent1” and set it up to use GPT-4o for text generation:

from aixplain.factories import AgentFactory

agent = AgentFactory.create(
    name="Wiki-Agent1",
    tools=[
        speech_synthesis_tool,
        ner_tool,
        translation_tool,
        pipeline_tool
    ],
    description="Using Wikipedia to answer questions",
    llm_id="6646261c6eb563165658bbb1" # GPT 4o
)
agent.id

6. Getting the Agent

To retrieve and interact with the created agent:

agent = AgentFactory.get(agent.id)

7. Invoking Agents

Thiago demonstrated how to run the agent with a sample question and get the response with an audio link:

agent_response1 = agent.run("What is the name of the driver who won Formula One championship in 2023? Answer in an English audio")
print(agent_response1)

agent_response1["data"]

To play the audio response:

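import os
import re

import requests
from IPython.display import Audio, display

def display_audio(agent_response):
    # Find the first https URL in the agent's text output
    pattern = r"https://[^\s/$.?#].[^\s]*"
    urls = re.findall(pattern, agent_response["data"]["output"])
    if not urls:
        print("No audio URL found in the agent output.")
        return
    # Strip link residue such as a trailing ")." or ")"
    sound_file = urls[0].replace(").", "").replace(")", "")
    print(sound_file)

    response = requests.get(sound_file)
    response.raise_for_status()  # stop here if the download failed

    # Save the audio, play it inline, then remove the temporary file
    with open("downloaded_file.mp3", "wb") as file:
        file.write(response.content)
    display(Audio("downloaded_file.mp3", autoplay=True))
    os.remove("downloaded_file.mp3")

display_audio(agent_response1)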
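
The regular expression and chained replace calls above are tuned to a URL wrapped in parentheses. A slightly more general approach, sketched here as an assumption rather than as the method used in the demonstration, is to stop matching at whitespace or a closing bracket and then trim trailing sentence punctuation. The helper name extract_first_url is hypothetical, and the sample text and URL are made up:

```python
import re

# Match an https URL up to the next whitespace or closing bracket/quote.
URL_PATTERN = re.compile(r"https://[^\s)\]>\"']+")


def extract_first_url(text):
    """Return the first https URL found in text, or None if there is none."""
    match = URL_PATTERN.search(text)
    if match is None:
        return None
    # Drop sentence punctuation that often follows a URL in prose.
    return match.group(0).rstrip(".,;:!?")


# Made-up agent output, only to show the behavior.
sample = "Here is the audio: [listen](https://example.com/audio/answer.mp3)."
print(extract_first_url(sample))
```

This version would not handle URLs that legitimately contain parentheses or quotes, so it is a trade-off rather than a drop-in replacement.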

8. Invoking Agent with Short-term Memory

Thiago showed how to use the session ID to ask follow-up questions within an active session:

session_id = agent_response1["data"]["session_id"]
print(f"Session id: {session_id}")

agent_response2 = agent.run("Extract the personal information about that driver.", session_id=session_id)
print("\nResponse:")
print(agent_response2)

agent_response2["data"]["output"]

agent_response3 = agent.run("What about in 1991? I want the answer in German text this time", session_id=session_id)
print("\nResponse:")
print(agent_response3)

agent_response4 = agent.run("get me his personal info", session_id=session_id)
print("\nResponse:")
print(agent_response4)

# No session_id is passed here, so this question is asked outside the earlier session
response = agent.run("Give me the personal info of the Brazilian athlete who most won Olympic medals for their country.")
print(response)

response["data"]["output"]
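
The follow-up calls above all thread the session ID from the first response into later runs. That pattern can be sketched as a small loop. In the sketch below, ask_in_session and EchoAgent are hypothetical names: EchoAgent is only a stand-in that mimics the response shape used in this section (a "data" dictionary with "session_id" and "output"), so the example runs without the SDK or an API key:

```python
class EchoAgent:
    """Stand-in for a real agent; it mimics only the response shape used above."""

    def __init__(self):
        self.history = {}

    def run(self, query, session_id=None):
        # A new session is opened whenever no session_id is passed in.
        if session_id is None:
            session_id = f"session-{len(self.history) + 1}"
        self.history.setdefault(session_id, []).append(query)
        turn = len(self.history[session_id])
        return {"data": {"session_id": session_id, "output": f"turn {turn}: {query}"}}


def ask_in_session(agent, questions):
    """Ask questions in order, reusing the session id returned by the first call."""
    session_id = None
    outputs = []
    for question in questions:
        if session_id is None:
            response = agent.run(question)
        else:
            response = agent.run(question, session_id=session_id)
        session_id = response["data"]["session_id"]
        outputs.append(response["data"]["output"])
    return session_id, outputs


session_id, outputs = ask_in_session(
    EchoAgent(),
    ["Who won in 2023?", "Extract the personal information about that driver."],
)
print(session_id)
for output in outputs:
    print(output)
```

Passing the agent created earlier instead of EchoAgent() should work the same way, assuming its responses keep the shape shown in this section.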

9. Deleting the Agent

For agents created for testing or temporary purposes, the aiXplain platform offers an easy way to delete them:

agent.delete()
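
When an agent exists only for the duration of a test, the cleanup can be tied to the code that uses it so the delete call is not forgotten. The sketch below shows one way to do that with a context manager. The names temporary_agent and StubAgent are hypothetical; StubAgent stands in for a real agent and implements only the delete() method used above:

```python
from contextlib import contextmanager


@contextmanager
def temporary_agent(create_agent):
    """Create an agent, hand it to the caller, and delete it afterwards."""
    agent = create_agent()
    try:
        yield agent
    finally:
        # Runs even if the body raises, so test agents are not left behind.
        agent.delete()


class StubAgent:
    """Stand-in used here so the sketch runs without the SDK or an API key."""

    def __init__(self):
        self.deleted = False

    def delete(self):
        self.deleted = True


with temporary_agent(StubAgent) as agent:
    print("deleted inside block:", agent.deleted)
print("deleted after block:", agent.deleted)
```

With the real SDK, create_agent could be a function that wraps the AgentFactory.create call from step 5.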

Conclusion

The aiXplain platform provides the tools and models needed to create and use AI agents, from simple chatbots to more complex data retrieval systems. Thiago’s demonstration covered the full lifecycle: installing the SDK, creating model and pipeline tools, building and invoking an agent, keeping context across a session, and deleting the agent when it is no longer needed.

Stay tuned for more tutorials and insights into the exciting world of AI with aiXplain!