Agent

Bases: ABC

Abstract base class for creating intelligent agents that interact with platforms.

This class serves as a blueprint for implementing various types of agents (e.g., chatbots, code assistants, content generators) that can process prompts and generate responses using different AI models.

Attributes:

Name | Type | Description
platform | Platform | The platform instance this agent operates on
model | str | The AI model identifier (e.g., 'gpt-4', 'llama3') used for generating responses
response_model | Optional[BaseModel] | Pydantic model defining the structure of expected responses, or None if free-form text responses are acceptable
system_prompt | Optional[str] | System-level instructions that define the agent's behavior and role

Example

    from typing import Any, List, Optional

    class ChatAgent(Agent):
        def run(self, prompt: str, media: Optional[List[Media]] = None, **kwargs) -> Any:
            # Implementation here
            pass

run(prompt, media=None, **kwargs) abstractmethod

Execute the agent with a given prompt and optional media.

This abstract method must be implemented by subclasses to define how the agent processes input and generates responses.

Parameters:

Name | Type | Description | Default
prompt | str | The user's input or instruction for the agent | required
media | Optional[List[Media]] | List of media objects (images, files) to process, or None if no media is provided | None
**kwargs | Any | Additional keyword arguments for flexible input handling | {}

Returns:

Name | Type | Description
Any | Any | The result of the agent's processing, typically a response or generated content

Raises:

Type | Description
NotImplementedError | If a subclass doesn't implement this method
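The contract described above can be sketched in a few lines. This is a minimal illustration, not the library's actual source: the constructor signature, the stand-in `Platform` and `Media` classes, and the `EchoAgent` subclass are all assumptions made so the example runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class Media:
    """Hypothetical stand-in for the library's Media type."""


class Platform:
    """Hypothetical stand-in for the library's Platform type."""


class Agent(ABC):
    """Sketch of the documented base class; the constructor is an assumption."""

    def __init__(
        self,
        platform: Platform,
        model: str,
        response_model: Optional[Any] = None,
        system_prompt: Optional[str] = None,
    ) -> None:
        self.platform = platform
        self.model = model
        self.response_model = response_model
        self.system_prompt = system_prompt

    @abstractmethod
    def run(self, prompt: str, media: Optional[List[Media]] = None, **kwargs: Any) -> Any:
        # Only reached if a subclass explicitly calls super().run(...).
        raise NotImplementedError


class EchoAgent(Agent):
    """Hypothetical subclass that returns the prompt tagged with the model name."""

    def run(self, prompt: str, media: Optional[List[Media]] = None, **kwargs: Any) -> Any:
        return f"[{self.model}] {prompt}"


agent = EchoAgent(platform=Platform(), model="gpt-4")
result = agent.run("hello")

# A class with an unimplemented abstract method cannot be instantiated.
try:
    Agent(platform=Platform(), model="gpt-4")
    instantiation_error = None
except TypeError as exc:
    instantiation_error = exc
```

Note that Python's `abc` machinery rejects instantiation of a class that leaves `run` unimplemented with a `TypeError`. In a sketch like this one, `NotImplementedError` would only surface if a subclass delegated to `super().run(...)`.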

InstructAgent

Bases: Agent

A specialized agent that executes text-to-text transformations using a configured platform.

This class extends the base Agent to provide structured execution of text processing tasks through an underlying platform's text-to-text functionality. It supports optional system prompt configuration and integration of media content during processing.

The class provides a simplified interface for invoking text generation or transformation capabilities by abstracting the underlying platform interaction. It ensures proper merging of system prompts and additional arguments, prioritizing explicitly provided system prompts over those passed through kwargs.

The run method handles the core execution, delegating to the platform's text2text capability while managing required parameters like model selection, input prompt, optional media, and structured response formatting through response_model.

run(prompt, media=None, **kwargs)

Execute a text-to-text transformation using the configured platform.

This method processes a prompt through the platform's text2text capability, optionally incorporating a system prompt if one was provided during initialization.

Parameters:

Name | Type | Description | Default
prompt | str | The input text prompt to process | required
media | Optional[List[Media]] | Optional list of media objects to include in the processing (e.g., images, files) | None
**kwargs | Any | Additional keyword arguments to pass to the platform's text2text method | {}

Returns:

Name | Type | Description
Any | Any | The result from the platform's text2text processing, typically a string or structured data based on response_model parameter

Note

If system_prompt was provided during initialization, it will be merged with any additional kwargs, where the system prompt takes precedence in case of key conflicts.
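The merge rule in the note can be illustrated with a small sketch. Everything here is an assumption made for illustration: `FakePlatform`, the class name `InstructAgentSketch`, and the keyword names passed to `text2text` are hypothetical, and the real implementation may differ.

```python
from typing import Any, Dict, List, Optional


class FakePlatform:
    """Hypothetical platform that simply echoes the arguments it receives."""

    def text2text(self, **kwargs: Any) -> Dict[str, Any]:
        return kwargs


class InstructAgentSketch:
    """Sketch of the documented behavior, not the library's source."""

    def __init__(
        self,
        platform: FakePlatform,
        model: str,
        response_model: Optional[Any] = None,
        system_prompt: Optional[str] = None,
    ) -> None:
        self.platform = platform
        self.model = model
        self.response_model = response_model
        self.system_prompt = system_prompt

    def run(self, prompt: str, media: Optional[List[Any]] = None, **kwargs: Any) -> Any:
        call_kwargs = dict(kwargs)  # copy so the caller's dict is not mutated
        if self.system_prompt is not None:
            # The prompt given at initialization wins over one passed in kwargs.
            call_kwargs["system_prompt"] = self.system_prompt
        return self.platform.text2text(
            model=self.model,
            prompt=prompt,
            media=media,
            response_model=self.response_model,
            **call_kwargs,
        )


configured = InstructAgentSketch(FakePlatform(), "llama3", system_prompt="Be terse.")
merged = configured.run("Summarize this.", system_prompt="ignored", temperature=0.2)

unconfigured = InstructAgentSketch(FakePlatform(), "llama3")
passthrough = unconfigured.run("Summarize this.", system_prompt="from kwargs")
```

In this sketch a `system_prompt` passed through kwargs is only used when the agent was created without one. Other extra arguments, such as the hypothetical `temperature`, are forwarded unchanged.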

ImageCaptionerAgent

Bases: Agent

Agent responsible for generating image captions using a specified platform and model.

This class extends the base Agent to provide specialized functionality for image captioning tasks. It leverages an underlying platform's image-to-text capability to generate descriptive captions for images based on a given prompt. The class abstracts the complexity of directly interfacing with the platform and provides a clean interface for executing captioning operations.

Class Attributes

None

Methods:

Name | Description
run | Executes the image captioning process using the configured platform and model.

run(prompt, media=None, **kwargs)

Execute the image captioning process.

Parameters:

Name | Type | Description | Default
prompt | str | The prompt or instruction for image description generation. | required
media | Optional[List[Media]] | List of media objects to process. Defaults to None. | None
**kwargs | Any | Additional keyword arguments passed to the platform's image2text method. | {}

Returns:

Name | Type | Description
Any | Any | The result from the platform's image-to-text conversion, typically a string or structured data containing the generated caption or description.
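As a rough sketch of the delegation described above, the agent can forward its arguments straight to the platform's image2text method. The `Media` dataclass, `FakePlatform`, the class name `ImageCaptionerSketch`, and the keyword names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Media:
    """Hypothetical stand-in for the library's Media type."""
    path: str


class FakePlatform:
    """Hypothetical platform; a real one would call a vision model."""

    def image2text(
        self,
        model: str,
        prompt: str,
        media: Optional[List[Media]] = None,
        **kwargs: Any,
    ) -> str:
        count = len(media or [])
        return f"{model} captioned {count} image(s) for prompt: {prompt}"


class ImageCaptionerSketch:
    """Sketch of the documented behavior, not the library's source."""

    def __init__(self, platform: FakePlatform, model: str) -> None:
        self.platform = platform
        self.model = model

    def run(self, prompt: str, media: Optional[List[Media]] = None, **kwargs: Any) -> Any:
        # Forward everything to the platform's image2text capability.
        return self.platform.image2text(
            model=self.model, prompt=prompt, media=media, **kwargs
        )


captioner = ImageCaptionerSketch(FakePlatform(), "llama3")
caption = captioner.run("Describe the image.", media=[Media("cat.png")])
```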

ImageGeneratorAgent

Bases: Agent

Generate images based on textual prompts or modify existing images using prompting.

This class extends the base Agent to specialize in image generation tasks. It supports two primary workflows: generating an image from a text prompt alone, or transforming one or more input images based on a combined text prompt and media input. The actual image generation is delegated to the underlying platform implementation through dispatch to either text2image or image2image methods.

When no media is provided, the agent generates a new image from scratch using only the prompt. When one or more media objects are provided, the agent uses them as input to guide the image transformation process, delegating to an image-to-image pipeline instead.
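The dispatch between the two workflows can be sketched as follows. This is an illustration under stated assumptions: `FakePlatform`, `Media`, the class name `ImageGeneratorSketch`, and the keyword names are hypothetical. Treating an empty list the same as None is also an assumption, as is leaving `response_model` unused, because the source does not describe how it is applied.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Media:
    """Hypothetical stand-in for the library's Media type."""
    path: str


class FakePlatform:
    """Hypothetical platform that reports which method was called."""

    def text2image(self, **kwargs: Any) -> Tuple[str, Dict[str, Any]]:
        return ("text2image", kwargs)

    def image2image(self, **kwargs: Any) -> Tuple[str, Dict[str, Any]]:
        return ("image2image", kwargs)


class ImageGeneratorSketch:
    """Sketch of the documented dispatch, not the library's source."""

    def __init__(self, platform: FakePlatform, model: str) -> None:
        self.platform = platform
        self.model = model

    def run(
        self,
        prompt: str,
        media: Optional[List[Media]] = None,
        response_model: Optional[Any] = None,  # accepted but unused in this sketch
        **kwargs: Any,
    ) -> Any:
        if media:
            # One or more input images: transform them under the prompt's guidance.
            return self.platform.image2image(
                model=self.model, prompt=prompt, media=media, **kwargs
            )
        # No media (None or an empty list): generate from the prompt alone.
        return self.platform.text2image(model=self.model, prompt=prompt, **kwargs)


generator = ImageGeneratorSketch(FakePlatform(), "image-model")
from_text = generator.run("a red fox in the snow")
from_image = generator.run("make it night", media=[Media("fox.png")])
from_empty = generator.run("a red fox in the snow", media=[])
```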

run(prompt, media=None, response_model=None, **kwargs)

Execute the image generation process.

Parameters:

Name | Type | Description | Default
prompt | str | The prompt or instruction for image generation. | required
media | Optional[List[Media]] | List of media objects to use as input images. Defaults to None, in which case the image is generated from the prompt alone. | None
**kwargs | Any | Additional keyword arguments passed to the platform's text2image or image2image method. | {}

Returns:

Name | Type | Description
Any | Any | The result from the platform's text2image or image2image method, typically the generated or transformed image.