Agent
Bases: ABC
Abstract base class for creating intelligent agents that interact with platforms.
This class serves as a blueprint for implementing various types of agents (e.g., chatbots, code assistants, content generators) that can process prompts and generate responses using different AI models.
Attributes:
| Name | Type | Description |
|---|---|---|
| `platform` | `Platform` | The platform instance this agent operates on |
| `model` | `str` | The AI model identifier (e.g., 'gpt-4', 'llama3') used for generating responses |
| `response_model` | `Optional[BaseModel]` | Pydantic model defining the structure of expected responses, or None if free-form text responses are acceptable |
| `system_prompt` | `Optional[str]` | System-level instructions that define the agent's behavior and role |
Example

    class ChatAgent(Agent):
        def run(self, prompt: str, media: Optional[List[Media]] = None, **kwargs) -> Any:
            # Implementation here
            pass

run(prompt, media=None, **kwargs)
abstractmethod
Execute the agent with a given prompt and optional media.
This abstract method must be implemented by subclasses to define how the agent processes input and generates responses.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The user's input or instruction for the agent | required |
| `media` | `Optional[List[Media]]` | List of media objects (images, files) to process, or None if no media is provided | `None` |
| `**kwargs` | `Any` | Additional keyword arguments for flexible input handling | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The result of the agent's processing, typically a response or generated content |
Raises:
| Type | Description |
|---|---|
| `NotImplementedError` | If a subclass does not implement this method |
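The contract above can be sketched with Python's standard `abc` module. The `Agent` class below is a simplified stand-in written for illustration, not the library's actual source: its constructor signature is an assumption based on the documented attributes, and `EchoAgent` and `IncompleteAgent` are hypothetical subclasses.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class Agent(ABC):
    """Simplified stand-in for the documented base class (constructor is assumed)."""

    def __init__(
        self,
        platform: Any,
        model: str,
        response_model: Optional[type] = None,
        system_prompt: Optional[str] = None,
    ) -> None:
        self.platform = platform
        self.model = model
        self.response_model = response_model
        self.system_prompt = system_prompt

    @abstractmethod
    def run(self, prompt: str, media: Optional[List[Any]] = None, **kwargs: Any) -> Any:
        # Reached only if a subclass explicitly calls super().run(...).
        raise NotImplementedError


class EchoAgent(Agent):
    """Hypothetical subclass: returns the prompt unchanged."""

    def run(self, prompt: str, media: Optional[List[Any]] = None, **kwargs: Any) -> Any:
        return prompt


class IncompleteAgent(Agent):
    """Hypothetical subclass that omits run()."""


agent = EchoAgent(platform=None, model="gpt-4")
result = agent.run("hello")

# An ABC with an unimplemented abstract method cannot be instantiated.
try:
    IncompleteAgent(platform=None, model="gpt-4")
    instantiation_error = None
except TypeError as exc:
    instantiation_error = exc
```

In this sketch, a subclass that omits `run` fails at instantiation with `TypeError`, which is how `abc` enforces abstract methods. The `NotImplementedError` listed above would surface only if the base implementation is reached, for example through `super().run(...)`.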
InstructAgent
Bases: Agent
A specialized agent that executes text-to-text transformations using a configured platform.
This class extends the base Agent to provide structured execution of text processing tasks through an underlying platform's text-to-text functionality. It supports optional system prompt configuration and integration of media content during processing.
The class provides a simplified interface for invoking text generation or transformation capabilities by abstracting the underlying platform interaction. It ensures proper merging of system prompts and additional arguments, prioritizing explicitly provided system prompts over those passed through kwargs.
The run method handles the core execution, delegating to the platform's text2text capability while managing required parameters like model selection, input prompt, optional media, and structured response formatting through response_model.
run(prompt, media=None, **kwargs)
Execute a text-to-text transformation using the configured platform.
This method processes a prompt through the platform's text2text capability, optionally incorporating a system prompt if one was provided during initialization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The input text prompt to process | required |
| `media` | `Optional[List[Media]]` | Optional list of media objects to include in the processing (e.g., images, files) | `None` |
| `**kwargs` | `Any` | Additional keyword arguments to pass to the platform's text2text method | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The result from the platform's text2text processing, typically a string or structured data based on the response_model parameter |
Note
If system_prompt was provided during initialization, it will be merged with any additional kwargs, where the system prompt takes precedence in case of key conflicts.
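The merge rule in this note can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `FakePlatform` is a hypothetical stand-in that simply echoes its arguments, `InstructAgentSketch` is not the library's class, and the keyword names passed to `text2text` (`model`, `prompt`, `media`, `response_model`, `system_prompt`) are guessed from the description above rather than taken from the real signature.

```python
from typing import Any, Dict, List, Optional


class FakePlatform:
    """Hypothetical platform that returns the arguments it was called with."""

    def text2text(self, **kwargs: Any) -> Dict[str, Any]:
        return kwargs


class InstructAgentSketch:
    """Illustrative stand-in for InstructAgent; not the library's implementation."""

    def __init__(
        self,
        platform: Any,
        model: str,
        response_model: Optional[type] = None,
        system_prompt: Optional[str] = None,
    ) -> None:
        self.platform = platform
        self.model = model
        self.response_model = response_model
        self.system_prompt = system_prompt

    def run(self, prompt: str, media: Optional[List[Any]] = None, **kwargs: Any) -> Any:
        if self.system_prompt is not None:
            # The prompt given at initialization overrides any value in kwargs.
            kwargs = {**kwargs, "system_prompt": self.system_prompt}
        return self.platform.text2text(
            model=self.model,
            prompt=prompt,
            media=media,
            response_model=self.response_model,
            **kwargs,
        )


configured = InstructAgentSketch(FakePlatform(), "gpt-4", system_prompt="Be concise.")
call_with_conflict = configured.run("Summarize this.", system_prompt="Be verbose.")

unconfigured = InstructAgentSketch(FakePlatform(), "gpt-4")
call_passthrough = unconfigured.run("Summarize this.", system_prompt="Be verbose.")
```

Under these assumptions, `call_with_conflict["system_prompt"]` is the value set at initialization, while an agent created without a system prompt forwards whatever the caller supplies.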
ImageCaptionerAgent
Bases: Agent
Agent responsible for generating image captions using a specified platform and model.
This class extends the base Agent to provide specialized functionality for image captioning tasks. It leverages an underlying platform's image-to-text capability to generate descriptive captions for images based on a given prompt. The class abstracts the complexity of directly interfacing with the platform and provides a clean interface for executing captioning operations.
Class Attributes
None
Methods:
| Name | Description |
|---|---|
| `run` | Executes the image captioning process using the configured platform and model. |
run(prompt, media=None, **kwargs)
Execute the image captioning process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt or instruction for image description generation. | required |
| `media` | `Optional[List[Media]]` | List of media objects to process. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments passed to the platform's image2text method. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The result from the platform's image-to-text conversion, typically a string or structured data containing the generated caption or description. |
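As a rough illustration of the delegation described for this agent, the sketch below forwards a prompt and media list to a platform's `image2text` method. The `Media` dataclass, `FakePlatform`, `ImageCaptionerSketch`, the model name, and the keyword names passed to `image2text` are all hypothetical stand-ins; the real types and signature may differ.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Media:
    """Hypothetical stand-in for the library's Media type."""

    path: str


class FakePlatform:
    """Hypothetical platform that fabricates a caption from the media count."""

    def image2text(self, **kwargs: Any) -> str:
        media = kwargs.get("media") or []
        return f"caption for {len(media)} image(s)"


class ImageCaptionerSketch:
    """Illustrative stand-in for ImageCaptionerAgent; not the library's code."""

    def __init__(self, platform: Any, model: str) -> None:
        self.platform = platform
        self.model = model

    def run(self, prompt: str, media: Optional[List[Media]] = None, **kwargs: Any) -> Any:
        # Forward everything to the platform's image-to-text capability.
        return self.platform.image2text(
            model=self.model, prompt=prompt, media=media, **kwargs
        )


captioner = ImageCaptionerSketch(FakePlatform(), model="example-vision-model")
caption = captioner.run("Describe the image.", media=[Media("cat.png")])
```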
ImageGeneratorAgent
Bases: Agent
Generate images based on textual prompts or modify existing images using prompting.
This class extends the base Agent to specialize in image generation tasks. It supports two primary workflows: generating an image from a text prompt alone, or transforming one or more input images based on a combined text prompt and media input. The actual image generation is delegated to the underlying platform implementation through dispatch to either text2image or image2image methods.
When no media is provided, the agent generates a new image from scratch using only the prompt. When one or more media objects are provided, the agent uses them as input to guide the image transformation process, delegating to an image-to-image pipeline instead.
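The dispatch between the two workflows can be sketched as a single branch on whether media was supplied. This is an illustrative sketch only: `FakePlatform` and `generate_image` are hypothetical, the keyword names passed to `text2image` and `image2image` are assumed, and treating an empty list the same as `None` is also an assumption.

```python
from typing import Any, List, Optional, Tuple


class FakePlatform:
    """Hypothetical platform that reports which method handled the call."""

    def text2image(self, **kwargs: Any) -> Tuple[str, dict]:
        return ("text2image", kwargs)

    def image2image(self, **kwargs: Any) -> Tuple[str, dict]:
        return ("image2image", kwargs)


def generate_image(
    platform: Any,
    model: str,
    prompt: str,
    media: Optional[List[Any]] = None,
    **kwargs: Any,
) -> Any:
    """Route to image2image when media is given, otherwise to text2image."""
    if media:
        # One or more input images: transform them according to the prompt.
        return platform.image2image(model=model, prompt=prompt, media=media, **kwargs)
    # No media (None or empty list, by assumption): generate from the prompt alone.
    return platform.text2image(model=model, prompt=prompt, **kwargs)


platform = FakePlatform()
from_text = generate_image(platform, "example-image-model", "A red bicycle")
from_image = generate_image(
    platform, "example-image-model", "Make it blue", media=["bicycle.png"]
)
```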
run(prompt, media=None, response_model=None, **kwargs)
Execute the image generation process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The text prompt describing the image to generate, or the transformation to apply to the input images. | required |
| `media` | `Optional[List[Media]]` | Input images to transform. If None, a new image is generated from the prompt alone. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments passed to the platform's text2image or image2image method. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The result from the platform's text2image or image2image call. |