Skip to content

Platform

Platform

Bases: ABC

Abstract base class representing a platform for multimodal AI operations.

This class defines a unified interface for interacting with various AI platforms that support text-to-text, text-to-image, image-to-text, and image-to-image conversions. It handles model resolution via ModelRegistry and delegates platform-specific logic to abstract methods, enabling consistent usage across different providers while allowing custom implementations for each backend.

The class provides high-level public methods (e.g., text2text, image2image) that resolve models and invoke corresponding protected abstract methods (_text2text, _image2image, etc.). These abstract methods must be implemented by concrete subclasses to support specific AI platforms.

Each public method invokes ModelRegistry.getModelByName() internally to fetch the appropriate model object for the given platform before calling its private counterpart. This ensures centralized model management and abstraction over platform-specific model representations.

Concrete implementations are required to define all abstract methods, ensuring complete coverage of the supported modalities and behaviors.

Concrete subclasses are expected to implement platform-specific behavior while preserving contract guarantees such as returning the correct output types and handling empty media lists (e.g., image2text raises ValueError when media list is empty).

Methods for structured output (text2text with response_model) delegate to _text2data internally, decoupling structured response handling from plain text flow.

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data

CloudflarePlatform

Bases: Platform

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data

GroqPlatform

Bases: Platform

Implementation of the Platform interface for the Groq cloud inference service.

This class provides concrete implementations for text-to-text, structured data extraction from text, and image-to-text inference using Groq's API. It supports both unstructured text generation and structured JSON output with automatic retry on JSON parsing failures.

The class handles API key management, model name resolution via Model objects, and interacts with Groq's chat completions endpoint for text and multimodal inputs. It does not support image generation or image-to- image transformations, which are explicitly marked as unsupported.

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data

HuggingFacePlatform

Bases: Platform

Platform implementation for interacting with Hugging Face's inference APIs.

This class provides functionality to interact with various Hugging Face models for text-to-text, text-to-data (structured output), and text-to-image generation tasks via the Hugging Face Inference API. It handles API key management, error logging, and structured output retries for JSON schema-constrained responses.

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data

DrawThingsPlatform

Bases: Platform

Platform implementation for DrawThings, a text-to-image and image-to-image generation service.

This class extends the generic Platform interface to provide specific functionality for interacting with the DrawThings API (Stable Diffusion WebUI REST API). It supports converting text prompts into images and transforming existing images based on textual descriptions, leveraging machine learning models hosted via the DrawThings endpoint.

The platform handles API communication, parameter configuration, error logging, and media conversion between base64-encoded images and ImageMedia objects.

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data

LMStudioPlatform

Bases: Platform

Interface for interacting with local models via LM Studio's OpenAI-compatible API.

This class extends Platform to support text-to-text, structured text-to-data, and image-to-text operations using LM Studio as the backend. It automatically configures an OpenAI-compatible client pointing to LM Studio's local server endpoint (typically localhost:1234). Image generation and image-to-image transformations are currently unsupported and will raise NotImplementedError.

LM Studio must be running locally with the target model loaded and exposed through its API server before using this platform.

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data

OllamaPlatform

Bases: Platform

Platform implementation for interacting with Ollama models.

This class provides concrete implementations of platform methods for text-to-text, text-to-data (structured output), and image-to-text operations using the Ollama LLM service. It handles client initialization, optional system prompts, structured output formatting with retries on JSON parsing failures, and image encoding for multimodal inputs. Audio and image generation capabilities are not supported and implemented as no-op stubs.

:var host: The Ollama server endpoint (default: "127.0.0.1:11434"). :type host: str

init(self, host: str = "127.0.0.1:11434", **kwargs: Any) -> None

Initialize the Ollama platform client.

:param host: The endpoint URL of the Ollama server, defaults to "127.0.0.1:11434"
:type host: str
:param kwargs: Additional keyword arguments passed to the parent Platform class

_text2text(self, model: Model, prompt: str, media: Optional[List[Media]] = None, response_model: Optional[BaseModel] = None, **kwargs: Any) -> str

Generate text response from a given prompt using Ollama.

:param model: The model configuration to use for inference
:type model: Model
:param prompt: User prompt text
:type prompt: str
:param media: Unused in this implementation; allowed for interface compliance
:type media: Optional[List[Media]]
:param response_model: Unused in this implementation; allowed for interface compliance
:type response_model: Optional[BaseModel]
:param kwargs: Additional parameters such as 'system_prompt', 'temperature'
:type kwargs: Any
:return: The generated text response from the model
:rtype: str

_text2data(self, model: Model, prompt: str, response_model: BaseModel, media: Optional[List[Media]] = None, **kwargs: Any) -> str

Generate structured data (as JSON-compatible dict) from a prompt using Ollama.

Uses model's JSON schema to enforce structured output via the 'format' argument.
Retries up to three times in case of JSON parsing errors.

:param model: The model configuration to use for inference
:type model: Model
:param prompt: User prompt text
:type prompt: str
:param response_model: Pydantic model defining expected JSON structure
:type response_model: BaseModel
:param media: Unused in this implementation; allowed for interface compliance
:type media: Optional[List[Media]]
:param kwargs: Additional parameters such as 'system_prompt', 'temperature'
:type kwargs: Any
:return: A dictionary representation of the structured output
:rtype: str

_image2text(self, model: Model, prompt: str, media: List[ImageMedia], **kwargs: Any) -> str

Generate text description of an input image using Ollama.

Accepts the first image in the media list, encodes it to Base64, and passes
it alongside the prompt for multimodal inference.

:param model: The multimodal-capable model configuration to use
:type model: Model
:param prompt: User instruction about the image
:type prompt: str
:param media: List containing exactly one image to analyze
:type media: List[ImageMedia]
:param kwargs: Additional parameters such as 'temperature'
:type kwargs: Any
:return: Textual response about the image
:rtype: str

_text2image(self, model: str, prompt: str, **kwargs: Any) -> Image.Image

Placeholder for image generation; currently unsupported.

:param model: Model name (unused)
:type model: str
:param prompt: Prompt for image generation (unused)
:type prompt: str
:param kwargs: Additional arguments (unused)
:type kwargs: Any
:return: None; function is not implemented
:rtype: Optional[Image.Image]

_image2image(self, model: str, prompt: str, image: Image.Image, **kwargs: Any) -> Image.Image

Placeholder for image-to-image translation; currently unsupported.

:param model: Model name (unused)
:type model: str
:param prompt: Prompt instruction (unused)
:type prompt: str
:param image: Input image (unused)
:type image: Image.Image
:param kwargs: Additional arguments (unused)
:type kwargs: Any
:return: None; function is not implemented
:rtype: Optional[Image.Image]

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data

TogetherAiPlatform

Bases: Platform

Integration platform for Together AI services using the OpenAI-compatible API.

This class provides a concrete implementation of the Platform interface, allowing interaction with Together AI models for text generation, structured data extraction, and vision-to-text tasks. It leverages the openai Python library to communicate with Together AI's inference endpoints.

Attributes:

Name Type Description
api_key str

The API key used for authenticating requests to Together AI.

models List[Model]

A list containing the supported models for this platform, defaulting to gpt-oss-20b.

Methods:

Name Description
_text2text

Sends a text prompt to a model and returns a string response.

_text2data

Sends a text prompt and returns structured data validated against a Pydantic model with automatic retry logic on JSON failure.

_image2text

Analyzes an image alongside a text prompt (Vision).

_text2image

(Not supported) Placeholder for future implementation.

_image2image

(Not supported) Placeholder for future implementation.

Note

The _text2data method currently references self._host and an LM Studio style URL in its implementation, which may require adjustment to align with Together AI's standard production endpoints.

image2image(model, prompt, media, **kwargs)

Convert image to image (image editing/transformations).

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the transformation

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Transformed image media object

image2text(model, prompt, media, **kwargs)

Convert image to text.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt guiding the image analysis

required
media List[ImageMedia]

List of ImageMedia objects to process

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
str str

Text description or caption of the image(s)

Raises:

Type Description
ValueError

If media list is empty

text2image(model, prompt, **kwargs)

Convert text to image.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
ImageMedia ImageMedia

Generated image media object

text2text(model, prompt, media=None, response_model=None, **kwargs)

Convert text to text with optional structured output.

Parameters:

Name Type Description Default
model str

The model identifier to use

required
prompt str

The input text prompt

required
media Optional[List[Media]]

Optional list of media objects

None
response_model Optional[str]

Optional Pydantic model for structured output

None
**kwargs Any

Additional platform-specific arguments

{}

Returns:

Name Type Description
Any Any

Text response or structured data