`strands.models.ollama`

Ollama model provider.

Docs: https://ollama.com/

`Messages = list[Message]` `module-attribute`

A list of messages representing a conversation.

`StopReason = Literal['content_filtered', 'end_turn', 'guardrail_intervened', 'interrupt', 'max_tokens', 'stop_sequence', 'tool_use']` `module-attribute`

Reason for the model ending its response generation.

- "content_filtered": Content was filtered due to policy violation
- "end_turn": Normal completion of the response
- "guardrail_intervened": Guardrail system intervened
- "interrupt": Agent was interrupted for human input
- "max_tokens": Maximum token limit reached
- "stop_sequence": Stop sequence encountered
- "tool_use": Model requested to use a tool

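A caller will usually branch on the stop reason to decide what to do next. The sketch below shows one way to do that; the `describe_stop` helper and its hint strings are hypothetical and not part of the SDK.

```python
from typing import Literal, get_args

# Local copy of the alias documented above, so the sketch runs without the SDK.
StopReason = Literal[
    "content_filtered",
    "end_turn",
    "guardrail_intervened",
    "interrupt",
    "max_tokens",
    "stop_sequence",
    "tool_use",
]


def describe_stop(reason: StopReason) -> str:
    """Return a short next-step hint for a stop reason (hypothetical helper)."""
    if reason == "tool_use":
        return "run the requested tool and send back a toolResult"
    if reason == "max_tokens":
        return "response was truncated; consider raising max_tokens"
    if reason in ("content_filtered", "guardrail_intervened"):
        return "response was blocked; surface a policy message"
    if reason == "interrupt":
        return "wait for human input"
    # "end_turn" and "stop_sequence" both mean generation finished normally.
    return "response is complete"


for reason in get_args(StopReason):
    print(f"{reason}: {describe_stop(reason)}")
```
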
`T = TypeVar('T', bound=BaseModel)` `module-attribute`

`ToolChoice = ToolChoiceAutoDict | ToolChoiceAnyDict | ToolChoiceToolDict` `module-attribute`

Configuration for how the model should choose tools.

- "auto": The model decides whether to use tools based on the context
- "any": The model must use at least one tool (any tool)
- "tool": The model must use the specified tool

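The three modes can be pictured as small dictionaries. This page does not show the exact shape of `ToolChoiceAutoDict`, `ToolChoiceAnyDict`, or `ToolChoiceToolDict`, so the sketch below assumes each is a single-key dict keyed by its mode, with the "tool" variant carrying a `name`. The tool name `get_weather` and the `choice_mode` helper are hypothetical.

```python
from typing import Any

# Assumed shapes (not confirmed by this page): single-key dicts keyed by mode.
auto_choice: dict[str, Any] = {"auto": {}}
any_choice: dict[str, Any] = {"any": {}}
tool_choice: dict[str, Any] = {"tool": {"name": "get_weather"}}  # hypothetical tool name


def choice_mode(choice: dict[str, Any]) -> str:
    """Return which of the three documented modes a tool-choice dict selects."""
    for mode in ("auto", "any", "tool"):
        if mode in choice:
            return mode
    raise ValueError(f"unrecognized tool choice: {choice!r}")


for choice in (auto_choice, any_choice, tool_choice):
    print(choice_mode(choice), choice)
```

Note that, per the `OllamaModel.stream` docstring on this page, the Ollama provider accepts `tool_choice` for interface consistency but currently ignores it.
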
`logger = logging.getLogger(__name__)` `module-attribute`

`ContentBlock`

Bases: TypedDict

A block of content for a message that you pass to, or receive from, a model.

Attributes:

Name	Type	Description
`cachePoint`	`CachePoint`	A cache point configuration to optimize conversation history.
`document`	`DocumentContent`	A document to include in the message.
`guardContent`	`GuardContent`	Contains the content to assess with the guardrail.
`image`	`ImageContent`	Image to include in the message.
`reasoningContent`	`ReasoningContentBlock`	Contains content regarding the reasoning that is carried out by the model.
`text`	`str`	Text to include in the message.
`toolResult`	`ToolResult`	The result for a tool request that a model makes.
`toolUse`	`ToolUse`	Information about a tool use request from a model.
`video`	`VideoContent`	Video to include in the message.
`citationsContent`	`CitationsContentBlock`	Contains the citations for a document.
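Because `ContentBlock` is a `TypedDict` declared with `total=False`, a block is a plain dict that normally carries just one of the keys above. The sketch below builds a short conversation from such dicts. The message layout (`role` plus a `content` list) and the `toolUse` / `toolResult` fields follow the provider code shown later on this page; the `toolUseId` inside the tool result, the calculator tool, and the `block_kinds` helper are assumptions for illustration.

```python
from typing import Any

user_message: dict[str, Any] = {
    "role": "user",
    "content": [{"text": "What is 2 + 2?"}],
}

assistant_message: dict[str, Any] = {
    "role": "assistant",
    "content": [
        {"text": "Let me calculate that."},
        # A tool request; "calculator" is a hypothetical tool.
        {"toolUse": {"toolUseId": "calculator", "name": "calculator", "input": {"expression": "2 + 2"}}},
    ],
}

tool_result_message: dict[str, Any] = {
    "role": "user",
    # A tool result carries its own list of content items (here a JSON item).
    "content": [{"toolResult": {"toolUseId": "calculator", "content": [{"json": {"result": 4}}]}}],
}


def block_kinds(message: dict[str, Any]) -> list[str]:
    """Return the first key of every content block in a message (hypothetical helper)."""
    return [next(iter(block)) for block in message["content"]]


for message in (user_message, assistant_message, tool_result_message):
    print(message["role"], block_kinds(message))
```
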

Source code in strands/types/content.py

class ContentBlock(TypedDict, total=False):
    """A block of content for a message that you pass to, or receive from, a model.

    Attributes:
        cachePoint: A cache point configuration to optimize conversation history.
        document: A document to include in the message.
        guardContent: Contains the content to assess with the guardrail.
        image: Image to include in the message.
        reasoningContent: Contains content regarding the reasoning that is carried out by the model.
        text: Text to include in the message.
        toolResult: The result for a tool request that a model makes.
        toolUse: Information about a tool use request from a model.
        video: Video to include in the message.
        citationsContent: Contains the citations for a document.
    """

    cachePoint: CachePoint
    document: DocumentContent
    guardContent: GuardContent
    image: ImageContent
    reasoningContent: ReasoningContentBlock
    text: str
    toolResult: ToolResult
    toolUse: ToolUse
    video: VideoContent
    citationsContent: CitationsContentBlock

`Model`

Bases: ABC

Abstract base class for Agent model providers.

This class defines the interface for all model implementations in the Strands Agents SDK. It provides a standardized way to configure and process requests for different AI model providers.

Source code in strands/models/model.py

class Model(abc.ABC):
    """Abstract base class for Agent model providers.

    This class defines the interface for all model implementations in the Strands Agents SDK. It provides a
    standardized way to configure and process requests for different AI model providers.
    """

    @abc.abstractmethod
    # pragma: no cover
    def update_config(self, **model_config: Any) -> None:
        """Update the model configuration with the provided arguments.

        Args:
            **model_config: Configuration overrides.
        """
        pass

    @abc.abstractmethod
    # pragma: no cover
    def get_config(self) -> Any:
        """Return the model configuration.

        Returns:
            The model's configuration.
        """
        pass

    @abc.abstractmethod
    # pragma: no cover
    def structured_output(
        self, output_model: type[T], prompt: Messages, system_prompt: str | None = None, **kwargs: Any
    ) -> AsyncGenerator[dict[str, T | Any], None]:
        """Get structured output from the model.

        Args:
            output_model: The output model to use for the agent.
            prompt: The prompt messages to use for the agent.
            system_prompt: System prompt to provide context to the model.
            **kwargs: Additional keyword arguments for future extensibility.

        Yields:
            Model events with the last being the structured output.

        Raises:
            ValidationException: The response format from the model does not match the output_model
        """
        pass

    @abc.abstractmethod
    # pragma: no cover
    def stream(
        self,
        messages: Messages,
        tool_specs: list[ToolSpec] | None = None,
        system_prompt: str | None = None,
        *,
        tool_choice: ToolChoice | None = None,
        system_prompt_content: list[SystemContentBlock] | None = None,
        invocation_state: dict[str, Any] | None = None,
        **kwargs: Any,
    ) -> AsyncIterable[StreamEvent]:
        """Stream conversation with the model.

        This method handles the full lifecycle of conversing with the model:

        1. Format the messages, tool specs, and configuration into a streaming request
        2. Send the request to the model
        3. Yield the formatted message chunks

        Args:
            messages: List of message objects to be processed by the model.
            tool_specs: List of tool specifications to make available to the model.
            system_prompt: System prompt to provide context to the model.
            tool_choice: Selection strategy for tool invocation.
            system_prompt_content: System prompt content blocks for advanced features like caching.
            invocation_state: Caller-provided state/context that was passed to the agent when it was invoked.
            **kwargs: Additional keyword arguments for future extensibility.

        Yields:
            Formatted message chunks from the model.

        Raises:
            ModelThrottledException: When the model service is throttling requests from the client.
        """
        pass

`get_config()` `abstractmethod`

Return the model configuration.

Returns:

Type	Description
`Any`	The model's configuration.

Source code in strands/models/model.py

@abc.abstractmethod
# pragma: no cover
def get_config(self) -> Any:
    """Return the model configuration.

    Returns:
        The model's configuration.
    """
    pass

`stream(messages, tool_specs=None, system_prompt=None, *, tool_choice=None, system_prompt_content=None, invocation_state=None, **kwargs)` `abstractmethod`

Stream conversation with the model.

This method handles the full lifecycle of conversing with the model:

1. Format the messages, tool specs, and configuration into a streaming request
2. Send the request to the model
3. Yield the formatted message chunks

Parameters:

Name	Type	Description	Default
`messages`	`Messages`	List of message objects to be processed by the model.	required
`tool_specs`	`list[ToolSpec] \| None`	List of tool specifications to make available to the model.	`None`
`system_prompt`	`str \| None`	System prompt to provide context to the model.	`None`
`tool_choice`	`ToolChoice \| None`	Selection strategy for tool invocation.	`None`
`system_prompt_content`	`list[SystemContentBlock] \| None`	System prompt content blocks for advanced features like caching.	`None`
`invocation_state`	`dict[str, Any] \| None`	Caller-provided state/context that was passed to the agent when it was invoked.	`None`
`**kwargs`	`Any`	Additional keyword arguments for future extensibility.	`{}`

Yields:

Type	Description
`AsyncIterable[StreamEvent]`	Formatted message chunks from the model.

Raises:

Type	Description
`ModelThrottledException`	When the model service is throttling requests from the client.
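Consuming a stream amounts to iterating the events and picking out the keys of interest. The sketch below does this against a stand-in async generator so that it runs without a model server. The event shapes mirror those produced by `OllamaModel.format_chunk` on this page; `fake_stream` and `collect` are hypothetical names.

```python
from __future__ import annotations

import asyncio
from typing import Any, AsyncIterator


async def fake_stream() -> AsyncIterator[dict[str, Any]]:
    """Stand-in for `model.stream(...)`: yields events in the documented order."""
    yield {"messageStart": {"role": "assistant"}}
    yield {"contentBlockStart": {"start": {}}}
    yield {"contentBlockDelta": {"delta": {"text": "Hello, "}}}
    yield {"contentBlockDelta": {"delta": {"text": "world"}}}
    yield {"contentBlockStop": {}}
    yield {"messageStop": {"stopReason": "end_turn"}}


async def collect(events: AsyncIterator[dict[str, Any]]) -> tuple[str, str | None]:
    """Concatenate text deltas and capture the stop reason, if one arrives."""
    text_parts: list[str] = []
    stop_reason: str | None = None
    async for event in events:
        if "contentBlockDelta" in event:
            # Tool-use deltas have no "text" key, so default to an empty string.
            text_parts.append(event["contentBlockDelta"]["delta"].get("text", ""))
        elif "messageStop" in event:
            stop_reason = event["messageStop"]["stopReason"]
    return "".join(text_parts), stop_reason


text, stop_reason = asyncio.run(collect(fake_stream()))
print(text, stop_reason)
```

With a real provider, `fake_stream()` would be replaced by the async iterable returned from `stream(...)`.
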

Source code in strands/models/model.py

@abc.abstractmethod
# pragma: no cover
def stream(
    self,
    messages: Messages,
    tool_specs: list[ToolSpec] | None = None,
    system_prompt: str | None = None,
    *,
    tool_choice: ToolChoice | None = None,
    system_prompt_content: list[SystemContentBlock] | None = None,
    invocation_state: dict[str, Any] | None = None,
    **kwargs: Any,
) -> AsyncIterable[StreamEvent]:
    """Stream conversation with the model.

    This method handles the full lifecycle of conversing with the model:

    1. Format the messages, tool specs, and configuration into a streaming request
    2. Send the request to the model
    3. Yield the formatted message chunks

    Args:
        messages: List of message objects to be processed by the model.
        tool_specs: List of tool specifications to make available to the model.
        system_prompt: System prompt to provide context to the model.
        tool_choice: Selection strategy for tool invocation.
        system_prompt_content: System prompt content blocks for advanced features like caching.
        invocation_state: Caller-provided state/context that was passed to the agent when it was invoked.
        **kwargs: Additional keyword arguments for future extensibility.

    Yields:
        Formatted message chunks from the model.

    Raises:
        ModelThrottledException: When the model service is throttling requests from the client.
    """
    pass

`structured_output(output_model, prompt, system_prompt=None, **kwargs)` `abstractmethod`

Get structured output from the model.

Parameters:

Name	Type	Description	Default
`output_model`	`type[T]`	The output model to use for the agent.	required
`prompt`	`Messages`	The prompt messages to use for the agent.	required
`system_prompt`	`str \| None`	System prompt to provide context to the model.	`None`
`**kwargs`	`Any`	Additional keyword arguments for future extensibility.	`{}`

Yields:

Type	Description
`AsyncGenerator[dict[str, T \| Any], None]`	Model events with the last being the structured output.

Raises:

Type	Description
`ValidationException`	The response format from the model does not match the `output_model`.

Source code in strands/models/model.py

@abc.abstractmethod
# pragma: no cover
def structured_output(
    self, output_model: type[T], prompt: Messages, system_prompt: str | None = None, **kwargs: Any
) -> AsyncGenerator[dict[str, T | Any], None]:
    """Get structured output from the model.

    Args:
        output_model: The output model to use for the agent.
        prompt: The prompt messages to use for the agent.
        system_prompt: System prompt to provide context to the model.
        **kwargs: Additional keyword arguments for future extensibility.

    Yields:
        Model events with the last being the structured output.

    Raises:
        ValidationException: The response format from the model does not match the output_model
    """
    pass

`update_config(**model_config)` `abstractmethod`

Update the model configuration with the provided arguments.

Parameters:

Name	Type	Description	Default
`**model_config`	`Any`	Configuration overrides.	`{}`

Source code in strands/models/model.py

@abc.abstractmethod
# pragma: no cover
def update_config(self, **model_config: Any) -> None:
    """Update the model configuration with the provided arguments.

    Args:
        **model_config: Configuration overrides.
    """
    pass

`OllamaModel`

Bases: Model

Ollama model provider implementation.

The implementation handles Ollama-specific features such as:

- Local model invocation
- Streaming responses
- Tool/function calling

Source code in strands/models/ollama.py

class OllamaModel(Model):
    """Ollama model provider implementation.

    The implementation handles Ollama-specific features such as:

    - Local model invocation
    - Streaming responses
    - Tool/function calling
    """

    class OllamaConfig(TypedDict, total=False):
        """Configuration parameters for Ollama models.

        Attributes:
            additional_args: Any additional arguments to include in the request.
            keep_alive: Controls how long the model will stay loaded into memory following the request (default: "5m").
            max_tokens: Maximum number of tokens to generate in the response.
            model_id: Ollama model ID (e.g., "llama3", "mistral", "phi3").
            options: Additional model parameters (e.g., top_k).
            stop_sequences: List of sequences that will stop generation when encountered.
            temperature: Controls randomness in generation (higher = more random).
            top_p: Controls diversity via nucleus sampling (alternative to temperature).
        """

        additional_args: dict[str, Any] | None
        keep_alive: str | None
        max_tokens: int | None
        model_id: str
        options: dict[str, Any] | None
        stop_sequences: list[str] | None
        temperature: float | None
        top_p: float | None

    def __init__(
        self,
        host: str | None,
        *,
        ollama_client_args: dict[str, Any] | None = None,
        **model_config: Unpack[OllamaConfig],
    ) -> None:
        """Initialize provider instance.

        Args:
            host: The address of the Ollama server hosting the model.
            ollama_client_args: Additional arguments for the Ollama client.
            **model_config: Configuration options for the Ollama model.
        """
        self.host = host
        self.client_args = ollama_client_args or {}
        validate_config_keys(model_config, self.OllamaConfig)
        self.config = OllamaModel.OllamaConfig(**model_config)

        logger.debug("config=<%s> | initializing", self.config)

    @override
    def update_config(self, **model_config: Unpack[OllamaConfig]) -> None:  # type: ignore
        """Update the Ollama Model configuration with the provided arguments.

        Args:
            **model_config: Configuration overrides.
        """
        validate_config_keys(model_config, self.OllamaConfig)
        self.config.update(model_config)

    @override
    def get_config(self) -> OllamaConfig:
        """Get the Ollama model configuration.

        Returns:
            The Ollama model configuration.
        """
        return self.config

    def _format_request_message_contents(self, role: str, content: ContentBlock) -> list[dict[str, Any]]:
        """Format Ollama compatible message contents.

        Ollama doesn't support an array of contents, so we must flatten everything into separate message blocks.

        Args:
            role: E.g., user.
            content: Content block to format.

        Returns:
            Ollama formatted message contents.

        Raises:
            TypeError: If the content block type cannot be converted to an Ollama-compatible format.
        """
        if "text" in content:
            return [{"role": role, "content": content["text"]}]

        if "image" in content:
            return [{"role": role, "images": [content["image"]["source"]["bytes"]]}]

        if "toolUse" in content:
            return [
                {
                    "role": role,
                    "tool_calls": [
                        {
                            "function": {
                                "name": content["toolUse"]["toolUseId"],
                                "arguments": content["toolUse"]["input"],
                            }
                        }
                    ],
                }
            ]

        if "toolResult" in content:
            return [
                formatted_tool_result_content
                for tool_result_content in content["toolResult"]["content"]
                for formatted_tool_result_content in self._format_request_message_contents(
                    "tool",
                    (
                        {"text": json.dumps(tool_result_content["json"])}
                        if "json" in tool_result_content
                        else cast(ContentBlock, tool_result_content)
                    ),
                )
            ]

        raise TypeError(f"content_type=<{next(iter(content))}> | unsupported type")

    def _format_request_messages(self, messages: Messages, system_prompt: str | None = None) -> list[dict[str, Any]]:
        """Format an Ollama compatible messages array.

        Args:
            messages: List of message objects to be processed by the model.
            system_prompt: System prompt to provide context to the model.

        Returns:
            An Ollama compatible messages array.
        """
        system_message = [{"role": "system", "content": system_prompt}] if system_prompt else []

        formatted_messages = []
        for message in messages:
            for content in message["content"]:
                # Check for location sources and skip with warning
                if _has_location_source(content):
                    logger.warning("Location sources are not supported by Ollama | skipping content block")
                    continue
                formatted_messages.extend(self._format_request_message_contents(message["role"], content))

        return system_message + formatted_messages

    def format_request(
        self, messages: Messages, tool_specs: list[ToolSpec] | None = None, system_prompt: str | None = None
    ) -> dict[str, Any]:
        """Format an Ollama chat streaming request.

        Args:
            messages: List of message objects to be processed by the model.
            tool_specs: List of tool specifications to make available to the model.
            system_prompt: System prompt to provide context to the model.

        Returns:
            An Ollama chat streaming request.

        Raises:
            TypeError: If a message contains a content block type that cannot be converted to an Ollama-compatible
                format.
        """
        return {
            "messages": self._format_request_messages(messages, system_prompt),
            "model": self.config["model_id"],
            "options": {
                **(self.config.get("options") or {}),
                **{
                    key: value
                    for key, value in [
                        ("num_predict", self.config.get("max_tokens")),
                        ("temperature", self.config.get("temperature")),
                        ("top_p", self.config.get("top_p")),
                        ("stop", self.config.get("stop_sequences")),
                    ]
                    if value is not None
                },
            },
            "stream": True,
            "tools": [
                {
                    "type": "function",
                    "function": {
                        "name": tool_spec["name"],
                        "description": tool_spec["description"],
                        "parameters": tool_spec["inputSchema"]["json"],
                    },
                }
                for tool_spec in tool_specs or []
            ],
            **({"keep_alive": self.config["keep_alive"]} if self.config.get("keep_alive") else {}),
            **(
                self.config["additional_args"]
                if "additional_args" in self.config and self.config["additional_args"] is not None
                else {}
            ),
        }

    def format_chunk(self, event: dict[str, Any]) -> StreamEvent:
        """Format the Ollama response events into standardized message chunks.

        Args:
            event: A response event from the Ollama model.

        Returns:
            The formatted chunk.

        Raises:
            RuntimeError: If chunk_type is not recognized.
                This error should never be encountered as we control chunk_type in the stream method.
        """
        match event["chunk_type"]:
            case "message_start":
                return {"messageStart": {"role": "assistant"}}

            case "content_start":
                if event["data_type"] == "text":
                    return {"contentBlockStart": {"start": {}}}

                tool_name = event["data"].function.name
                return {"contentBlockStart": {"start": {"toolUse": {"name": tool_name, "toolUseId": tool_name}}}}

            case "content_delta":
                if event["data_type"] == "text":
                    return {"contentBlockDelta": {"delta": {"text": event["data"]}}}

                tool_arguments = event["data"].function.arguments
                return {"contentBlockDelta": {"delta": {"toolUse": {"input": json.dumps(tool_arguments)}}}}

            case "content_stop":
                return {"contentBlockStop": {}}

            case "message_stop":
                reason: StopReason
                if event["data"] == "tool_use":
                    reason = "tool_use"
                elif event["data"] == "length":
                    reason = "max_tokens"
                else:
                    reason = "end_turn"

                return {"messageStop": {"stopReason": reason}}

            case "metadata":
                return {
                    "metadata": {
                        "usage": {
                            # Ollama reports prompt tokens as prompt_eval_count and generated tokens as eval_count.
                            "inputTokens": event["data"].prompt_eval_count,
                            "outputTokens": event["data"].eval_count,
                            "totalTokens": event["data"].eval_count + event["data"].prompt_eval_count,
                        },
                        "metrics": {
                            "latencyMs": event["data"].total_duration / 1e6,
                        },
                    },
                }

            case _:
                raise RuntimeError(f"chunk_type=<{event['chunk_type']}> | unknown type")

    @override
    async def stream(
        self,
        messages: Messages,
        tool_specs: list[ToolSpec] | None = None,
        system_prompt: str | None = None,
        *,
        tool_choice: ToolChoice | None = None,
        **kwargs: Any,
    ) -> AsyncGenerator[StreamEvent, None]:
        """Stream conversation with the Ollama model.

        Args:
            messages: List of message objects to be processed by the model.
            tool_specs: List of tool specifications to make available to the model.
            system_prompt: System prompt to provide context to the model.
            tool_choice: Selection strategy for tool invocation. **Note: This parameter is accepted for
                interface consistency but is currently ignored for this model provider.**
            **kwargs: Additional keyword arguments for future extensibility.

        Yields:
            Formatted message chunks from the model.
        """
        warn_on_tool_choice_not_supported(tool_choice)

        logger.debug("formatting request")
        request = self.format_request(messages, tool_specs, system_prompt)
        logger.debug("request=<%s>", request)

        logger.debug("invoking model")
        tool_requested = False

        client = ollama.AsyncClient(self.host, **self.client_args)
        response = await client.chat(**request)

        logger.debug("got response from model")
        yield self.format_chunk({"chunk_type": "message_start"})
        yield self.format_chunk({"chunk_type": "content_start", "data_type": "text"})

        async for event in response:
            for tool_call in event.message.tool_calls or []:
                yield self.format_chunk({"chunk_type": "content_start", "data_type": "tool", "data": tool_call})
                yield self.format_chunk({"chunk_type": "content_delta", "data_type": "tool", "data": tool_call})
                yield self.format_chunk({"chunk_type": "content_stop", "data_type": "tool", "data": tool_call})
                tool_requested = True

            yield self.format_chunk({"chunk_type": "content_delta", "data_type": "text", "data": event.message.content})

        yield self.format_chunk({"chunk_type": "content_stop", "data_type": "text"})
        yield self.format_chunk(
            {"chunk_type": "message_stop", "data": "tool_use" if tool_requested else event.done_reason}
        )
        yield self.format_chunk({"chunk_type": "metadata", "data": event})

        logger.debug("finished streaming response from model")

    @override
    async def structured_output(
        self, output_model: type[T], prompt: Messages, system_prompt: str | None = None, **kwargs: Any
    ) -> AsyncGenerator[dict[str, T | Any], None]:
        """Get structured output from the model.

        Args:
            output_model: The output model to use for the agent.
            prompt: The prompt messages to use for the agent.
            system_prompt: System prompt to provide context to the model.
            **kwargs: Additional keyword arguments for future extensibility.

        Yields:
            Model events with the last being the structured output.
        """
        formatted_request = self.format_request(messages=prompt, system_prompt=system_prompt)
        formatted_request["format"] = output_model.model_json_schema()
        formatted_request["stream"] = False

        client = ollama.AsyncClient(self.host, **self.client_args)
        response = await client.chat(**formatted_request)

        try:
            content = response.message.content.strip()
            yield {"output": output_model.model_validate_json(content)}
        except Exception as e:
            raise ValueError(f"Failed to parse or load content into model: {e}") from e

`OllamaConfig`

Bases: TypedDict

Configuration parameters for Ollama models.

Attributes:

Name	Type	Description
`additional_args`	`dict[str, Any] \| None`	Any additional arguments to include in the request.
`keep_alive`	`str \| None`	Controls how long the model will stay loaded into memory following the request (default: "5m").
`max_tokens`	`int \| None`	Maximum number of tokens to generate in the response.
`model_id`	`str`	Ollama model ID (e.g., "llama3", "mistral", "phi3").
`options`	`dict[str, Any] \| None`	Additional model parameters (e.g., top_k).
`stop_sequences`	`list[str] \| None`	List of sequences that will stop generation when encountered.
`temperature`	`float \| None`	Controls randomness in generation (higher = more random).
`top_p`	`float \| None`	Controls diversity via nucleus sampling (alternative to temperature).
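Several of these settings end up in the `options` field of the Ollama request. The sketch below reproduces, as a standalone function, the merge performed by `format_request` in the source on this page: entries from `options` are applied first, then the dedicated keys override them under Ollama's names (`num_predict`, `temperature`, `top_p`, `stop`). The function name `build_options` and the sample values are hypothetical.

```python
from typing import Any


def build_options(config: dict[str, Any]) -> dict[str, Any]:
    """Merge free-form `options` with the dedicated config keys, skipping unset values."""
    return {
        **(config.get("options") or {}),
        **{
            key: value
            for key, value in [
                ("num_predict", config.get("max_tokens")),
                ("temperature", config.get("temperature")),
                ("top_p", config.get("top_p")),
                ("stop", config.get("stop_sequences")),
            ]
            if value is not None
        },
    }


config = {
    "model_id": "llama3",
    "max_tokens": 256,
    "temperature": 0.2,
    "options": {"top_k": 40, "temperature": 0.9},
}
options = build_options(config)
print(options)  # {'top_k': 40, 'temperature': 0.2, 'num_predict': 256}
```

The dedicated `temperature` key wins over the value inside `options` because it is merged last.
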

Source code in strands/models/ollama.py

class OllamaConfig(TypedDict, total=False):
    """Configuration parameters for Ollama models.

    Attributes:
        additional_args: Any additional arguments to include in the request.
        keep_alive: Controls how long the model will stay loaded into memory following the request (default: "5m").
        max_tokens: Maximum number of tokens to generate in the response.
        model_id: Ollama model ID (e.g., "llama3", "mistral", "phi3").
        options: Additional model parameters (e.g., top_k).
        stop_sequences: List of sequences that will stop generation when encountered.
        temperature: Controls randomness in generation (higher = more random).
        top_p: Controls diversity via nucleus sampling (alternative to temperature).
    """

    additional_args: dict[str, Any] | None
    keep_alive: str | None
    max_tokens: int | None
    model_id: str
    options: dict[str, Any] | None
    stop_sequences: list[str] | None
    temperature: float | None
    top_p: float | None

`__init__(host, *, ollama_client_args=None, **model_config)`

Initialize provider instance.

Parameters:

Name	Type	Description	Default
`host`	`str \| None`	The address of the Ollama server hosting the model.	required
`ollama_client_args`	`dict[str, Any] \| None`	Additional arguments for the Ollama client.	`None`
`**model_config`	`Unpack[OllamaConfig]`	Configuration options for the Ollama model.	`{}`

Source code in strands/models/ollama.py

def __init__(
    self,
    host: str | None,
    *,
    ollama_client_args: dict[str, Any] | None = None,
    **model_config: Unpack[OllamaConfig],
) -> None:
    """Initialize provider instance.

    Args:
        host: The address of the Ollama server hosting the model.
        ollama_client_args: Additional arguments for the Ollama client.
        **model_config: Configuration options for the Ollama model.
    """
    self.host = host
    self.client_args = ollama_client_args or {}
    validate_config_keys(model_config, self.OllamaConfig)
    self.config = OllamaModel.OllamaConfig(**model_config)

    logger.debug("config=<%s> | initializing", self.config)

`format_chunk(event)`

Format the Ollama response events into standardized message chunks.

Parameters:

Name	Type	Description	Default
`event`	`dict[str, Any]`	A response event from the Ollama model.	required

Returns:

Type	Description
`StreamEvent`	The formatted chunk.

Raises:

Type	Description
`RuntimeError`	If chunk_type is not recognized. This error should never be encountered as we control chunk_type in the stream method.
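Two of the conversions in `format_chunk` are easy to check in isolation: the mapping from Ollama's `done_reason` to a `StopReason`, and the latency conversion (Ollama reports `total_duration` in nanoseconds). The sketch below restates both as standalone functions; the names `map_stop_reason` and `ns_to_ms` are hypothetical.

```python
from typing import Optional


def map_stop_reason(done_reason: Optional[str], tool_requested: bool) -> str:
    """Mirror the "message_stop" branch: tool calls win, "length" means truncation."""
    if tool_requested:
        return "tool_use"
    if done_reason == "length":
        return "max_tokens"
    return "end_turn"


def ns_to_ms(total_duration_ns: int) -> float:
    """Convert a nanosecond duration to milliseconds, as done for `latencyMs`."""
    return total_duration_ns / 1e6


print(map_stop_reason("stop", tool_requested=False))   # end_turn
print(map_stop_reason("length", tool_requested=False))  # max_tokens
print(map_stop_reason("stop", tool_requested=True))     # tool_use
print(ns_to_ms(1_500_000_000))                           # 1500.0
```

In the provider itself, `stream` passes "tool_use" to the "message_stop" chunk whenever a tool call was seen, and the final event's `done_reason` otherwise.
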

Source code in strands/models/ollama.py

def format_chunk(self, event: dict[str, Any]) -> StreamEvent:
    """Format the Ollama response events into standardized message chunks.

    Args:
        event: A response event from the Ollama model.

    Returns:
        The formatted chunk.

    Raises:
        RuntimeError: If chunk_type is not recognized.
            This error should never be encountered as we control chunk_type in the stream method.
    """
    match event["chunk_type"]:
        case "message_start":
            return {"messageStart": {"role": "assistant"}}

        case "content_start":
            if event["data_type"] == "text":
                return {"contentBlockStart": {"start": {}}}

            tool_name = event["data"].function.name
            return {"contentBlockStart": {"start": {"toolUse": {"name": tool_name, "toolUseId": tool_name}}}}

        case "content_delta":
            if event["data_type"] == "text":
                return {"contentBlockDelta": {"delta": {"text": event["data"]}}}

            tool_arguments = event["data"].function.arguments
            return {"contentBlockDelta": {"delta": {"toolUse": {"input": json.dumps(tool_arguments)}}}}

        case "content_stop":
            return {"contentBlockStop": {}}

        case "message_stop":
            reason: StopReason
            if event["data"] == "tool_use":
                reason = "tool_use"
            elif event["data"] == "length":
                reason = "max_tokens"
            else:
                reason = "end_turn"

            return {"messageStop": {"stopReason": reason}}

        case "metadata":
            return {
                "metadata": {
                    "usage": {
                        # Ollama reports prompt tokens as prompt_eval_count and generated tokens as eval_count.
                        "inputTokens": event["data"].prompt_eval_count,
                        "outputTokens": event["data"].eval_count,
                        "totalTokens": event["data"].eval_count + event["data"].prompt_eval_count,
                    },
                    "metrics": {
                        "latencyMs": event["data"].total_duration / 1e6,
                    },
                },
            }

        case _:
            raise RuntimeError(f"chunk_type=<{event['chunk_type']}> | unknown type")

`format_request(messages, tool_specs=None, system_prompt=None)`

Format an Ollama chat streaming request.

Parameters:

Name	Type	Description	Default
`messages`	`Messages`	List of message objects to be processed by the model.	required
`tool_specs`	`list[ToolSpec] \| None`	List of tool specifications to make available to the model.	`None`
`system_prompt`	`str \| None`	System prompt to provide context to the model.	`None`

Returns:

Type	Description
`dict[str, Any]`	An Ollama chat streaming request.

Raises:

Type	Description
`TypeError`	If a message contains a content block type that cannot be converted to an Ollama-compatible format.

Source code in strands/models/ollama.py

def format_request(
    self, messages: Messages, tool_specs: list[ToolSpec] | None = None, system_prompt: str | None = None
) -> dict[str, Any]:
    """Format an Ollama chat streaming request.

    Args:
        messages: List of message objects to be processed by the model.
        tool_specs: List of tool specifications to make available to the model.
        system_prompt: System prompt to provide context to the model.

    Returns:
        An Ollama chat streaming request.

    Raises:
        TypeError: If a message contains a content block type that cannot be converted to an Ollama-compatible
            format.
    """
    return {
        "messages": self._format_request_messages(messages, system_prompt),
        "model": self.config["model_id"],
        "options": {
            **(self.config.get("options") or {}),
            **{
                key: value
                for key, value in [
                    ("num_predict", self.config.get("max_tokens")),
                    ("temperature", self.config.get("temperature")),
                    ("top_p", self.config.get("top_p")),
                    ("stop", self.config.get("stop_sequences")),
                ]
                if value is not None
            },
        },
        "stream": True,
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": tool_spec["name"],
                    "description": tool_spec["description"],
                    "parameters": tool_spec["inputSchema"]["json"],
                },
            }
            for tool_spec in tool_specs or []
        ],
        **({"keep_alive": self.config["keep_alive"]} if self.config.get("keep_alive") else {}),
        **(
            self.config["additional_args"]
            if "additional_args" in self.config and self.config["additional_args"] is not None
            else {}
        ),
    }
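
The precedence rules for the `"options"` entry are easy to miss when reading the dictionary literal. The sketch below isolates that merge in a hypothetical `build_options` function (not part of the library) that works on a plain dict. The model id `"llama3"` and the option values are illustrative only.

```python
from typing import Any


def build_options(config: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the "options" entry built by format_request."""
    explicit = {
        "num_predict": config.get("max_tokens"),
        "temperature": config.get("temperature"),
        "top_p": config.get("top_p"),
        "stop": config.get("stop_sequences"),
    }
    return {
        # Free-form options come first...
        **(config.get("options") or {}),
        # ...so dedicated config keys override them; unset (None) keys are dropped.
        **{key: value for key, value in explicit.items() if value is not None},
    }


config = {
    "model_id": "llama3",
    "options": {"temperature": 0.9, "seed": 7},
    "temperature": 0.2,
    "max_tokens": 128,
}
options = build_options(config)
print(options)
```

Here the top-level `temperature` of 0.2 replaces the 0.9 given under `options`, `seed` passes through untouched, and `top_p` and `stop` are omitted because they were never set.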

`get_config()` ¶

Get the Ollama model configuration.

Returns:

Type	Description
`OllamaConfig`	The Ollama model configuration.

Source code in strands/models/ollama.py

@override
def get_config(self) -> OllamaConfig:
    """Get the Ollama model configuration.

    Returns:
        The Ollama model configuration.
    """
    return self.config

`stream(messages, tool_specs=None, system_prompt=None, *, tool_choice=None, **kwargs)` `async` ¶

Stream conversation with the Ollama model.

Parameters:

Name	Type	Description	Default
`messages`	`Messages`	List of message objects to be processed by the model.	required
`tool_specs`	`list[ToolSpec] \| None`	List of tool specifications to make available to the model.	`None`
`system_prompt`	`str \| None`	System prompt to provide context to the model.	`None`
`tool_choice`	`ToolChoice \| None`	Selection strategy for tool invocation. Note: This parameter is accepted for interface consistency but is currently ignored for this model provider.	`None`
`**kwargs`	`Any`	Additional keyword arguments for future extensibility.	`{}`

Yields:

Type	Description
`AsyncGenerator[StreamEvent, None]`	Formatted message chunks from the model.

Source code in strands/models/ollama.py

@override
async def stream(
    self,
    messages: Messages,
    tool_specs: list[ToolSpec] | None = None,
    system_prompt: str | None = None,
    *,
    tool_choice: ToolChoice | None = None,
    **kwargs: Any,
) -> AsyncGenerator[StreamEvent, None]:
    """Stream conversation with the Ollama model.

    Args:
        messages: List of message objects to be processed by the model.
        tool_specs: List of tool specifications to make available to the model.
        system_prompt: System prompt to provide context to the model.
        tool_choice: Selection strategy for tool invocation. **Note: This parameter is accepted for
            interface consistency but is currently ignored for this model provider.**
        **kwargs: Additional keyword arguments for future extensibility.

    Yields:
        Formatted message chunks from the model.
    """
    warn_on_tool_choice_not_supported(tool_choice)

    logger.debug("formatting request")
    request = self.format_request(messages, tool_specs, system_prompt)
    logger.debug("request=<%s>", request)

    logger.debug("invoking model")
    tool_requested = False

    client = ollama.AsyncClient(self.host, **self.client_args)
    response = await client.chat(**request)

    logger.debug("got response from model")
    yield self.format_chunk({"chunk_type": "message_start"})
    yield self.format_chunk({"chunk_type": "content_start", "data_type": "text"})

    async for event in response:
        for tool_call in event.message.tool_calls or []:
            yield self.format_chunk({"chunk_type": "content_start", "data_type": "tool", "data": tool_call})
            yield self.format_chunk({"chunk_type": "content_delta", "data_type": "tool", "data": tool_call})
            yield self.format_chunk({"chunk_type": "content_stop", "data_type": "tool", "data": tool_call})
            tool_requested = True

        yield self.format_chunk({"chunk_type": "content_delta", "data_type": "text", "data": event.message.content})

    yield self.format_chunk({"chunk_type": "content_stop", "data_type": "text"})
    yield self.format_chunk(
        {"chunk_type": "message_stop", "data": "tool_use" if tool_requested else event.done_reason}
    )
    yield self.format_chunk({"chunk_type": "metadata", "data": event})

    logger.debug("finished streaming response from model")
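
A caller typically consumes these events by accumulating the text deltas and reading the stop reason from the final `messageStop` event. The following is a sketch under stated assumptions: `fake_stream` is a hypothetical stand-in for `OllamaModel.stream(...)` so the example runs without an Ollama server, the `messageStart` payload shown is assumed, and `collect_text` is an illustrative helper rather than a library function.

```python
import asyncio
from typing import Any, AsyncIterator, Optional


async def fake_stream() -> AsyncIterator[dict[str, Any]]:
    """Hypothetical generator yielding events in the order stream() emits them."""
    yield {"messageStart": {"role": "assistant"}}  # assumed payload
    yield {"contentBlockStart": {"start": {}}}
    yield {"contentBlockDelta": {"delta": {"text": "Hello"}}}
    yield {"contentBlockDelta": {"delta": {"text": ", world"}}}
    yield {"contentBlockStop": {}}
    yield {"messageStop": {"stopReason": "end_turn"}}


async def collect_text(events: AsyncIterator[dict[str, Any]]) -> tuple[str, Optional[str]]:
    """Concatenate text deltas and capture the stop reason."""
    parts: list[str] = []
    stop_reason: Optional[str] = None
    async for event in events:
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "text" in delta:
            parts.append(delta["text"])
        if "messageStop" in event:
            stop_reason = event["messageStop"]["stopReason"]
    return "".join(parts), stop_reason


text, stop_reason = asyncio.run(collect_text(fake_stream()))
print(text, stop_reason)
```

With a real model, the same `collect_text` helper could be passed the async generator returned by `stream(messages)`. Tool-use deltas carry a `toolUse` key instead of `text` and are skipped by this helper.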

`structured_output(output_model, prompt, system_prompt=None, **kwargs)` `async` ¶

Get structured output from the model.

Parameters:

Name	Type	Description	Default
`output_model`	`type[T]`	The output model to use for the agent.	required
`prompt`	`Messages`	The prompt messages to use for the agent.	required
`system_prompt`	`str \| None`	System prompt to provide context to the model.	`None`
`**kwargs`	`Any`	Additional keyword arguments for future extensibility.	`{}`

Yields:

Type	Description
`AsyncGenerator[dict[str, T \| Any], None]`	Model events with the last being the structured output.

Source code in strands/models/ollama.py

@override
async def structured_output(
    self, output_model: type[T], prompt: Messages, system_prompt: str | None = None, **kwargs: Any
) -> AsyncGenerator[dict[str, T | Any], None]:
    """Get structured output from the model.

    Args:
        output_model: The output model to use for the agent.
        prompt: The prompt messages to use for the agent.
        system_prompt: System prompt to provide context to the model.
        **kwargs: Additional keyword arguments for future extensibility.

    Yields:
        Model events with the last being the structured output.
    """
    formatted_request = self.format_request(messages=prompt, system_prompt=system_prompt)
    formatted_request["format"] = output_model.model_json_schema()
    formatted_request["stream"] = False

    client = ollama.AsyncClient(self.host, **self.client_args)
    response = await client.chat(**formatted_request)

    try:
        content = response.message.content.strip()
        yield {"output": output_model.model_validate_json(content)}
    except Exception as e:
        raise ValueError(f"Failed to parse or load content into model: {e}") from e
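
The method relies on two Pydantic v2 calls: `model_json_schema()` produces the schema sent as the request's `format`, and `model_validate_json()` parses the model's reply. The sketch below exercises just those two calls without contacting Ollama. `Weather` is a hypothetical output model, and the `content` string is a hand-written stand-in for `response.message.content`.

```python
from pydantic import BaseModel


class Weather(BaseModel):
    """Hypothetical output model used only for this illustration."""

    city: str
    temperature_c: float


# This schema is what structured_output() assigns to the request's "format" key.
schema = Weather.model_json_schema()

# Stand-in for response.message.content, including stray whitespace.
content = '  {"city": "Lisbon", "temperature_c": 21.5}\n'

# Same parsing step as in structured_output(): strip, then validate.
result = Weather.model_validate_json(content.strip())
print(result.city, result.temperature_c)
```

If the reply is not valid JSON or does not match the schema, `model_validate_json` raises, and `structured_output` re-raises that failure as a `ValueError`.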

`update_config(**model_config)` ¶

Update the Ollama Model configuration with the provided arguments.

Parameters:

Name	Type	Description	Default
`**model_config`	`Unpack[OllamaConfig]`	Configuration overrides.	`{}`

Source code in strands/models/ollama.py

@override
def update_config(self, **model_config: Unpack[OllamaConfig]) -> None:  # type: ignore
    """Update the Ollama Model configuration with the provided arguments.

    Args:
        **model_config: Configuration overrides.
    """
    validate_config_keys(model_config, self.OllamaConfig)
    self.config.update(model_config)

`StreamEvent` ¶

Bases: TypedDict

The messages output stream.

Attributes:

Name	Type	Description
`contentBlockDelta`	`ContentBlockDeltaEvent`	Delta content for a content block.
`contentBlockStart`	`ContentBlockStartEvent`	Start of a content block.
`contentBlockStop`	`ContentBlockStopEvent`	End of a content block.
`internalServerException`	`ExceptionEvent`	Internal server error information.
`messageStart`	`MessageStartEvent`	Start of a message.
`messageStop`	`MessageStopEvent`	End of a message.
`metadata`	`MetadataEvent`	Metadata about the streaming response.
`modelStreamErrorException`	`ModelStreamErrorEvent`	Model streaming error information.
`serviceUnavailableException`	`ExceptionEvent`	Service unavailable error information.
`throttlingException`	`ExceptionEvent`	Throttling error information.
`validationException`	`ExceptionEvent`	Validation error information.

Source code in strands/types/streaming.py

class StreamEvent(TypedDict, total=False):
    """The messages output stream.

    Attributes:
        contentBlockDelta: Delta content for a content block.
        contentBlockStart: Start of a content block.
        contentBlockStop: End of a content block.
        internalServerException: Internal server error information.
        messageStart: Start of a message.
        messageStop: End of a message.
        metadata: Metadata about the streaming response.
        modelStreamErrorException: Model streaming error information.
        serviceUnavailableException: Service unavailable error information.
        throttlingException: Throttling error information.
        validationException: Validation error information.
    """

    contentBlockDelta: ContentBlockDeltaEvent
    contentBlockStart: ContentBlockStartEvent
    contentBlockStop: ContentBlockStopEvent
    internalServerException: ExceptionEvent
    messageStart: MessageStartEvent
    messageStop: MessageStopEvent
    metadata: MetadataEvent
    redactContent: RedactContentEvent
    modelStreamErrorException: ModelStreamErrorEvent
    serviceUnavailableException: ExceptionEvent
    throttlingException: ExceptionEvent
    validationException: ExceptionEvent

`ToolSpec` ¶

Bases: TypedDict

Specification for a tool that can be used by an agent.

Attributes:

Name	Type	Description
`description`	`str`	A human-readable description of what the tool does.
`inputSchema`	`JSONSchema`	JSON Schema defining the expected input parameters.
`name`	`str`	The unique name of the tool.
`outputSchema`	`NotRequired[JSONSchema]`	Optional JSON Schema defining the expected output format. Note: Not all model providers support this field. Providers that don't support it should filter it out before sending to their API.

Source code in strands/types/tools.py

class ToolSpec(TypedDict):
    """Specification for a tool that can be used by an agent.

    Attributes:
        description: A human-readable description of what the tool does.
        inputSchema: JSON Schema defining the expected input parameters.
        name: The unique name of the tool.
        outputSchema: Optional JSON Schema defining the expected output format.
            Note: Not all model providers support this field. Providers that don't
            support it should filter it out before sending to their API.
    """

    description: str
    inputSchema: JSONSchema
    name: str
    outputSchema: NotRequired[JSONSchema]
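
As a concrete illustration, the sketch below builds a `ToolSpec`-shaped dict and converts it to the function-tool entry that `format_request` places under `"tools"`. The `get_weather` tool and the `to_ollama_tool` helper are hypothetical; the conversion itself follows the list comprehension in `format_request`.

```python
from typing import Any


def to_ollama_tool(tool_spec: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: one ToolSpec in, one Ollama tool entry out."""
    return {
        "type": "function",
        "function": {
            "name": tool_spec["name"],
            "description": tool_spec["description"],
            # The JSON Schema is nested under the "json" key of inputSchema.
            "parameters": tool_spec["inputSchema"]["json"],
        },
    }


get_weather_spec = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "inputSchema": {
        "json": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        }
    },
}

ollama_tool = to_ollama_tool(get_weather_spec)
print(ollama_tool["function"]["name"])
```

Only `name`, `description`, and `inputSchema` are read by this conversion, so an `outputSchema` entry on the spec would not reach Ollama.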

`_has_location_source(content)` ¶

Check if a content block contains a location source.

Providers need to explicitly define an implementation to support content locations.

Parameters:

Name	Type	Description	Default
`content`	`ContentBlock`	Content block to check.	required

Returns:

Type	Description
`bool`	True if the content block contains a location source, False otherwise.

Source code in strands/models/_validation.py

def _has_location_source(content: ContentBlock) -> bool:
    """Check if a content block contains a location source.

    Providers need to explicitly define an implementation to support content locations.

    Args:
        content: Content block to check.

    Returns:
        True if the content block contains a location source, False otherwise.
    """
    if "image" in content:
        return "location" in content["image"].get("source", {})
    if "document" in content:
        return "location" in content["document"].get("source", {})
    if "video" in content:
        return "location" in content["video"].get("source", {})
    return False
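
The check can be tried on sample content blocks. The sketch below uses `has_location_source`, a hypothetical compact restatement of the function above, so that it runs without importing the private module. The exact shape of the value stored under `"location"` is an assumption made for illustration; only the presence of the key matters to the check.

```python
from typing import Any


def has_location_source(content: dict[str, Any]) -> bool:
    """Hypothetical restatement: the first media key present decides the result."""
    for media_type in ("image", "document", "video"):
        if media_type in content:
            return "location" in content[media_type].get("source", {})
    return False


# Inline bytes: no location source.
inline_image = {"image": {"format": "png", "source": {"bytes": b"\x89PNG"}}}

# Location-based source (inner structure is illustrative, not confirmed by the source).
remote_document = {
    "document": {
        "format": "pdf",
        "name": "report",
        "source": {"location": {"uri": "s3://example-bucket/report.pdf"}},
    }
}

print(has_location_source(inline_image))
print(has_location_source(remote_document))
print(has_location_source({"text": "hello"}))
```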

`validate_config_keys(config_dict, config_class)` ¶

Validate that config keys match the TypedDict fields.

Parameters:

Name	Type	Description	Default
`config_dict`	`Mapping[str, Any]`	Dictionary of configuration parameters.	required
`config_class`	`type`	TypedDict class to validate against.	required

Source code in strands/models/_validation.py

def validate_config_keys(config_dict: Mapping[str, Any], config_class: type) -> None:
    """Validate that config keys match the TypedDict fields.

    Args:
        config_dict: Dictionary of configuration parameters
        config_class: TypedDict class to validate against
    """
    valid_keys = set(get_type_hints(config_class).keys())
    provided_keys = set(config_dict.keys())
    invalid_keys = provided_keys - valid_keys

    if invalid_keys:
        warnings.warn(
            f"Invalid configuration parameters: {sorted(invalid_keys)}."
            f"\nValid parameters are: {sorted(valid_keys)}."
            f"\n"
            f"\nSee https://github.com/strands-agents/sdk-python/issues/815",
            stacklevel=4,
        )
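
The key comparison relies on `typing.get_type_hints`, which returns the declared fields of a `TypedDict`. The sketch below reproduces that technique with a hypothetical `DemoConfig` class and a hypothetical `find_invalid_keys` helper. It also shows that an unknown key produces a warning rather than an exception.

```python
import warnings
from typing import Any, Mapping, TypedDict, get_type_hints


class DemoConfig(TypedDict, total=False):
    """Hypothetical config class standing in for OllamaConfig."""

    model_id: str
    temperature: float


def find_invalid_keys(config_dict: Mapping[str, Any], config_class: type) -> list[str]:
    """Return the provided keys that are not declared on the TypedDict."""
    valid_keys = set(get_type_hints(config_class).keys())
    return sorted(set(config_dict.keys()) - valid_keys)


# "temprature" is a deliberate misspelling of "temperature".
invalid = find_invalid_keys({"model_id": "llama3", "temprature": 0.2}, DemoConfig)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    if invalid:
        warnings.warn(f"Invalid configuration parameters: {invalid}.", stacklevel=2)

print(invalid, len(caught))
```

Because the function only warns, a misspelled key is still stored by `update_config`. The misspelled entry is then ignored when the request is built, since `format_request` reads only the known keys.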

`warn_on_tool_choice_not_supported(tool_choice)` ¶

Emits a warning if a tool choice is provided but not supported by the provider.

Parameters:

Name	Type	Description	Default
`tool_choice`	`ToolChoice \| None`	The tool choice passed to the provider.	required

Source code in strands/models/_validation.py

def warn_on_tool_choice_not_supported(tool_choice: ToolChoice | None) -> None:
    """Emits a warning if a tool choice is provided but not supported by the provider.

    Args:
        tool_choice: the tool_choice provided to the provider
    """
    if tool_choice:
        warnings.warn(
            "A ToolChoice was provided to this provider but is not supported and will be ignored",
            stacklevel=4,
        )
