Ollama

Ollama is a framework for running open-source large language models locally. Strands provides native support for Ollama, allowing you to use locally hosted models in your agents.

The OllamaModel class in Strands enables seamless integration with Ollama's API, supporting:

  • Text generation
  • Image understanding
  • Tool/function calling
  • Streaming responses
  • Configuration management

Getting Started

Prerequisites

First, install the Ollama Python client into your Python environment via the strands-agents extra:

pip install 'strands-agents[ollama]'

Next, you'll need to install and set up Ollama itself.

Option 1: Native Installation

  1. Install Ollama by following the instructions at ollama.ai
  2. Pull your desired model:
    ollama pull llama3
    
  3. Start the Ollama server:
    ollama serve
    

Option 2: Docker Installation

  1. Pull the Ollama Docker image:

    docker pull ollama/ollama
    

  2. Run the Ollama container:

    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
    

Note: Add --gpus=all if you have a GPU and Docker GPU support is configured.
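For example, the GPU-enabled equivalent of the command above (assuming the NVIDIA Container Toolkit is installed):

    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama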

  3. Pull a model using the Docker container:

    docker exec -it ollama ollama pull llama3
    

  4. Verify the Ollama server is running:

    curl http://localhost:11434/api/tags
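
If the server is up, this returns JSON describing the models you have pulled, roughly of this shape (exact fields vary by Ollama version):

    {"models": [{"name": "llama3:latest", ...}]}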
    

Basic Usage

Here's how to create an agent using an Ollama model:

from strands import Agent
from strands.models.ollama import OllamaModel

# Create an Ollama model instance
ollama_model = OllamaModel(
    host="http://localhost:11434",  # Ollama server address
    model_id="llama3"               # Specify which model to use
)

# Create an agent using the Ollama model
agent = Agent(model=ollama_model)

# Use the agent
agent("Tell me about Strands agents.") # Prints model output to stdout by default

Configuration Options

The OllamaModel supports various configuration parameters:

| Parameter | Description | Default |
|---|---|---|
| host | The address of the Ollama server | Required |
| model_id | The Ollama model identifier | Required |
| keep_alive | How long the model stays loaded in memory | "5m" |
| max_tokens | Maximum number of tokens to generate | None |
| temperature | Controls randomness (higher = more random) | None |
| top_p | Controls diversity via nucleus sampling | None |
| stop_sequences | List of sequences that stop generation | None |
| options | Additional model parameters (e.g., top_k) | None |
| additional_args | Any additional arguments for the request | None |

Example with Configuration

from strands import Agent
from strands.models.ollama import OllamaModel

# Create a configured Ollama model
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3",
    temperature=0.7,
    keep_alive="10m",
    stop_sequences=["###", "END"],
    options={"top_k": 40}
)

# Create an agent with the configured model
agent = Agent(model=ollama_model)

# Use the agent
response = agent("Write a short story about an AI assistant.")

Advanced Features

Updating Configuration at Runtime

You can update the model configuration during runtime:

# Create the model with initial configuration
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3",
    temperature=0.7
)

# Update configuration later
ollama_model.update_config(
    temperature=0.9,
    top_p=0.8
)
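
To confirm a change took effect, you can read the configuration back with get_config, the counterpart to update_config on Strands model providers:

# Inspect the model's current configuration
print(ollama_model.get_config())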

This is especially useful if you want a tool to update the model's config for you:

from strands import Agent, tool


@tool
def update_model_id(model_id: str, agent: Agent) -> str:
    """
    Update the model id of the agent

    Args:
      model_id: Ollama model id to use.
    """
    print(f"Updating model_id to {model_id}")
    agent.model.update_config(model_id=model_id)
    return f"Model updated to {model_id}"


@tool
def update_temperature(temperature: float, agent: Agent) -> str:
    """
    Update the temperature of the agent

    Args:
      temperature: Temperature value for the model to use.
    """
    print(f"Updating Temperature to {temperature}")
    agent.model.update_config(temperature=temperature)
    return f"Temperature updated to {temperature}"

Using Different Models

Ollama supports many different models, and you can switch between them (make sure each model is pulled first). See the list of available models at https://ollama.com/search.

# Create models for different use cases
creative_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3",
    temperature=0.8
)

factual_model = OllamaModel(
    host="http://localhost:11434",
    model_id="mistral",
    temperature=0.2
)

# Create agents with different models
creative_agent = Agent(model=creative_model)
factual_agent = Agent(model=factual_model)

Tool Support

Ollama models that support tool use can call tools through the Strands tool system:

from strands import Agent
from strands.models.ollama import OllamaModel
from strands_tools import calculator, current_time

# Create an Ollama model
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3"
)

# Create an agent with tools
agent = Agent(
    model=ollama_model,
    tools=[calculator, current_time]
)

# Use the agent with tools
response = agent("What's the square root of 144 plus the current time?")

Troubleshooting

Common Issues

  1. Connection Refused:

    • Ensure the Ollama server is running (ollama serve, or check the Docker container status)
    • Verify the host URL is correct (a quick connectivity check is sketched at the end of this section)
    • For Docker: check that port 11434 is properly exposed
  2. Model Not Found:

    • Pull the model first: ollama pull model_name or docker exec -it ollama ollama pull model_name
    • Check for typos in the model_id
  3. Module Not Found:

    • If you encounter the error ModuleNotFoundError: No module named 'ollama', the ollama dependency is not installed in your Python environment
    • To fix this, run pip install 'strands-agents[ollama]'
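
For connection problems, it can help to confirm the server is reachable before involving Strands at all. A minimal check using only the Python standard library:

import urllib.request

# Query the Ollama tags endpoint; a 200 response means the server is up
try:
    with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=5) as resp:
        print(f"Ollama server is up (HTTP {resp.status})")
except OSError as exc:
    print(f"Could not reach the Ollama server: {exc}")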