Ollama¶
Ollama is a framework for running open-source large language models locally. Strands provides native support for Ollama, allowing you to use locally-hosted models in your agents.
The OllamaModel class in Strands enables seamless integration with Ollama's API, supporting:
- Text generation
- Image understanding
- Tool/function calling
- Streaming responses
- Configuration management
Getting Started¶
Prerequisites¶
First, install the Ollama Python client into your Python environment:
pip install 'strands-agents[ollama]'
Next, you'll need to install and set up Ollama itself.
Option 1: Native Installation¶
- Install Ollama by following the instructions at ollama.ai
- Pull your desired model:
ollama pull llama3
- Start the Ollama server:
ollama serve
Option 2: Docker Installation¶
- Pull the Ollama Docker image:
docker pull ollama/ollama
- Run the Ollama container:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Note: Add --gpus=all if you have a GPU and Docker GPU support is configured.
- Pull a model using the Docker container:
docker exec -it ollama ollama pull llama3
- Verify the Ollama server is running:
curl http://localhost:11434/api/tags
Basic Usage¶
Here's how to create an agent using an Ollama model:
from strands import Agent
from strands.models.ollama import OllamaModel
# Create an Ollama model instance
ollama_model = OllamaModel(
host="http://localhost:11434", # Ollama server address
model_id="llama3" # Specify which model to use
)
# Create an agent using the Ollama model
agent = Agent(model=ollama_model)
# Use the agent
agent("Tell me about Strands agents.") # Prints model output to stdout by default
Configuration Options¶
The OllamaModel supports various configuration parameters:
| Parameter | Description | Default |
|---|---|---|
| host | The address of the Ollama server | Required |
| model_id | The Ollama model identifier | Required |
| keep_alive | How long the model stays loaded in memory | "5m" |
| max_tokens | Maximum number of tokens to generate | None |
| temperature | Controls randomness (higher = more random) | None |
| top_p | Controls diversity via nucleus sampling | None |
| stop_sequences | List of sequences that stop generation | None |
| options | Additional model parameters (e.g., top_k) | None |
| additional_args | Any additional arguments for the request | None |
Example with Configuration¶
from strands import Agent
from strands.models.ollama import OllamaModel
# Create a configured Ollama model
ollama_model = OllamaModel(
host="http://localhost:11434",
model_id="llama3",
temperature=0.7,
keep_alive="10m",
stop_sequences=["###", "END"],
options={"top_k": 40}
)
# Create an agent with the configured model
agent = Agent(model=ollama_model)
# Use the agent
response = agent("Write a short story about an AI assistant.")
Advanced Features¶
Updating Configuration at Runtime¶
You can update the model configuration during runtime:
# Create the model with initial configuration
ollama_model = OllamaModel(
host="http://localhost:11434",
model_id="llama3",
temperature=0.7
)
# Update configuration later
ollama_model.update_config(
temperature=0.9,
top_p=0.8
)
This is especially useful if you want a tool to update the model's config for you:
from strands import Agent, tool

@tool
def update_model_id(model_id: str, agent: Agent) -> str:
"""
Update the model id of the agent
Args:
model_id: Ollama model id to use.
"""
print(f"Updating model_id to {model_id}")
agent.model.update_config(model_id=model_id)
return f"Model updated to {model_id}"
@tool
def update_temperature(temperature: float, agent: Agent) -> str:
"""
Update the temperature of the agent
Args:
temperature: Temperature value for the model to use.
"""
print(f"Updating Temperature to {temperature}")
agent.model.update_config(temperature=temperature)
return f"Temperature updated to {temperature}"
Using Different Models¶
Ollama supports many different models. You can switch between them (make sure they are pulled first). See the list of available models here: https://ollama.com/search
# Create models for different use cases
creative_model = OllamaModel(
host="http://localhost:11434",
model_id="llama3",
temperature=0.8
)
factual_model = OllamaModel(
host="http://localhost:11434",
model_id="mistral",
temperature=0.2
)
# Create agents with different models
creative_agent = Agent(model=creative_model)
factual_agent = Agent(model=factual_model)
Structured Output¶
Ollama supports structured output for models that have tool calling capabilities. When you use Agent.structured_output(), the Strands SDK converts your Pydantic models to tool specifications that compatible Ollama models can understand.
from pydantic import BaseModel, Field
from strands import Agent
from strands.models.ollama import OllamaModel
class BookAnalysis(BaseModel):
"""Analyze a book's key information."""
title: str = Field(description="The book's title")
author: str = Field(description="The book's author")
genre: str = Field(description="Primary genre or category")
summary: str = Field(description="Brief summary of the book")
rating: int = Field(description="Rating from 1-10", ge=1, le=10)
ollama_model = OllamaModel(
host="http://localhost:11434",
model_id="llama3",
)
agent = Agent(model=ollama_model)
result = agent.structured_output(
BookAnalysis,
"""
Analyze this book: "The Hitchhiker's Guide to the Galaxy" by Douglas Adams.
It's a science fiction comedy about Arthur Dent's adventures through space
after Earth is destroyed. It's widely considered a classic of humorous sci-fi.
"""
)
print(f"Title: {result.title}")
print(f"Author: {result.author}")
print(f"Genre: {result.genre}")
print(f"Rating: {result.rating}")
Tool Support¶
Ollama models that support tool use can use tools through the Strands tool system:
from strands import Agent
from strands.models.ollama import OllamaModel
from strands_tools import calculator, current_time
# Create an Ollama model
ollama_model = OllamaModel(
host="http://localhost:11434",
model_id="llama3"
)
# Create an agent with tools
agent = Agent(
model=ollama_model,
tools=[calculator, current_time]
)
# Use the agent with tools
response = agent("What's the square root of 144 plus the current time?")
Troubleshooting¶
Common Issues¶
- Connection Refused:
  - Ensure the Ollama server is running (ollama serve or check the Docker container status)
  - Verify the host URL is correct
  - For Docker: check that port 11434 is properly exposed
- Model Not Found:
  - Pull the model first: ollama pull model_name or docker exec -it ollama ollama pull model_name
  - Check for typos in the model_id
- Module Not Found:
  - If you encounter the error ModuleNotFoundError: No module named 'ollama', you haven't installed the ollama dependency in your Python environment
  - To fix, run pip install 'strands-agents[ollama]'
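As a quick sanity check for the first two issues, you can query the same /api/tags endpoint from Python before creating an agent. A minimal sketch, assuming the third-party requests package is installed:

```python
import requests

try:
    # Same endpoint as the curl check above; lists the locally pulled models.
    resp = requests.get("http://localhost:11434/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    print("Ollama is reachable. Pulled models:", models)
except requests.RequestException as exc:
    print("Could not reach the Ollama server:", exc)
```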