`strands.experimental.bidi.models.gemini_live` ¶

Gemini Live API bidirectional model provider using official Google GenAI SDK.

Implements the BidiModel interface for Google's Gemini Live API using the official Google GenAI SDK for simplified and robust WebSocket communication.

Key improvements over custom WebSocket implementation:

Uses official google-genai SDK with native Live API support
Simplified session management with client.aio.live.connect()
Built-in tool integration and event handling
Automatic WebSocket connection management and error handling
Native support for audio/text streaming and interruption

`AudioChannel = Literal[1, 2]` `module-attribute` ¶

Number of audio channels.

Mono: 1
Stereo: 2

`AudioSampleRate = Literal[16000, 24000, 48000]` `module-attribute` ¶

Audio sample rate in Hz.

`BidiInputEvent = BidiTextInputEvent | BidiAudioInputEvent | BidiImageInputEvent` `module-attribute` ¶

Union of different bidi input event types.

`BidiOutputEvent = BidiConnectionStartEvent | BidiConnectionRestartEvent | BidiResponseStartEvent | BidiAudioStreamEvent | BidiTranscriptStreamEvent | BidiInterruptionEvent | BidiResponseCompleteEvent | BidiUsageEvent | BidiConnectionCloseEvent | BidiErrorEvent | ToolUseStreamEvent` `module-attribute` ¶

Union of different bidi output event types.

`GEMINI_CHANNELS = 1` `module-attribute` ¶

`GEMINI_INPUT_SAMPLE_RATE = 16000` `module-attribute` ¶

`GEMINI_OUTPUT_SAMPLE_RATE = 24000` `module-attribute` ¶

`Messages = list[Message]` `module-attribute` ¶

A list of messages representing a conversation.

`logger = logging.getLogger(name)` `module-attribute` ¶

`AudioConfig` ¶

Bases: TypedDict

Audio configuration for bidirectional streaming models.

Defines standard audio parameters that model providers use to specify their audio processing requirements. All fields are optional to support models that may not use audio or only need specific parameters.

Model providers build this configuration by merging user-provided values with their own defaults. The resulting configuration is then used by audio I/O implementations to configure hardware appropriately.

Attributes:

Name	Type	Description
`input_rate`	`AudioSampleRate`	Input sample rate in Hz (e.g., 16000, 24000, 48000)
`output_rate`	`AudioSampleRate`	Output sample rate in Hz (e.g., 16000, 24000, 48000)
`channels`	`AudioChannel`	Number of audio channels (1=mono, 2=stereo)
`format`	`AudioFormat`	Audio encoding format
`voice`	`str`	Voice identifier for text-to-speech (e.g., "alloy", "matthew")

Source code in strands/experimental/bidi/types/model.py

class AudioConfig(TypedDict, total=False):
    """Audio configuration for bidirectional streaming models.

    Defines standard audio parameters that model providers use to specify
    their audio processing requirements. All fields are optional to support
    models that may not use audio or only need specific parameters.

    Model providers build this configuration by merging user-provided values
    with their own defaults. The resulting configuration is then used by
    audio I/O implementations to configure hardware appropriately.

    Attributes:
        input_rate: Input sample rate in Hz (e.g., 16000, 24000, 48000)
        output_rate: Output sample rate in Hz (e.g., 16000, 24000, 48000)
        channels: Number of audio channels (1=mono, 2=stereo)
        format: Audio encoding format
        voice: Voice identifier for text-to-speech (e.g., "alloy", "matthew")
    """

    input_rate: AudioSampleRate
    output_rate: AudioSampleRate
    channels: AudioChannel
    format: AudioFormat
    voice: str

`BidiAudioInputEvent` ¶

Bases: TypedEvent

Audio input event for sending audio to the model.

Used for sending audio data through the send() method.

Parameters:

Name	Type	Description	Default
`audio`	`str`	Base64-encoded audio string to send to model.	required
`format`	`AudioFormat \| str`	Audio format from SUPPORTED_AUDIO_FORMATS.	required
`sample_rate`	`AudioSampleRate`	Sample rate from SUPPORTED_SAMPLE_RATES.	required
`channels`	`AudioChannel`	Channel count from SUPPORTED_CHANNELS.	required

Source code in strands/experimental/bidi/types/events.py

class BidiAudioInputEvent(TypedEvent):
    """Audio input event for sending audio to the model.

    Used for sending audio data through the send() method.

    Parameters:
        audio: Base64-encoded audio string to send to model.
        format: Audio format from SUPPORTED_AUDIO_FORMATS.
        sample_rate: Sample rate from SUPPORTED_SAMPLE_RATES.
        channels: Channel count from SUPPORTED_CHANNELS.
    """

    def __init__(
        self,
        audio: str,
        format: AudioFormat | str,
        sample_rate: AudioSampleRate,
        channels: AudioChannel,
    ):
        """Initialize audio input event."""
        super().__init__(
            {
                "type": "bidi_audio_input",
                "audio": audio,
                "format": format,
                "sample_rate": sample_rate,
                "channels": channels,
            }
        )

    @property
    def audio(self) -> str:
        """Base64-encoded audio string."""
        return cast(str, self["audio"])

    @property
    def format(self) -> AudioFormat:
        """Audio encoding format."""
        return cast(AudioFormat, self["format"])

    @property
    def sample_rate(self) -> AudioSampleRate:
        """Number of audio samples per second in Hz."""
        return cast(AudioSampleRate, self["sample_rate"])

    @property
    def channels(self) -> AudioChannel:
        """Number of audio channels (1=mono, 2=stereo)."""
        return cast(AudioChannel, self["channels"])

`audio` `property` ¶

Base64-encoded audio string.

`channels` `property` ¶

Number of audio channels (1=mono, 2=stereo).

`format` `property` ¶

Audio encoding format.

`sample_rate` `property` ¶

Number of audio samples per second in Hz.

`init(audio, format, sample_rate, channels)` ¶

Initialize audio input event.

Source code in strands/experimental/bidi/types/events.py

def __init__(
    self,
    audio: str,
    format: AudioFormat | str,
    sample_rate: AudioSampleRate,
    channels: AudioChannel,
):
    """Initialize audio input event."""
    super().__init__(
        {
            "type": "bidi_audio_input",
            "audio": audio,
            "format": format,
            "sample_rate": sample_rate,
            "channels": channels,
        }
    )

`BidiAudioStreamEvent` ¶

Bases: TypedEvent

Streaming audio output from the model.

Parameters:

Name	Type	Description	Default
`audio`	`str`	Base64-encoded audio string.	required
`format`	`AudioFormat`	Audio encoding format.	required
`sample_rate`	`AudioSampleRate`	Number of audio samples per second in Hz.	required
`channels`	`AudioChannel`	Number of audio channels (1=mono, 2=stereo).	required

Source code in strands/experimental/bidi/types/events.py

class BidiAudioStreamEvent(TypedEvent):
    """Streaming audio output from the model.

    Parameters:
        audio: Base64-encoded audio string.
        format: Audio encoding format.
        sample_rate: Number of audio samples per second in Hz.
        channels: Number of audio channels (1=mono, 2=stereo).
    """

    def __init__(
        self,
        audio: str,
        format: AudioFormat,
        sample_rate: AudioSampleRate,
        channels: AudioChannel,
    ):
        """Initialize audio stream event."""
        super().__init__(
            {
                "type": "bidi_audio_stream",
                "audio": audio,
                "format": format,
                "sample_rate": sample_rate,
                "channels": channels,
            }
        )

    @property
    def audio(self) -> str:
        """Base64-encoded audio string."""
        return cast(str, self["audio"])

    @property
    def format(self) -> AudioFormat:
        """Audio encoding format."""
        return cast(AudioFormat, self["format"])

    @property
    def sample_rate(self) -> AudioSampleRate:
        """Number of audio samples per second in Hz."""
        return cast(AudioSampleRate, self["sample_rate"])

    @property
    def channels(self) -> AudioChannel:
        """Number of audio channels (1=mono, 2=stereo)."""
        return cast(AudioChannel, self["channels"])

`audio` `property` ¶

Base64-encoded audio string.

`channels` `property` ¶

Number of audio channels (1=mono, 2=stereo).

`format` `property` ¶

Audio encoding format.

`sample_rate` `property` ¶

Number of audio samples per second in Hz.

`init(audio, format, sample_rate, channels)` ¶

Initialize audio stream event.

Source code in strands/experimental/bidi/types/events.py

def __init__(
    self,
    audio: str,
    format: AudioFormat,
    sample_rate: AudioSampleRate,
    channels: AudioChannel,
):
    """Initialize audio stream event."""
    super().__init__(
        {
            "type": "bidi_audio_stream",
            "audio": audio,
            "format": format,
            "sample_rate": sample_rate,
            "channels": channels,
        }
    )

`BidiConnectionStartEvent` ¶

Bases: TypedEvent

Streaming connection established and ready for interaction.

Parameters:

Name	Type	Description	Default
`connection_id`	`str`	Unique identifier for this streaming connection.	required
`model`	`str`	Model identifier (e.g., "gpt-realtime", "gemini-2.0-flash-live").	required

Source code in strands/experimental/bidi/types/events.py

class BidiConnectionStartEvent(TypedEvent):
    """Streaming connection established and ready for interaction.

    Parameters:
        connection_id: Unique identifier for this streaming connection.
        model: Model identifier (e.g., "gpt-realtime", "gemini-2.0-flash-live").
    """

    def __init__(self, connection_id: str, model: str):
        """Initialize connection start event."""
        super().__init__(
            {
                "type": "bidi_connection_start",
                "connection_id": connection_id,
                "model": model,
            }
        )

    @property
    def connection_id(self) -> str:
        """Unique identifier for this streaming connection."""
        return cast(str, self["connection_id"])

    @property
    def model(self) -> str:
        """Model identifier (e.g., 'gpt-realtime', 'gemini-2.0-flash-live')."""
        return cast(str, self["model"])

`connection_id` `property` ¶

Unique identifier for this streaming connection.

`model` `property` ¶

Model identifier (e.g., 'gpt-realtime', 'gemini-2.0-flash-live').

`init(connection_id, model)` ¶

Initialize connection start event.

Source code in strands/experimental/bidi/types/events.py

def __init__(self, connection_id: str, model: str):
    """Initialize connection start event."""
    super().__init__(
        {
            "type": "bidi_connection_start",
            "connection_id": connection_id,
            "model": model,
        }
    )

`BidiGeminiLiveModel` ¶

Bases: BidiModel

Gemini Live API implementation using official Google GenAI SDK.

Combines model configuration and connection state in a single class. Provides a clean interface to Gemini Live API using the official SDK, eliminating custom WebSocket handling and providing robust error handling.

Source code in strands/experimental/bidi/models/gemini_live.py

class BidiGeminiLiveModel(BidiModel):
    """Gemini Live API implementation using official Google GenAI SDK.

    Combines model configuration and connection state in a single class.
    Provides a clean interface to Gemini Live API using the official SDK,
    eliminating custom WebSocket handling and providing robust error handling.
    """

    def __init__(
        self,
        model_id: str = "gemini-2.5-flash-native-audio-preview-09-2025",
        provider_config: dict[str, Any] | None = None,
        client_config: dict[str, Any] | None = None,
        **kwargs: Any,
    ):
        """Initialize Gemini Live API bidirectional model.

        Args:
            model_id: Model identifier (default: gemini-2.5-flash-native-audio-preview-09-2025)
            provider_config: Model behavior (audio, inference)
            client_config: Authentication (api_key, http_options)
            **kwargs: Reserved for future parameters.

        """
        # Store model ID
        self.model_id = model_id

        # Resolve client config with defaults
        self._client_config = self._resolve_client_config(client_config or {})

        # Resolve provider config with defaults
        self.config = self._resolve_provider_config(provider_config or {})

        # Store API key for later use
        self.api_key = self._client_config.get("api_key")

        # Create Gemini client
        self._client = genai.Client(**self._client_config)

        # Connection state (initialized in start())
        self._live_session: Any = None
        self._live_session_context_manager: Any = None
        self._live_session_handle: str | None = None
        self._connection_id: str | None = None

    def _resolve_client_config(self, config: dict[str, Any]) -> dict[str, Any]:
        """Resolve client config (sets default http_options if not provided)."""
        resolved = config.copy()

        # Set default http_options if not provided
        if "http_options" not in resolved:
            resolved["http_options"] = {"api_version": "v1alpha"}

        return resolved

    def _resolve_provider_config(self, config: dict[str, Any]) -> dict[str, Any]:
        """Merge user config with defaults (user takes precedence)."""
        default_audio: AudioConfig = {
            "input_rate": GEMINI_INPUT_SAMPLE_RATE,
            "output_rate": GEMINI_OUTPUT_SAMPLE_RATE,
            "channels": GEMINI_CHANNELS,
            "format": "pcm",
        }
        default_inference = {
            "response_modalities": ["AUDIO"],
            "outputAudioTranscription": {},
            "inputAudioTranscription": {},
        }

        resolved = {
            "audio": {
                **default_audio,
                **config.get("audio", {}),
            },
            "inference": {
                **default_inference,
                **config.get("inference", {}),
            },
        }
        return resolved

    async def start(
        self,
        system_prompt: str | None = None,
        tools: list[ToolSpec] | None = None,
        messages: Messages | None = None,
        **kwargs: Any,
    ) -> None:
        """Establish bidirectional connection with Gemini Live API.

        Args:
            system_prompt: System instructions for the model.
            tools: List of tools available to the model.
            messages: Conversation history to initialize with.
            **kwargs: Additional configuration options.
        """
        if self._connection_id:
            raise RuntimeError("model already started | call stop before starting again")

        self._connection_id = str(uuid.uuid4())

        # Build live config
        live_config = self._build_live_config(system_prompt, tools, **kwargs)

        # Create the context manager and session
        self._live_session_context_manager = self._client.aio.live.connect(
            model=self.model_id, config=cast(LiveConnectConfigOrDict, live_config)
        )
        self._live_session = await self._live_session_context_manager.__aenter__()

        # Gemini itself restores message history when resuming from session
        if messages and "live_session_handle" not in kwargs:
            await self._send_message_history(messages)

    async def _send_message_history(self, messages: Messages) -> None:
        """Send conversation history to Gemini Live API.

        Sends each message as a separate turn with the correct role to maintain
        proper conversation context. Follows the same pattern as the non-bidirectional
        Gemini model implementation.
        """
        if not messages:
            return

        # Convert each message to Gemini format and send separately
        for message in messages:
            content_parts = []
            for content_block in message["content"]:
                if "text" in content_block:
                    content_parts.append(genai_types.Part(text=content_block["text"]))

            if content_parts:
                # Map role correctly - Gemini uses "user" and "model" roles
                # "assistant" role from Messages format maps to "model" in Gemini
                role = "model" if message["role"] == "assistant" else message["role"]
                content = genai_types.Content(role=role, parts=content_parts)
                await self._live_session.send_client_content(turns=content)

    async def receive(self) -> AsyncGenerator[BidiOutputEvent, None]:
        """Receive Gemini Live API events and convert to provider-agnostic format."""
        if not self._connection_id:
            raise RuntimeError("model not started | call start before receiving")

        yield BidiConnectionStartEvent(connection_id=self._connection_id, model=self.model_id)

        # Wrap in while loop to restart after turn_complete (SDK limitation workaround)
        while True:
            async for message in self._live_session.receive():
                for event in self._convert_gemini_live_event(message):
                    yield event

    def _convert_gemini_live_event(self, message: LiveServerMessage) -> list[BidiOutputEvent]:
        """Convert Gemini Live API events to provider-agnostic format.

        Handles different types of content:

        - inputTranscription: User's speech transcribed to text
        - outputTranscription: Model's audio transcribed to text
        - modelTurn text: Text response from the model
        - usageMetadata: Token usage information

        Returns:
            List of event dicts (empty list if no events to emit).

        Raises:
            BidiModelTimeoutError: If gemini responds with go away message.
        """
        if message.go_away:
            raise BidiModelTimeoutError(
                message.go_away.model_dump_json(), live_session_handle=self._live_session_handle
            )

        if message.session_resumption_update:
            resumption_update = message.session_resumption_update
            if resumption_update.resumable and resumption_update.new_handle:
                self._live_session_handle = resumption_update.new_handle
                logger.debug("session_handle=<%s> | updating gemini session handle", self._live_session_handle)
            return []

        # Handle interruption first (from server_content)
        if message.server_content and message.server_content.interrupted:
            return [BidiInterruptionEvent(reason="user_speech")]

        # Handle input transcription (user's speech) - emit as transcript event
        if message.server_content and message.server_content.input_transcription:
            input_transcript = message.server_content.input_transcription
            # Check if the transcription object has text content
            if hasattr(input_transcript, "text") and input_transcript.text:
                transcription_text = input_transcript.text
                logger.debug("text_length=<%d> | gemini input transcription detected", len(transcription_text))
                return [
                    BidiTranscriptStreamEvent(
                        delta={"text": transcription_text},
                        text=transcription_text,
                        role="user",
                        # TODO: https://github.com/googleapis/python-genai/issues/1504
                        is_final=bool(input_transcript.finished),
                        current_transcript=transcription_text,
                    )
                ]

        # Handle output transcription (model's audio) - emit as transcript event
        if message.server_content and message.server_content.output_transcription:
            output_transcript = message.server_content.output_transcription
            # Check if the transcription object has text content
            if hasattr(output_transcript, "text") and output_transcript.text:
                transcription_text = output_transcript.text
                logger.debug("text_length=<%d> | gemini output transcription detected", len(transcription_text))
                return [
                    BidiTranscriptStreamEvent(
                        delta={"text": transcription_text},
                        text=transcription_text,
                        role="assistant",
                        # TODO: https://github.com/googleapis/python-genai/issues/1504
                        is_final=bool(output_transcript.finished),
                        current_transcript=transcription_text,
                    )
                ]

        # Handle audio output using SDK's built-in data property
        # Check this BEFORE text to avoid triggering warning on mixed content
        if message.data:
            # Convert bytes to base64 string for JSON serializability
            audio_b64 = base64.b64encode(message.data).decode("utf-8")
            return [
                BidiAudioStreamEvent(
                    audio=audio_b64,
                    format="pcm",
                    sample_rate=cast(AudioSampleRate, self.config["audio"]["output_rate"]),
                    channels=cast(AudioChannel, self.config["audio"]["channels"]),
                )
            ]

        # Handle text output from model_turn (avoids warning by checking parts directly)
        if message.server_content and message.server_content.model_turn:
            model_turn = message.server_content.model_turn
            if model_turn.parts:
                # Concatenate all text parts (Gemini may send multiple parts)
                text_parts = []
                for part in model_turn.parts:
                    # Check if part has text attribute and it's not empty
                    if hasattr(part, "text") and part.text:
                        text_parts.append(part.text)

                if text_parts:
                    full_text = " ".join(text_parts)
                    return [
                        BidiTranscriptStreamEvent(
                            delta={"text": full_text},
                            text=full_text,
                            role="assistant",
                            is_final=True,
                            current_transcript=full_text,
                        )
                    ]

        # Handle tool calls - return list to support multiple tool calls
        if message.tool_call and message.tool_call.function_calls:
            tool_events: list[BidiOutputEvent] = []
            for func_call in message.tool_call.function_calls:
                tool_use_event: ToolUse = {
                    "toolUseId": cast(str, func_call.id),
                    "name": cast(str, func_call.name),
                    "input": func_call.args or {},
                }
                # Create ToolUseStreamEvent for consistency with standard agent
                tool_events.append(
                    ToolUseStreamEvent(delta={"toolUse": tool_use_event}, current_tool_use=dict(tool_use_event))
                )
            return tool_events

        # Handle usage metadata
        if hasattr(message, "usage_metadata") and message.usage_metadata:
            usage = message.usage_metadata

            # Build modality details from token details
            modality_details = []

            # Process prompt tokens details
            if usage.prompt_tokens_details:
                for detail in usage.prompt_tokens_details:
                    if detail.modality and detail.token_count:
                        modality_details.append(
                            {
                                "modality": str(detail.modality).lower(),
                                "input_tokens": detail.token_count,
                                "output_tokens": 0,
                            }
                        )

            # Process response tokens details
            if usage.response_tokens_details:
                for detail in usage.response_tokens_details:
                    if detail.modality and detail.token_count:
                        # Find or create modality entry
                        modality_str = str(detail.modality).lower()
                        existing = next((m for m in modality_details if m["modality"] == modality_str), None)
                        if existing:
                            existing["output_tokens"] = detail.token_count
                        else:
                            modality_details.append(
                                {"modality": modality_str, "input_tokens": 0, "output_tokens": detail.token_count}
                            )

            return [
                BidiUsageEvent(
                    input_tokens=usage.prompt_token_count or 0,
                    output_tokens=usage.response_token_count or 0,
                    total_tokens=usage.total_token_count or 0,
                    modality_details=cast(list[ModalityUsage], modality_details) if modality_details else None,
                    cache_read_input_tokens=usage.cached_content_token_count
                    if usage.cached_content_token_count
                    else None,
                )
            ]

        # Silently ignore setup_complete and generation_complete messages
        return []

    async def send(
        self,
        content: BidiInputEvent | ToolResultEvent,
    ) -> None:
        """Unified send method for all content types. Sends the given inputs to Google Live API.

        Dispatches to appropriate internal handler based on content type.

        Args:
            content: Typed event (BidiTextInputEvent, BidiAudioInputEvent, BidiImageInputEvent, or ToolResultEvent).

        Raises:
            ValueError: If content type not supported (e.g., image content).
        """
        if not self._connection_id:
            raise RuntimeError("model not started | call start before sending/receiving")

        if isinstance(content, BidiTextInputEvent):
            await self._send_text_content(content.text)
        elif isinstance(content, BidiAudioInputEvent):
            await self._send_audio_content(content)
        elif isinstance(content, BidiImageInputEvent):
            await self._send_image_content(content)
        elif isinstance(content, ToolResultEvent):
            tool_result = content.get("tool_result")
            if tool_result:
                await self._send_tool_result(tool_result)
        else:
            raise ValueError(f"content_type={type(content)} | content not supported")

    async def _send_audio_content(self, audio_input: BidiAudioInputEvent) -> None:
        """Internal: Send audio content using Gemini Live API.

        Gemini Live expects continuous audio streaming via send_realtime_input.
        This automatically triggers VAD and can interrupt ongoing responses.
        """
        # Decode base64 audio to bytes for SDK
        audio_bytes = base64.b64decode(audio_input.audio)

        # Create audio blob for the SDK
        mime_type = f"audio/pcm;rate={self.config['audio']['input_rate']}"
        audio_blob = genai_types.Blob(data=audio_bytes, mime_type=mime_type)

        # Send real-time audio input - this automatically handles VAD and interruption
        await self._live_session.send_realtime_input(audio=audio_blob)

    async def _send_image_content(self, image_input: BidiImageInputEvent) -> None:
        """Internal: Send image content using Gemini Live API.

        Sends image frames following the same pattern as the GitHub example.
        Images are sent as base64-encoded data with MIME type.
        """
        # Image is already base64 encoded in the event
        msg = {"mime_type": image_input.mime_type, "data": image_input.image}

        # Send using the same method as the GitHub example
        await self._live_session.send(input=msg)

    async def _send_text_content(self, text: str) -> None:
        """Internal: Send text content using Gemini Live API."""
        # Create content with text
        content = genai_types.Content(role="user", parts=[genai_types.Part(text=text)])

        # Send as client content
        await self._live_session.send_client_content(turns=content)

    async def _send_tool_result(self, tool_result: ToolResult) -> None:
        """Internal: Send tool result using Gemini Live API."""
        tool_use_id = tool_result.get("toolUseId")
        content = tool_result.get("content", [])

        # Validate all content types are supported
        for block in content:
            if "text" not in block and "json" not in block:
                # Unsupported content type - raise error
                raise ValueError(
                    f"tool_use_id=<{tool_use_id}>, content_types=<{list(block.keys())}> | "
                    f"Content type not supported by Gemini Live API"
                )

        # Optimize for single content item - unwrap the array
        if len(content) == 1:
            result_data = cast(dict[str, Any], content[0])
        else:
            # Multiple items - send as array
            result_data = {"result": content}

        # Create function response
        func_response = genai_types.FunctionResponse(
            id=tool_use_id,
            name=tool_use_id,  # Gemini uses name as identifier
            response=result_data,
        )

        # Send tool response
        await self._live_session.send_tool_response(function_responses=[func_response])

    async def stop(self) -> None:
        """Close Gemini Live API connection."""

        async def stop_session() -> None:
            if not self._live_session_context_manager:
                return

            await self._live_session_context_manager.__aexit__(None, None, None)

        async def stop_connection() -> None:
            self._connection_id = None

        await stop_all(stop_session, stop_connection)

    def _build_live_config(
        self, system_prompt: str | None = None, tools: list[ToolSpec] | None = None, **kwargs: Any
    ) -> dict[str, Any]:
        """Build LiveConnectConfig for the official SDK.

        Simply passes through all config parameters from provider_config, allowing users
        to configure any Gemini Live API parameter directly.
        """
        config_dict: dict[str, Any] = self.config["inference"].copy()

        config_dict["session_resumption"] = {"handle": kwargs.get("live_session_handle")}

        # Add system instruction if provided
        if system_prompt:
            config_dict["system_instruction"] = system_prompt

        # Add tools if provided
        if tools:
            config_dict["tools"] = self._format_tools_for_live_api(tools)

        if "voice" in self.config["audio"]:
            config_dict.setdefault("speech_config", {}).setdefault("voice_config", {}).setdefault(
                "prebuilt_voice_config", {}
            )["voice_name"] = self.config["audio"]["voice"]

        return config_dict

    def _format_tools_for_live_api(self, tool_specs: list[ToolSpec]) -> list[genai_types.Tool]:
        """Format tool specs for Gemini Live API."""
        if not tool_specs:
            return []

        return [
            genai_types.Tool(
                function_declarations=[
                    genai_types.FunctionDeclaration(
                        description=tool_spec["description"],
                        name=tool_spec["name"],
                        parameters_json_schema=tool_spec["inputSchema"]["json"],
                    )
                    for tool_spec in tool_specs
                ],
            ),
        ]

`init(model_id='gemini-2.5-flash-native-audio-preview-09-2025', provider_config=None, client_config=None, **kwargs)` ¶

Initialize Gemini Live API bidirectional model.

Parameters:

Name	Type	Description	Default
`model_id`	`str`	Model identifier (default: gemini-2.5-flash-native-audio-preview-09-2025)	`'gemini-2.5-flash-native-audio-preview-09-2025'`
`provider_config`	`dict[str, Any] \| None`	Model behavior (audio, inference)	`None`
`client_config`	`dict[str, Any] \| None`	Authentication (api_key, http_options)	`None`
`**kwargs`	`Any`	Reserved for future parameters.	`{}`

Source code in strands/experimental/bidi/models/gemini_live.py

def __init__(
    self,
    model_id: str = "gemini-2.5-flash-native-audio-preview-09-2025",
    provider_config: dict[str, Any] | None = None,
    client_config: dict[str, Any] | None = None,
    **kwargs: Any,
):
    """Initialize Gemini Live API bidirectional model.

    Args:
        model_id: Model identifier (default: gemini-2.5-flash-native-audio-preview-09-2025)
        provider_config: Model behavior (audio, inference)
        client_config: Authentication (api_key, http_options)
        **kwargs: Reserved for future parameters.

    """
    # Store model ID
    self.model_id = model_id

    # Resolve client config with defaults
    self._client_config = self._resolve_client_config(client_config or {})

    # Resolve provider config with defaults
    self.config = self._resolve_provider_config(provider_config or {})

    # Store API key for later use
    self.api_key = self._client_config.get("api_key")

    # Create Gemini client
    self._client = genai.Client(**self._client_config)

    # Connection state (initialized in start())
    self._live_session: Any = None
    self._live_session_context_manager: Any = None
    self._live_session_handle: str | None = None
    self._connection_id: str | None = None

`receive()` `async` ¶

Receive Gemini Live API events and convert to provider-agnostic format.

Source code in strands/experimental/bidi/models/gemini_live.py

async def receive(self) -> AsyncGenerator[BidiOutputEvent, None]:
    """Receive Gemini Live API events and convert to provider-agnostic format."""
    if not self._connection_id:
        raise RuntimeError("model not started | call start before receiving")

    yield BidiConnectionStartEvent(connection_id=self._connection_id, model=self.model_id)

    # Wrap in while loop to restart after turn_complete (SDK limitation workaround)
    while True:
        async for message in self._live_session.receive():
            for event in self._convert_gemini_live_event(message):
                yield event

`send(content)` `async` ¶

Unified send method for all content types. Sends the given inputs to Google Live API.

Dispatches to appropriate internal handler based on content type.

Parameters:

Name	Type	Description	Default
`content`	`BidiInputEvent \| ToolResultEvent`	Typed event (BidiTextInputEvent, BidiAudioInputEvent, BidiImageInputEvent, or ToolResultEvent).	required

Raises:

Type	Description
`ValueError`	If content type not supported (e.g., image content).

Source code in strands/experimental/bidi/models/gemini_live.py

async def send(
    self,
    content: BidiInputEvent | ToolResultEvent,
) -> None:
    """Unified send method for all content types. Sends the given inputs to Google Live API.

    Dispatches to appropriate internal handler based on content type.

    Args:
        content: Typed event (BidiTextInputEvent, BidiAudioInputEvent, BidiImageInputEvent, or ToolResultEvent).

    Raises:
        ValueError: If content type not supported (e.g., image content).
    """
    if not self._connection_id:
        raise RuntimeError("model not started | call start before sending/receiving")

    if isinstance(content, BidiTextInputEvent):
        await self._send_text_content(content.text)
    elif isinstance(content, BidiAudioInputEvent):
        await self._send_audio_content(content)
    elif isinstance(content, BidiImageInputEvent):
        await self._send_image_content(content)
    elif isinstance(content, ToolResultEvent):
        tool_result = content.get("tool_result")
        if tool_result:
            await self._send_tool_result(tool_result)
    else:
        raise ValueError(f"content_type={type(content)} | content not supported")

`start(system_prompt=None, tools=None, messages=None, **kwargs)` `async` ¶

Establish bidirectional connection with Gemini Live API.

Parameters:

Name	Type	Description	Default
`system_prompt`	`str \| None`	System instructions for the model.	`None`
`tools`	`list[ToolSpec] \| None`	List of tools available to the model.	`None`
`messages`	`Messages \| None`	Conversation history to initialize with.	`None`
`**kwargs`	`Any`	Additional configuration options.	`{}`

Source code in strands/experimental/bidi/models/gemini_live.py

async def start(
    self,
    system_prompt: str | None = None,
    tools: list[ToolSpec] | None = None,
    messages: Messages | None = None,
    **kwargs: Any,
) -> None:
    """Establish bidirectional connection with Gemini Live API.

    Args:
        system_prompt: System instructions for the model.
        tools: List of tools available to the model.
        messages: Conversation history to initialize with.
        **kwargs: Additional configuration options.
    """
    if self._connection_id:
        raise RuntimeError("model already started | call stop before starting again")

    self._connection_id = str(uuid.uuid4())

    # Build live config
    live_config = self._build_live_config(system_prompt, tools, **kwargs)

    # Create the context manager and session
    self._live_session_context_manager = self._client.aio.live.connect(
        model=self.model_id, config=cast(LiveConnectConfigOrDict, live_config)
    )
    self._live_session = await self._live_session_context_manager.__aenter__()

    # Gemini itself restores message history when resuming from session
    if messages and "live_session_handle" not in kwargs:
        await self._send_message_history(messages)

`stop()` `async` ¶

Close Gemini Live API connection.

Source code in strands/experimental/bidi/models/gemini_live.py

async def stop(self) -> None:
    """Close Gemini Live API connection."""

    async def stop_session() -> None:
        if not self._live_session_context_manager:
            return

        await self._live_session_context_manager.__aexit__(None, None, None)

    async def stop_connection() -> None:
        self._connection_id = None

    await stop_all(stop_session, stop_connection)

`BidiImageInputEvent` ¶

Bases: TypedEvent

Image input event for sending images/video frames to the model.

Used for sending image data through the send() method.

Parameters:

Name	Type	Description	Default
`image`	`str`	Base64-encoded image string.	required
`mime_type`	`str`	MIME type (e.g., "image/jpeg", "image/png").	required

Source code in strands/experimental/bidi/types/events.py

class BidiImageInputEvent(TypedEvent):
    """Image input event for sending images/video frames to the model.

    Used for sending image data through the send() method.

    Parameters:
        image: Base64-encoded image string.
        mime_type: MIME type (e.g., "image/jpeg", "image/png").
    """

    def __init__(
        self,
        image: str,
        mime_type: str,
    ):
        """Initialize image input event."""
        super().__init__(
            {
                "type": "bidi_image_input",
                "image": image,
                "mime_type": mime_type,
            }
        )

    @property
    def image(self) -> str:
        """Base64-encoded image string."""
        return cast(str, self["image"])

    @property
    def mime_type(self) -> str:
        """MIME type of the image (e.g., "image/jpeg", "image/png")."""
        return cast(str, self["mime_type"])

`image` `property` ¶

Base64-encoded image string.

`mime_type` `property` ¶

MIME type of the image (e.g., "image/jpeg", "image/png").

`init(image, mime_type)` ¶

Initialize image input event.

Source code in strands/experimental/bidi/types/events.py

def __init__(
    self,
    image: str,
    mime_type: str,
):
    """Initialize image input event."""
    super().__init__(
        {
            "type": "bidi_image_input",
            "image": image,
            "mime_type": mime_type,
        }
    )

`BidiInterruptionEvent` ¶

Bases: TypedEvent

Model generation was interrupted.

Parameters:

Name	Type	Description	Default
`reason`	`Literal['user_speech', 'error']`	Why the interruption occurred.	required

Source code in strands/experimental/bidi/types/events.py

class BidiInterruptionEvent(TypedEvent):
    """Model generation was interrupted.

    Parameters:
        reason: Why the interruption occurred.
    """

    def __init__(self, reason: Literal["user_speech", "error"]):
        """Initialize interruption event."""
        super().__init__(
            {
                "type": "bidi_interruption",
                "reason": reason,
            }
        )

    @property
    def reason(self) -> str:
        """Why the interruption occurred."""
        return cast(str, self["reason"])

`reason` `property` ¶

Why the interruption occurred.

`init(reason)` ¶

Initialize interruption event.

Source code in strands/experimental/bidi/types/events.py

def __init__(self, reason: Literal["user_speech", "error"]):
    """Initialize interruption event."""
    super().__init__(
        {
            "type": "bidi_interruption",
            "reason": reason,
        }
    )

`BidiModel` ¶

Bases: Protocol

Protocol for bidirectional streaming models.

This interface defines the contract for models that support persistent streaming connections with real-time audio and text communication. Implementations handle provider-specific protocols while exposing a standardized event-based API.

Attributes:

Name	Type	Description
`config`	`dict[str, Any]`	Configuration dictionary with provider-specific settings.

Source code in strands/experimental/bidi/models/model.py

@runtime_checkable
class BidiModel(Protocol):
    """Protocol for bidirectional streaming models.

    This interface defines the contract for models that support persistent streaming
    connections with real-time audio and text communication. Implementations handle
    provider-specific protocols while exposing a standardized event-based API.

    Attributes:
        config: Configuration dictionary with provider-specific settings.
    """

    config: dict[str, Any]

    async def start(
        self,
        system_prompt: str | None = None,
        tools: list[ToolSpec] | None = None,
        messages: Messages | None = None,
        **kwargs: Any,
    ) -> None:
        """Establish a persistent streaming connection with the model.

        Opens a bidirectional connection that remains active for real-time communication.
        The connection supports concurrent sending and receiving of events until explicitly
        closed. Must be called before any send() or receive() operations.

        Args:
            system_prompt: System instructions to configure model behavior.
            tools: Tool specifications that the model can invoke during the conversation.
            messages: Initial conversation history to provide context.
            **kwargs: Provider-specific configuration options.
        """
        ...

    async def stop(self) -> None:
        """Close the streaming connection and release resources.

        Terminates the active bidirectional connection and cleans up any associated
        resources such as network connections, buffers, or background tasks. After
        calling close(), the model instance cannot be used until start() is called again.
        """
        ...

    def receive(self) -> AsyncIterable[BidiOutputEvent]:
        """Receive streaming events from the model.

        Continuously yields events from the model as they arrive over the connection.
        Events are normalized to a provider-agnostic format for uniform processing.
        This method should be called in a loop or async task to process model responses.

        The stream continues until the connection is closed or an error occurs.

        Yields:
            BidiOutputEvent: Standardized event objects containing audio output,
                transcripts, tool calls, or control signals.
        """
        ...

    async def send(
        self,
        content: BidiInputEvent | ToolResultEvent,
    ) -> None:
        """Send content to the model over the active connection.

        Transmits user input or tool results to the model during an active streaming
        session. Supports multiple content types including text, audio, images, and
        tool execution results. Can be called multiple times during a conversation.

        Args:
            content: The content to send. Must be one of:

                - BidiTextInputEvent: Text message from the user
                - BidiAudioInputEvent: Audio data for speech input
                - BidiImageInputEvent: Image data for visual understanding
                - ToolResultEvent: Result from a tool execution

        Example:
            ```
            await model.send(BidiTextInputEvent(text="Hello", role="user"))
            await model.send(BidiAudioInputEvent(audio=bytes, format="pcm", sample_rate=16000, channels=1))
            await model.send(BidiImageInputEvent(image=bytes, mime_type="image/jpeg", encoding="raw"))
            await model.send(ToolResultEvent(tool_result))
            ```
        """
        ...

`receive()` ¶

Receive streaming events from the model.

Continuously yields events from the model as they arrive over the connection. Events are normalized to a provider-agnostic format for uniform processing. This method should be called in a loop or async task to process model responses.

The stream continues until the connection is closed or an error occurs.

Yields:

Name	Type	Description
`BidiOutputEvent`	`AsyncIterable[BidiOutputEvent]`	Standardized event objects containing audio output, transcripts, tool calls, or control signals.

Source code in strands/experimental/bidi/models/model.py

def receive(self) -> AsyncIterable[BidiOutputEvent]:
    """Receive streaming events from the model.

    Continuously yields events from the model as they arrive over the connection.
    Events are normalized to a provider-agnostic format for uniform processing.
    This method should be called in a loop or async task to process model responses.

    The stream continues until the connection is closed or an error occurs.

    Yields:
        BidiOutputEvent: Standardized event objects containing audio output,
            transcripts, tool calls, or control signals.
    """
    ...

`send(content)` `async` ¶

Send content to the model over the active connection.

Transmits user input or tool results to the model during an active streaming session. Supports multiple content types including text, audio, images, and tool execution results. Can be called multiple times during a conversation.

Parameters:

Name	Type	Description	Default
`content`	`BidiInputEvent \| ToolResultEvent`	The content to send. Must be one of: BidiTextInputEvent: Text message from the user BidiAudioInputEvent: Audio data for speech input BidiImageInputEvent: Image data for visual understanding ToolResultEvent: Result from a tool execution	required

Example

await model.send(BidiTextInputEvent(text="Hello", role="user"))
await model.send(BidiAudioInputEvent(audio=bytes, format="pcm", sample_rate=16000, channels=1))
await model.send(BidiImageInputEvent(image=bytes, mime_type="image/jpeg", encoding="raw"))
await model.send(ToolResultEvent(tool_result))

Source code in strands/experimental/bidi/models/model.py

async def send(
    self,
    content: BidiInputEvent | ToolResultEvent,
) -> None:
    """Send content to the model over the active connection.

    Transmits user input or tool results to the model during an active streaming
    session. Supports multiple content types including text, audio, images, and
    tool execution results. Can be called multiple times during a conversation.

    Args:
        content: The content to send. Must be one of:

            - BidiTextInputEvent: Text message from the user
            - BidiAudioInputEvent: Audio data for speech input
            - BidiImageInputEvent: Image data for visual understanding
            - ToolResultEvent: Result from a tool execution

    Example:
        ```
        await model.send(BidiTextInputEvent(text="Hello", role="user"))
        await model.send(BidiAudioInputEvent(audio=bytes, format="pcm", sample_rate=16000, channels=1))
        await model.send(BidiImageInputEvent(image=bytes, mime_type="image/jpeg", encoding="raw"))
        await model.send(ToolResultEvent(tool_result))
        ```
    """
    ...

`start(system_prompt=None, tools=None, messages=None, **kwargs)` `async` ¶

Establish a persistent streaming connection with the model.

Opens a bidirectional connection that remains active for real-time communication. The connection supports concurrent sending and receiving of events until explicitly closed. Must be called before any send() or receive() operations.

Parameters:

Name	Type	Description	Default
`system_prompt`	`str \| None`	System instructions to configure model behavior.	`None`
`tools`	`list[ToolSpec] \| None`	Tool specifications that the model can invoke during the conversation.	`None`
`messages`	`Messages \| None`	Initial conversation history to provide context.	`None`
`**kwargs`	`Any`	Provider-specific configuration options.	`{}`

Source code in strands/experimental/bidi/models/model.py

async def start(
    self,
    system_prompt: str | None = None,
    tools: list[ToolSpec] | None = None,
    messages: Messages | None = None,
    **kwargs: Any,
) -> None:
    """Establish a persistent streaming connection with the model.

    Opens a bidirectional connection that remains active for real-time communication.
    The connection supports concurrent sending and receiving of events until explicitly
    closed. Must be called before any send() or receive() operations.

    Args:
        system_prompt: System instructions to configure model behavior.
        tools: Tool specifications that the model can invoke during the conversation.
        messages: Initial conversation history to provide context.
        **kwargs: Provider-specific configuration options.
    """
    ...

`stop()` `async` ¶

Close the streaming connection and release resources.

Terminates the active bidirectional connection and cleans up any associated resources such as network connections, buffers, or background tasks. After calling close(), the model instance cannot be used until start() is called again.

Source code in strands/experimental/bidi/models/model.py

async def stop(self) -> None:
    """Close the streaming connection and release resources.

    Terminates the active bidirectional connection and cleans up any associated
    resources such as network connections, buffers, or background tasks. After
    calling close(), the model instance cannot be used until start() is called again.
    """
    ...

`BidiModelTimeoutError` ¶

Bases: Exception

Model timeout error.

Bidirectional models are often configured with a connection time limit. Nova sonic for example keeps the connection open for 8 minutes max. Upon receiving a timeout, the agent loop is configured to restart the model connection so as to create a seamless, uninterrupted experience for the user.

Source code in strands/experimental/bidi/models/model.py

class BidiModelTimeoutError(Exception):
    """Model timeout error.

    Bidirectional models are often configured with a connection time limit. Nova sonic for example keeps the connection
    open for 8 minutes max. Upon receiving a timeout, the agent loop is configured to restart the model connection so as
    to create a seamless, uninterrupted experience for the user.
    """

    def __init__(self, message: str, **restart_config: Any) -> None:
        """Initialize error.

        Args:
            message: Timeout message from model.
            **restart_config: Configure restart specific behaviors in the call to model start.
        """
        super().__init__(self, message)

        self.restart_config = restart_config

`init(message, **restart_config)` ¶

Initialize error.

Parameters:

Name	Type	Description	Default
`message`	`str`	Timeout message from model.	required
`**restart_config`	`Any`	Configure restart specific behaviors in the call to model start.	`{}`

Source code in strands/experimental/bidi/models/model.py

def __init__(self, message: str, **restart_config: Any) -> None:
    """Initialize error.

    Args:
        message: Timeout message from model.
        **restart_config: Configure restart specific behaviors in the call to model start.
    """
    super().__init__(self, message)

    self.restart_config = restart_config

`BidiTextInputEvent` ¶

Bases: TypedEvent

Text input event for sending text to the model.

Used for sending text content through the send() method.

Parameters:

Name	Type	Description	Default
`text`	`str`	The text content to send to the model.	required
`role`	`Role`	The role of the message sender (default: "user").	`'user'`

Source code in strands/experimental/bidi/types/events.py

class BidiTextInputEvent(TypedEvent):
    """Text input event for sending text to the model.

    Used for sending text content through the send() method.

    Parameters:
        text: The text content to send to the model.
        role: The role of the message sender (default: "user").
    """

    def __init__(self, text: str, role: Role = "user"):
        """Initialize text input event."""
        super().__init__(
            {
                "type": "bidi_text_input",
                "text": text,
                "role": role,
            }
        )

    @property
    def text(self) -> str:
        """The text content to send to the model."""
        return cast(str, self["text"])

    @property
    def role(self) -> Role:
        """The role of the message sender."""
        return cast(Role, self["role"])

`role` `property` ¶

The role of the message sender.

`text` `property` ¶

The text content to send to the model.

`init(text, role='user')` ¶

Initialize text input event.

Source code in strands/experimental/bidi/types/events.py

def __init__(self, text: str, role: Role = "user"):
    """Initialize text input event."""
    super().__init__(
        {
            "type": "bidi_text_input",
            "text": text,
            "role": role,
        }
    )

`BidiTranscriptStreamEvent` ¶

Bases: ModelStreamEvent

Audio transcription streaming (user or assistant speech).

Supports incremental transcript updates for providers that send partial transcripts before the final version.

Parameters:

Name	Type	Description	Default
`delta`	`ContentBlockDelta`	The incremental transcript change (ContentBlockDelta).	required
`text`	`str`	The delta text (same as delta content for convenience).	required
`role`	`Role`	Who is speaking ("user" or "assistant").	required
`is_final`	`bool`	Whether this is the final/complete transcript.	required
`current_transcript`	`str \| None`	The accumulated transcript text so far (None for first delta).	`None`

Source code in strands/experimental/bidi/types/events.py

class BidiTranscriptStreamEvent(ModelStreamEvent):
    """Audio transcription streaming (user or assistant speech).

    Supports incremental transcript updates for providers that send partial
    transcripts before the final version.

    Parameters:
        delta: The incremental transcript change (ContentBlockDelta).
        text: The delta text (same as delta content for convenience).
        role: Who is speaking ("user" or "assistant").
        is_final: Whether this is the final/complete transcript.
        current_transcript: The accumulated transcript text so far (None for first delta).
    """

    def __init__(
        self,
        delta: ContentBlockDelta,
        text: str,
        role: Role,
        is_final: bool,
        current_transcript: str | None = None,
    ):
        """Initialize transcript stream event."""
        super().__init__(
            {
                "type": "bidi_transcript_stream",
                "delta": delta,
                "text": text,
                "role": role,
                "is_final": is_final,
                "current_transcript": current_transcript,
            }
        )

    @property
    def delta(self) -> ContentBlockDelta:
        """The incremental transcript change."""
        return cast(ContentBlockDelta, self["delta"])

    @property
    def text(self) -> str:
        """The text content to send to the model."""
        return cast(str, self["text"])

    @property
    def role(self) -> Role:
        """The role of the message sender."""
        return cast(Role, self["role"])

    @property
    def is_final(self) -> bool:
        """Whether this is the final/complete transcript."""
        return cast(bool, self["is_final"])

    @property
    def current_transcript(self) -> str | None:
        """The accumulated transcript text so far."""
        return cast(str | None, self.get("current_transcript"))

`current_transcript` `property` ¶

The accumulated transcript text so far.

`delta` `property` ¶

The incremental transcript change.

`is_final` `property` ¶

Whether this is the final/complete transcript.

`role` `property` ¶

The role of the message sender.

`text` `property` ¶

The text content to send to the model.

`init(delta, text, role, is_final, current_transcript=None)` ¶

Initialize transcript stream event.

Source code in strands/experimental/bidi/types/events.py

def __init__(
    self,
    delta: ContentBlockDelta,
    text: str,
    role: Role,
    is_final: bool,
    current_transcript: str | None = None,
):
    """Initialize transcript stream event."""
    super().__init__(
        {
            "type": "bidi_transcript_stream",
            "delta": delta,
            "text": text,
            "role": role,
            "is_final": is_final,
            "current_transcript": current_transcript,
        }
    )

`BidiUsageEvent` ¶

Bases: TypedEvent

Token usage event with modality breakdown for bidirectional streaming.

Tracks token consumption across different modalities (audio, text, images) during bidirectional streaming sessions.

Parameters:

Name	Type	Description	Default
`input_tokens`	`int`	Total tokens used for all input modalities.	required
`output_tokens`	`int`	Total tokens used for all output modalities.	required
`total_tokens`	`int`	Sum of input and output tokens.	required
`modality_details`	`list[ModalityUsage] \| None`	Optional list of token usage per modality.	`None`
`cache_read_input_tokens`	`int \| None`	Optional tokens read from cache.	`None`
`cache_write_input_tokens`	`int \| None`	Optional tokens written to cache.	`None`

Source code in strands/experimental/bidi/types/events.py

class BidiUsageEvent(TypedEvent):
    """Token usage event with modality breakdown for bidirectional streaming.

    Tracks token consumption across different modalities (audio, text, images)
    during bidirectional streaming sessions.

    Parameters:
        input_tokens: Total tokens used for all input modalities.
        output_tokens: Total tokens used for all output modalities.
        total_tokens: Sum of input and output tokens.
        modality_details: Optional list of token usage per modality.
        cache_read_input_tokens: Optional tokens read from cache.
        cache_write_input_tokens: Optional tokens written to cache.
    """

    def __init__(
        self,
        input_tokens: int,
        output_tokens: int,
        total_tokens: int,
        modality_details: list[ModalityUsage] | None = None,
        cache_read_input_tokens: int | None = None,
        cache_write_input_tokens: int | None = None,
    ):
        """Initialize usage event."""
        data: dict[str, Any] = {
            "type": "bidi_usage",
            "inputTokens": input_tokens,
            "outputTokens": output_tokens,
            "totalTokens": total_tokens,
        }
        if modality_details is not None:
            data["modality_details"] = modality_details
        if cache_read_input_tokens is not None:
            data["cacheReadInputTokens"] = cache_read_input_tokens
        if cache_write_input_tokens is not None:
            data["cacheWriteInputTokens"] = cache_write_input_tokens
        super().__init__(data)

    @property
    def input_tokens(self) -> int:
        """Total tokens used for all input modalities."""
        return cast(int, self["inputTokens"])

    @property
    def output_tokens(self) -> int:
        """Total tokens used for all output modalities."""
        return cast(int, self["outputTokens"])

    @property
    def total_tokens(self) -> int:
        """Sum of input and output tokens."""
        return cast(int, self["totalTokens"])

    @property
    def modality_details(self) -> list[ModalityUsage]:
        """Optional list of token usage per modality."""
        return cast(list[ModalityUsage], self.get("modality_details", []))

    @property
    def cache_read_input_tokens(self) -> int | None:
        """Optional tokens read from cache."""
        return cast(int | None, self.get("cacheReadInputTokens"))

    @property
    def cache_write_input_tokens(self) -> int | None:
        """Optional tokens written to cache."""
        return cast(int | None, self.get("cacheWriteInputTokens"))

`cache_read_input_tokens` `property` ¶

Optional tokens read from cache.

`cache_write_input_tokens` `property` ¶

Optional tokens written to cache.

`input_tokens` `property` ¶

Total tokens used for all input modalities.

`modality_details` `property` ¶

Optional list of token usage per modality.

`output_tokens` `property` ¶

Total tokens used for all output modalities.

`total_tokens` `property` ¶

Sum of input and output tokens.

`init(input_tokens, output_tokens, total_tokens, modality_details=None, cache_read_input_tokens=None, cache_write_input_tokens=None)` ¶

Initialize usage event.

Source code in strands/experimental/bidi/types/events.py

def __init__(
    self,
    input_tokens: int,
    output_tokens: int,
    total_tokens: int,
    modality_details: list[ModalityUsage] | None = None,
    cache_read_input_tokens: int | None = None,
    cache_write_input_tokens: int | None = None,
):
    """Initialize usage event."""
    data: dict[str, Any] = {
        "type": "bidi_usage",
        "inputTokens": input_tokens,
        "outputTokens": output_tokens,
        "totalTokens": total_tokens,
    }
    if modality_details is not None:
        data["modality_details"] = modality_details
    if cache_read_input_tokens is not None:
        data["cacheReadInputTokens"] = cache_read_input_tokens
    if cache_write_input_tokens is not None:
        data["cacheWriteInputTokens"] = cache_write_input_tokens
    super().__init__(data)

`ModalityUsage` ¶

Bases: dict

Token usage for a specific modality.

Attributes:

Name	Type	Description
`modality`	`Literal['text', 'audio', 'image', 'cached']`	Type of content.
`input_tokens`	`int`	Tokens used for this modality's input.
`output_tokens`	`int`	Tokens used for this modality's output.

Source code in strands/experimental/bidi/types/events.py

class ModalityUsage(dict):
    """Token usage for a specific modality.

    Attributes:
        modality: Type of content.
        input_tokens: Tokens used for this modality's input.
        output_tokens: Tokens used for this modality's output.
    """

    modality: Literal["text", "audio", "image", "cached"]
    input_tokens: int
    output_tokens: int

`ToolResult` ¶

Bases: TypedDict

Result of a tool execution.

Attributes:

Name	Type	Description
`content`	`list[ToolResultContent]`	List of result content returned by the tool.
`status`	`ToolResultStatus`	The status of the tool execution ("success" or "error").
`toolUseId`	`str`	The unique identifier of the tool use request that produced this result.

Source code in strands/types/tools.py

class ToolResult(TypedDict):
    """Result of a tool execution.

    Attributes:
        content: List of result content returned by the tool.
        status: The status of the tool execution ("success" or "error").
        toolUseId: The unique identifier of the tool use request that produced this result.
    """

    content: list[ToolResultContent]
    status: ToolResultStatus
    toolUseId: str

`ToolResultEvent` ¶

Bases: TypedEvent

Event emitted when a tool execution completes.

Source code in strands/types/_events.py

class ToolResultEvent(TypedEvent):
    """Event emitted when a tool execution completes."""

    def __init__(self, tool_result: ToolResult, exception: Exception | None = None) -> None:
        """Initialize tool result event."""
        super().__init__({"type": "tool_result", "tool_result": tool_result})
        self._exception = exception

    @property
    def exception(self) -> Exception | None:
        """The original exception that occurred, if any.

        Can be used for re-raising or type-based error handling.
        """
        return self._exception

    @property
    def tool_use_id(self) -> str:
        """The toolUseId associated with this result."""
        return cast(ToolResult, self.get("tool_result"))["toolUseId"]

    @property
    def tool_result(self) -> ToolResult:
        """Final result from the completed tool execution."""
        return cast(ToolResult, self.get("tool_result"))

    @property
    @override
    def is_callback_event(self) -> bool:
        return False

`exception` `property` ¶

The original exception that occurred, if any.

Can be used for re-raising or type-based error handling.

`tool_result` `property` ¶

Final result from the completed tool execution.

`tool_use_id` `property` ¶

The toolUseId associated with this result.

`init(tool_result, exception=None)` ¶

Initialize tool result event.

Source code in strands/types/_events.py

def __init__(self, tool_result: ToolResult, exception: Exception | None = None) -> None:
    """Initialize tool result event."""
    super().__init__({"type": "tool_result", "tool_result": tool_result})
    self._exception = exception

`ToolSpec` ¶

Bases: TypedDict

Specification for a tool that can be used by an agent.

Attributes:

Name	Type	Description
`description`	`str`	A human-readable description of what the tool does.
`inputSchema`	`JSONSchema`	JSON Schema defining the expected input parameters.
`name`	`str`	The unique name of the tool.
`outputSchema`	`NotRequired[JSONSchema]`	Optional JSON Schema defining the expected output format. Note: Not all model providers support this field. Providers that don't support it should filter it out before sending to their API.

Source code in strands/types/tools.py

class ToolSpec(TypedDict):
    """Specification for a tool that can be used by an agent.

    Attributes:
        description: A human-readable description of what the tool does.
        inputSchema: JSON Schema defining the expected input parameters.
        name: The unique name of the tool.
        outputSchema: Optional JSON Schema defining the expected output format.
            Note: Not all model providers support this field. Providers that don't
            support it should filter it out before sending to their API.
    """

    description: str
    inputSchema: JSONSchema
    name: str
    outputSchema: NotRequired[JSONSchema]

`ToolUse` ¶

Bases: TypedDict

A request from the model to use a specific tool with the provided input.

Attributes:

Name	Type	Description
`input`	`Any`	The input parameters for the tool. Can be any JSON-serializable type.
`name`	`str`	The name of the tool to invoke.
`toolUseId`	`str`	A unique identifier for this specific tool use request.
`reasoningSignature`	`NotRequired[str]`	Token that ties the model's reasoning to this tool call.

Source code in strands/types/tools.py

class ToolUse(TypedDict):
    """A request from the model to use a specific tool with the provided input.

    Attributes:
        input: The input parameters for the tool.
            Can be any JSON-serializable type.
        name: The name of the tool to invoke.
        toolUseId: A unique identifier for this specific tool use request.
        reasoningSignature: Token that ties the model's reasoning to this tool call.
    """

    input: Any
    name: str
    toolUseId: str
    reasoningSignature: NotRequired[str]

`ToolUseStreamEvent` ¶

Bases: ModelStreamEvent

Event emitted during tool use input streaming.

Source code in strands/types/_events.py

class ToolUseStreamEvent(ModelStreamEvent):
    """Event emitted during tool use input streaming."""

    def __init__(self, delta: ContentBlockDelta, current_tool_use: dict[str, Any]) -> None:
        """Initialize with delta and current tool use state."""
        super().__init__({"type": "tool_use_stream", "delta": delta, "current_tool_use": current_tool_use})

`init(delta, current_tool_use)` ¶

Initialize with delta and current tool use state.

Source code in strands/types/_events.py

def __init__(self, delta: ContentBlockDelta, current_tool_use: dict[str, Any]) -> None:
    """Initialize with delta and current tool use state."""
    super().__init__({"type": "tool_use_stream", "delta": delta, "current_tool_use": current_tool_use})

`stop_all(*funcs)` `async` ¶

Call all stops in sequence and aggregate errors.

A failure in one stop call will not block subsequent stop calls.

Parameters:

Name	Type	Description	Default
`funcs`	`Callable[..., Awaitable[None]]`	Stop functions to call in sequence.	`()`

Raises:

Type	Description
`RuntimeError`	If any stop function raises an exception.

Source code in strands/experimental/bidi/_async/__init__.py

async def stop_all(*funcs: Callable[..., Awaitable[None]]) -> None:
    """Call all stops in sequence and aggregate errors.

    A failure in one stop call will not block subsequent stop calls.

    Args:
        funcs: Stop functions to call in sequence.

    Raises:
        RuntimeError: If any stop function raises an exception.
    """
    exceptions = []
    for func in funcs:
        try:
            await func()
        except Exception as exception:
            exceptions.append({"func_name": func.__name__, "exception": repr(exception)})

    if exceptions:
        raise RuntimeError(f"exceptions={exceptions} | failed stop sequence")

strands.experimental.bidi.models.gemini_live ¶

AudioChannel = Literal[1, 2] module-attribute ¶

AudioSampleRate = Literal[16000, 24000, 48000] module-attribute ¶

BidiInputEvent = BidiTextInputEvent | BidiAudioInputEvent | BidiImageInputEvent module-attribute ¶

GEMINI_CHANNELS = 1 module-attribute ¶

GEMINI_INPUT_SAMPLE_RATE = 16000 module-attribute ¶

GEMINI_OUTPUT_SAMPLE_RATE = 24000 module-attribute ¶

Messages = list[Message] module-attribute ¶

logger = logging.getLogger(__name__) module-attribute ¶

AudioConfig ¶

BidiAudioInputEvent ¶

audio property ¶

channels property ¶

format property ¶

sample_rate property ¶

__init__(audio, format, sample_rate, channels) ¶

BidiAudioStreamEvent ¶

audio property ¶

channels property ¶

format property ¶

sample_rate property ¶

__init__(audio, format, sample_rate, channels) ¶

BidiConnectionStartEvent ¶

connection_id property ¶

model property ¶

__init__(connection_id, model) ¶

BidiGeminiLiveModel ¶

__init__(model_id='gemini-2.5-flash-native-audio-preview-09-2025', provider_config=None, client_config=None, **kwargs) ¶

receive() async ¶

send(content) async ¶

start(system_prompt=None, tools=None, messages=None, **kwargs) async ¶

stop() async ¶

BidiImageInputEvent ¶

image property ¶

mime_type property ¶

__init__(image, mime_type) ¶

BidiInterruptionEvent ¶

reason property ¶

__init__(reason) ¶

BidiModel ¶

receive() ¶

send(content) async ¶

start(system_prompt=None, tools=None, messages=None, **kwargs) async ¶

stop() async ¶

BidiModelTimeoutError ¶

__init__(message, **restart_config) ¶

BidiTextInputEvent ¶

role property ¶

text property ¶

__init__(text, role='user') ¶

BidiTranscriptStreamEvent ¶

current_transcript property ¶

delta property ¶

is_final property ¶

role property ¶

text property ¶

__init__(delta, text, role, is_final, current_transcript=None) ¶

BidiUsageEvent ¶

cache_read_input_tokens property ¶

cache_write_input_tokens property ¶

input_tokens property ¶

modality_details property ¶

output_tokens property ¶

total_tokens property ¶

__init__(input_tokens, output_tokens, total_tokens, modality_details=None, cache_read_input_tokens=None, cache_write_input_tokens=None) ¶

ModalityUsage ¶

ToolResult ¶

ToolResultEvent ¶

exception property ¶

tool_result property ¶

tool_use_id property ¶

__init__(tool_result, exception=None) ¶

ToolSpec ¶

ToolUse ¶

ToolUseStreamEvent ¶

__init__(delta, current_tool_use) ¶

stop_all(*funcs) async ¶

`strands.experimental.bidi.models.gemini_live` ¶

`AudioChannel = Literal[1, 2]` `module-attribute` ¶

`AudioSampleRate = Literal[16000, 24000, 48000]` `module-attribute` ¶

`BidiInputEvent = BidiTextInputEvent | BidiAudioInputEvent | BidiImageInputEvent` `module-attribute` ¶

`GEMINI_CHANNELS = 1` `module-attribute` ¶

`GEMINI_INPUT_SAMPLE_RATE = 16000` `module-attribute` ¶

`GEMINI_OUTPUT_SAMPLE_RATE = 24000` `module-attribute` ¶

`Messages = list[Message]` `module-attribute` ¶

`logger = logging.getLogger(name)` `module-attribute` ¶

`AudioConfig` ¶

`BidiAudioInputEvent` ¶

`audio` `property` ¶

`channels` `property` ¶

`format` `property` ¶

`sample_rate` `property` ¶

`init(audio, format, sample_rate, channels)` ¶

`BidiAudioStreamEvent` ¶

`audio` `property` ¶

`channels` `property` ¶

`format` `property` ¶

`sample_rate` `property` ¶

`init(audio, format, sample_rate, channels)` ¶

`BidiConnectionStartEvent` ¶

`connection_id` `property` ¶

`model` `property` ¶

`init(connection_id, model)` ¶

`BidiGeminiLiveModel` ¶

`init(model_id='gemini-2.5-flash-native-audio-preview-09-2025', provider_config=None, client_config=None, **kwargs)` ¶

`receive()` `async` ¶

`send(content)` `async` ¶

`start(system_prompt=None, tools=None, messages=None, **kwargs)` `async` ¶

`stop()` `async` ¶

`BidiImageInputEvent` ¶

`image` `property` ¶

`mime_type` `property` ¶

`init(image, mime_type)` ¶

`BidiInterruptionEvent` ¶

`reason` `property` ¶

`init(reason)` ¶

`BidiModel` ¶

`receive()` ¶

`send(content)` `async` ¶

`start(system_prompt=None, tools=None, messages=None, **kwargs)` `async` ¶

`stop()` `async` ¶

`BidiModelTimeoutError` ¶

`init(message, **restart_config)` ¶

`BidiTextInputEvent` ¶

`role` `property` ¶

`text` `property` ¶

`init(text, role='user')` ¶

`BidiTranscriptStreamEvent` ¶

`current_transcript` `property` ¶

`delta` `property` ¶

`is_final` `property` ¶

`role` `property` ¶

`text` `property` ¶

`init(delta, text, role, is_final, current_transcript=None)` ¶

`BidiUsageEvent` ¶

`cache_read_input_tokens` `property` ¶

`cache_write_input_tokens` `property` ¶

`input_tokens` `property` ¶

`modality_details` `property` ¶

`output_tokens` `property` ¶

`total_tokens` `property` ¶

`init(input_tokens, output_tokens, total_tokens, modality_details=None, cache_read_input_tokens=None, cache_write_input_tokens=None)` ¶

`ModalityUsage` ¶

`ToolResult` ¶

`ToolResultEvent` ¶

`exception` `property` ¶

`tool_result` `property` ¶

`tool_use_id` `property` ¶

`init(tool_result, exception=None)` ¶

`ToolSpec` ¶

`ToolUse` ¶

`ToolUseStreamEvent` ¶

`init(delta, current_tool_use)` ¶

`stop_all(*funcs)` `async` ¶