Nova Sonic [Experimental]¶

Experimental Feature

This feature is experimental and may change in future versions. Use with caution in production environments.

Amazon Nova Sonic provides real-time conversational interactions through bidirectional audio streaming. It processes and responds to speech as it occurs, enabling natural, human-like conversational experiences. Key capabilities include:

Adaptive speech response that dynamically adjusts delivery based on the prosody of the input speech.
Graceful handling of user interruptions without dropping conversational context.
Function calling and agentic workflow support for building complex AI applications.
Robustness to background noise for real-world deployment scenarios.
Multilingual support with expressive voices and speaking styles. Both masculine-sounding and feminine-sounding voices are offered in five languages: English (US, UK), French, Italian, German, and Spanish.
Recognition of varied speaking styles across all supported languages.

Usage¶

import asyncio

from strands.experimental.bidi import BidiAgent
from strands.experimental.bidi.io import BidiAudioIO, BidiTextIO
from strands.experimental.bidi.models import BidiNovaSonicModel
from strands.experimental.bidi.tools import stop_conversation

from strands_tools import calculator


async def main() -> None:
    model = BidiNovaSonicModel(
        model_id="amazon.nova-sonic-v1:0",
        provider_config={
            "audio": {
                "voice": "tiffany",
            },
        },
        client_config={"region": "us-east-1"},  # only available in us-east-1, eu-north-1, and ap-northeast-1
    )
    # stop_conversation tool allows user to verbally stop agent execution.
    agent = BidiAgent(model=model, tools=[calculator, stop_conversation])

    audio_io = BidiAudioIO()
    text_io = BidiTextIO()
    await agent.run(inputs=[audio_io.input()], outputs=[audio_io.output(), text_io.output()])


if __name__ == "__main__":
    asyncio.run(main())

Credentials¶

Nova Sonic is only available in us-east-1, eu-north-1, and ap-northeast-1.

Nova Sonic requires AWS credentials for access. Under the hood, BidiNovaSonicModel uses an experimental Bedrock client, which lets you configure credentials in either of the following ways:

Option 1: Environment Variables

export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_SESSION_TOKEN=your_session_token  # If using temporary credentials
export AWS_REGION=your_region_name
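Because misconfigured credentials do not raise an error, it can help to confirm that the variables above are actually set before starting a session. The following is a minimal sketch of such a check using only the standard library; the helper name `missing_aws_env` is hypothetical and not part of the SDK, and it treats `AWS_SESSION_TOKEN` as optional, as noted above.

```python
import os
from typing import Mapping

# Variables from Option 1 that are always needed; AWS_SESSION_TOKEN is only
# required for temporary credentials, so it is not checked here.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def missing_aws_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```

This only checks that the variables are present; it does not verify that the credentials themselves are valid.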

Option 2: Boto3 Session

import boto3
from strands.experimental.bidi.models import BidiNovaSonicModel


boto_session = boto3.Session(
    aws_access_key_id="your_access_key",
    aws_secret_access_key="your_secret_key",
    aws_session_token="your_session_token",  # If using temporary credentials
    region_name="your_region_name",
    profile_name="your_profile"  # Optional: Use a specific profile
)
model = BidiNovaSonicModel(client_config={"boto_session": boto_session})

For more details on this approach, please refer to the boto3 session docs.

Configuration¶

Client Configs¶

Parameter	Description	Default
`boto_session`	A `boto3.Session` instance under which AWS credentials are configured.	`None`
`region`	AWS region to use. Cannot be set together with `boto_session`.	`us-east-1`
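The two client settings are mutually exclusive, and the region must be one of the three where Nova Sonic is available. As a rough sketch, those rules could be checked before constructing the model. The helper `build_client_config` is hypothetical and not part of the SDK; the session key is written `boto_session`, following the Option 2 example above.

```python
from typing import Any, Optional

# Regions listed on this page as supporting Nova Sonic.
SUPPORTED_REGIONS = frozenset({"us-east-1", "eu-north-1", "ap-northeast-1"})
DEFAULT_REGION = "us-east-1"


def build_client_config(
    region: Optional[str] = None,
    boto_session: Any = None,
) -> dict[str, Any]:
    """Build a client_config dict, rejecting combinations the docs disallow."""
    if region is not None and boto_session is not None:
        raise ValueError("Pass either 'region' or 'boto_session', not both.")
    if boto_session is not None:
        # The session carries its own region, so none is added here.
        return {"boto_session": boto_session}
    region = region or DEFAULT_REGION
    if region not in SUPPORTED_REGIONS:
        raise ValueError(
            f"Region {region!r} is not supported; choose one of {sorted(SUPPORTED_REGIONS)}."
        )
    return {"region": region}
```

The result can then be passed as `client_config=build_client_config(region="eu-north-1")` when creating `BidiNovaSonicModel`.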

Provider Configs¶

Parameter	Description	Example	Options
`audio`	`AudioConfig` instance.	`{"voice": "tiffany"}`	reference
`inference`	Fields of the session-start `inferenceConfiguration`, written in snake_case.	`{"top_p": 0.9}`	reference
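To illustrate the snake_case convention, the sketch below shows how `inference` keys would correspond to camelCase `inferenceConfiguration` field names. This is only an illustration of the naming correspondence, not the library's own conversion code. The helper `snake_to_camel` is hypothetical, and the `max_tokens` and `temperature` keys are assumptions based on the Nova Sonic session-start schema; only `top_p` appears in the table above.

```python
def snake_to_camel(name: str) -> str:
    """Convert a snake_case name such as 'top_p' to camelCase ('topP')."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


provider_config = {
    "audio": {"voice": "tiffany"},
    # max_tokens and temperature are assumed field names, shown for illustration.
    "inference": {"top_p": 0.9, "max_tokens": 1024, "temperature": 0.7},
}

# The camelCase form that the snake_case keys above stand for.
inference_configuration = {
    snake_to_camel(key): value
    for key, value in provider_config["inference"].items()
}
```

The same `provider_config` dict can be passed to `BidiNovaSonicModel(provider_config=...)`, as in the Usage example.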

Troubleshooting¶

Hanging¶

When credentials are misconfigured, the model provider does not raise an exception (a quirk of the underlying experimental Bedrock client). As a result, a subsequent call to receive succeeds but emits no events, so the application appears to hang indefinitely.

As a reminder, Nova Sonic is only available in us-east-1, eu-north-1, and ap-northeast-1.