
Nova Sonic [Experimental]

Experimental Feature

This feature is experimental and may change in future versions. Use with caution in production environments.

Amazon Nova Sonic provides real-time conversational interactions through bidirectional audio streaming. It processes and responds to speech as it occurs, enabling natural, human-like conversations. Key capabilities include:

  • Adaptive speech response that dynamically adjusts delivery based on the prosody of the input speech.
  • Graceful handling of user interruptions without dropping conversational context.
  • Function calling and agentic workflow support for building complex AI applications.
  • Robustness to background noise for real-world deployment scenarios.
  • Multilingual support with expressive voices and speaking styles. Both masculine-sounding and feminine-sounding voices are offered in five languages: English (US, UK), French, Italian, German, and Spanish.
  • Recognition of varied speaking styles across all supported languages.

Usage

import asyncio

from strands.experimental.bidi import BidiAgent
from strands.experimental.bidi.io import BidiAudioIO, BidiTextIO
from strands.experimental.bidi.models import BidiNovaSonicModel
from strands.experimental.bidi.tools import stop_conversation

from strands_tools import calculator


async def main() -> None:
    model = BidiNovaSonicModel(
        model_id="amazon.nova-sonic-v1:0",
        provider_config={
            "audio": {
                "voice": "tiffany",
            },
        },
        client_config={"region": "us-east-1"},  # only available in us-east-1, eu-north-1, and ap-northeast-1
    )
    # The stop_conversation tool lets the user stop agent execution verbally.
    agent = BidiAgent(model=model, tools=[calculator, stop_conversation])

    audio_io = BidiAudioIO()
    text_io = BidiTextIO()
    await agent.run(inputs=[audio_io.input()], outputs=[audio_io.output(), text_io.output()])


if __name__ == "__main__":
    asyncio.run(main())

Credentials

Nova Sonic is only available in us-east-1, eu-north-1, and ap-northeast-1.

Nova Sonic requires AWS credentials for access. Under the hood, BidiNovaSonicModel uses an experimental Bedrock client, which allows credentials to be configured in the following ways:

Option 1: Environment Variables

export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_SESSION_TOKEN=your_session_token  # If using temporary credentials
export AWS_REGION=your_region_name
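
Because a misconfigured environment can fail silently (see Troubleshooting), it can help to check the variables before constructing the model. The following is a minimal sketch that applies only when credentials come from Option 1; the helper name `missing_aws_env` is hypothetical and is not part of Strands or boto3.

```python
import os
from typing import List, Mapping

# Variables from Option 1 that must be set. AWS_SESSION_TOKEN is left out
# because it is only needed for temporary credentials.
REQUIRED_AWS_ENV = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def missing_aws_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_AWS_ENV if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_env()
    if missing:
        print("Missing AWS environment variables: " + ", ".join(missing))
```

This check does not validate the credentials themselves; it only catches the case where a variable was never exported.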

Option 2: Boto3 Session

import boto3
from strands.experimental.bidi.models import BidiNovaSonicModel


boto_session = boto3.Session(
    aws_access_key_id="your_access_key",
    aws_secret_access_key="your_secret_key",
    aws_session_token="your_session_token",  # If using temporary credentials
    region_name="your_region_name",
    profile_name="your_profile"  # Optional: Use a specific profile
)
model = BidiNovaSonicModel(client_config={"boto_session": boto_session})

For more details on this approach, please refer to the boto3 session docs.

Configuration

Client Configs

| Parameter | Description | Default |
|---|---|---|
| boto_session | A boto3.Session instance under which AWS credentials are configured. | None |
| region | Region under which credentials are configured. Cannot be used together with boto_session. | us-east-1 |
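
The two client options are mutually exclusive, and only three regions are supported. A small helper can enforce both rules before the model is constructed. This is a sketch under stated assumptions: `build_client_config` is a hypothetical helper, not a Strands API, and it uses the `boto_session` key shown in the Boto3 Session example above.

```python
from __future__ import annotations

from typing import Any, Optional

# Regions listed on this page as supporting Nova Sonic.
SUPPORTED_REGIONS = frozenset({"us-east-1", "eu-north-1", "ap-northeast-1"})


def build_client_config(
    region: Optional[str] = None, boto_session: Any = None
) -> dict[str, Any]:
    """Build a client_config dict, rejecting invalid combinations early."""
    if region is not None and boto_session is not None:
        raise ValueError("Pass either region or boto_session, not both.")

    if boto_session is not None:
        # boto3.Session exposes the configured region as `region_name`.
        session_region = getattr(boto_session, "region_name", None)
        if session_region is not None and session_region not in SUPPORTED_REGIONS:
            raise ValueError(f"Unsupported region: {session_region}")
        return {"boto_session": boto_session}

    resolved = region or "us-east-1"  # documented default
    if resolved not in SUPPORTED_REGIONS:
        raise ValueError(f"Unsupported region: {resolved}")
    return {"region": resolved}
```

The result can then be passed as `client_config=build_client_config(region="eu-north-1")` when creating the model.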

Provider Configs

| Parameter | Description | Example | Options |
|---|---|---|---|
| audio | AudioConfig instance. | {"voice": "tiffany"} | reference |
| inference | Fields of the session-start inferenceConfiguration, written in snake_case. | {"top_p": 0.9} | reference |
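
To illustrate the snake_case convention for `inference`, the sketch below converts camelCase inferenceConfiguration keys and combines them with the audio example into one `provider_config`. The `to_snake_case` helper is hypothetical, and the numeric values for `maxTokens` and `temperature` are illustrative placeholders rather than recommended settings.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase key such as 'topP' to snake_case ('top_p')."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# Keys as they appear in the session-start inferenceConfiguration (camelCase).
# The values here are placeholders chosen only for illustration.
inference_configuration = {"maxTokens": 1024, "topP": 0.9, "temperature": 0.7}

provider_config = {
    "audio": {"voice": "tiffany"},
    "inference": {to_snake_case(k): v for k, v in inference_configuration.items()},
}

# provider_config can then be passed to the model, for example:
# model = BidiNovaSonicModel(provider_config=provider_config)
```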

Troubleshooting

Hanging

When credentials are misconfigured, the model provider does not raise an exception (a quirk of the underlying experimental Bedrock client). The provider therefore lets the caller proceed to receive, which emits no events, so the application appears to hang indefinitely.

As a reminder, Nova Sonic is only available in us-east-1, eu-north-1, and ap-northeast-1.
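
One way to turn a silent hang into an actionable error is to wait for the first event with a timeout. The sketch below is generic: it assumes the event stream is available as an async iterator, and `first_event_or_timeout` is a hypothetical helper rather than part of Strands. A stand-in stream that never yields simulates the misconfigured case.

```python
import asyncio
from typing import Any, AsyncIterator


async def first_event_or_timeout(
    events: AsyncIterator[Any], timeout_s: float = 10.0
) -> Any:
    """Return the first event, or raise TimeoutError if none arrives in time."""
    try:
        return await asyncio.wait_for(events.__anext__(), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        raise TimeoutError(
            f"No events received after {timeout_s}s. "
            "Check AWS credentials and that the region supports Nova Sonic."
        ) from exc


async def silent_stream() -> AsyncIterator[Any]:
    """Stand-in for a stream that never emits, as with bad credentials."""
    await asyncio.sleep(3600)
    yield None


async def demo() -> str:
    try:
        await first_event_or_timeout(silent_stream(), timeout_s=0.1)
    except TimeoutError as exc:
        return f"timed out: {exc}"
    return "received an event"
```

Running `asyncio.run(demo())` takes the timeout branch, because the stand-in stream never produces an event. In a real application the timeout should be long enough to cover normal connection setup.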

References