Nova Sonic [Experimental]¶
Experimental Feature
This feature is experimental and may change in future versions. Use with caution in production environments.
Amazon Nova Sonic provides real-time, conversational interactions through bidirectional audio streaming. Amazon Nova Sonic processes and responds to real-time speech as it occurs, enabling natural, human-like conversational experiences. Key capabilities and features include:
- Adaptive speech response that dynamically adjusts delivery based on the prosody of the input speech.
- Graceful handling of user interruptions without dropping conversational context.
- Function calling and agentic workflow support for building complex AI applications.
- Robustness to background noise for real-world deployment scenarios.
- Multilingual support with expressive voices and speaking styles. Expressive voices, both masculine-sounding and feminine-sounding, are offered in five languages: English (US, UK), French, Italian, German, and Spanish.
- Recognition of varied speaking styles across all supported languages.
Usage¶
import asyncio

from strands.experimental.bidi import BidiAgent
from strands.experimental.bidi.io import BidiAudioIO, BidiTextIO
from strands.experimental.bidi.models import BidiNovaSonicModel
from strands.experimental.bidi.tools import stop_conversation
from strands_tools import calculator


async def main() -> None:
    model = BidiNovaSonicModel(
        model_id="amazon.nova-sonic-v1:0",
        provider_config={
            "audio": {
                "voice": "tiffany",
            },
        },
        # Only available in us-east-1, eu-north-1, and ap-northeast-1.
        client_config={"region": "us-east-1"},
    )

    # stop_conversation tool allows user to verbally stop agent execution.
    agent = BidiAgent(model=model, tools=[calculator, stop_conversation])

    audio_io = BidiAudioIO()
    text_io = BidiTextIO()
    await agent.run(inputs=[audio_io.input()], outputs=[audio_io.output(), text_io.output()])


if __name__ == "__main__":
    asyncio.run(main())
Credentials¶
Nova Sonic is only available in us-east-1, eu-north-1, and ap-northeast-1.
Nova Sonic requires AWS credentials for access. Under the hood, BidiNovaSonicModel uses an experimental Bedrock client, which allows credentials to be configured in the following ways:
Option 1: Environment Variables
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_SESSION_TOKEN=your_session_token # If using temporary credentials
export AWS_REGION=your_region_name
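Because a misconfigured environment can fail silently, it may help to check these variables before starting a session. The following is a minimal sketch that uses only the standard library; `check_aws_env`, `REQUIRED_VARS`, and `SUPPORTED_REGIONS` are hypothetical names and not part of Strands. It only applies when credentials come from environment variables, and it does not verify that the credentials are valid.

```python
import os
from typing import Mapping

# Regions this page lists as supporting Nova Sonic.
SUPPORTED_REGIONS = {"us-east-1", "eu-north-1", "ap-northeast-1"}

# Variables from Option 1 that are always needed (the session token is optional).
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def check_aws_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of problems found; an empty list means the check passed."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    region = env.get("AWS_REGION")
    if region and region not in SUPPORTED_REGIONS:
        problems.append(f"AWS_REGION={region} is not a Nova Sonic region")
    return problems
```

Calling `check_aws_env()` with no arguments inspects the current process environment; passing a plain dict makes the helper easy to exercise in tests.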
Option 2: Boto3 Session
import boto3

from strands.experimental.bidi.models import BidiNovaSonicModel

boto_session = boto3.Session(
    aws_access_key_id="your_access_key",
    aws_secret_access_key="your_secret_key",
    aws_session_token="your_session_token",  # If using temporary credentials
    region_name="your_region_name",
    profile_name="your_profile",  # Optional: Use a specific profile
)

model = BidiNovaSonicModel(client_config={"boto_session": boto_session})
For more details on this approach, please refer to the boto3 session docs.
Configuration¶
Client Configs¶
| Parameter | Description | Default |
|---|---|---|
| boto3_session | A boto3.Session instance under which AWS credentials are configured. | None |
| region | Region under which credentials are configured. Cannot be used together with boto3_session. | us-east-1 |
Provider Configs¶
| Parameter | Description | Example | Options |
|---|---|---|---|
| audio | AudioConfig instance. | {"voice": "tiffany"} | reference |
| inference | Fields of the session-start inferenceConfiguration, written in snake_case. | {"top_p": 0.9} | reference |
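To illustrate the snake_case convention for `inference`, here is a rough sketch of how a provider config could map onto the camelCase `inferenceConfiguration` sent at session start. The `snake_to_camel` helper is hypothetical, and the actual translation inside BidiNovaSonicModel may differ. The `max_tokens` and `temperature` keys are assumptions that do not appear in the table above; check the linked reference for the supported fields.

```python
def snake_to_camel(name: str) -> str:
    """Convert a snake_case key such as "top_p" to camelCase ("topP")."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# "voice" and "top_p" mirror the table examples; the other inference keys are assumed.
provider_config = {
    "audio": {"voice": "tiffany"},
    "inference": {"top_p": 0.9, "max_tokens": 1024, "temperature": 0.7},
}

# Assumed shape of the inferenceConfiguration derived from the snake_case keys.
inference_configuration = {
    snake_to_camel(key): value
    for key, value in provider_config["inference"].items()
}
print(inference_configuration)
```

A dictionary like `provider_config` would be passed as the `provider_config` argument of BidiNovaSonicModel, as in the Usage example.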
Troubleshooting¶
Hanging¶
When credentials are misconfigured, the model provider does not raise an exception (a quirk of the underlying experimental Bedrock client). The provider therefore lets the caller proceed to a call to receive, which emits no events, so the application appears to hang indefinitely.
As a reminder, Nova Sonic is only available in us-east-1, eu-north-1, and ap-northeast-1.
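Since the hang produces no exception, one way to surface it is to wait for the first event with a timeout. The sketch below uses only asyncio; `first_event` and `silent_stream` are hypothetical names, and `silent_stream` merely stands in for an event stream that never emits, since the exact signature of receive is not shown on this page.

```python
import asyncio
from typing import AsyncIterator, Optional


async def first_event(events: AsyncIterator[object], timeout: float) -> Optional[object]:
    """Return the first event, or None if nothing arrives within `timeout` seconds."""
    try:
        return await asyncio.wait_for(events.__anext__(), timeout)
    except asyncio.TimeoutError:
        return None


async def silent_stream() -> AsyncIterator[object]:
    """Stand-in for an event stream that never emits anything."""
    await asyncio.sleep(3600)
    yield None  # not reached before the timeout; makes this an async generator


async def demo() -> bool:
    """Return True when no event arrived before the timeout."""
    event = await first_event(silent_stream(), timeout=0.1)
    return event is None


if __name__ == "__main__":
    if asyncio.run(demo()):
        print("No events received: check AWS credentials and region.")
```

In a real application the timeout should be long enough to cover normal connection setup, and a `None` result is only a hint to check credentials and region, not proof of a misconfiguration.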