Let's Build an Agent on AWS!

AWS is going all-in on AI, and they're making it easier than ever to build agents in their ecosystem. They've recently released two key pieces: Amazon Bedrock AgentCore (a managed runtime for deploying agents) and Strands (an open-source Python framework for building them).

This article is aimed at developers, or anyone with some familiarity with code. If you'd rather read about the general concepts surrounding AI, check out these posts from me and my colleagues:

A Quick Agent Recap

Let's quickly walk through what an agent is from a technical perspective.

TL;DR

An agent is a loop that, with the help of different tools, enriches the query sent to the LLM and keeps querying the LLM with new information until a task is complete or a question is answered.

But What Does This Mean in Practice?

When you ask an agent a question or give it a task to perform, it will try to respond based on the instructions it's been given. These instructions can be plain text, something like "Answer all questions in Swedish", which would make the agent only return answers in Swedish. But in more complex setups, the instructions are complemented with the details that make agents so powerful, such as:

  • The ability to query specific internal data (Data retrieval/RAG)

  • The ability to read history from previous conversations (Memory)

  • Ways to perform tasks or retrieve external information (Tools & MCP Servers)

  • Awareness of who is asking the question (User context)

A pseudo example of a prompt inside an agentic loop could look something like:

History:
USER: Question
AGENT: Answer

User context:
UserID: 123
UserName: Name

Context:
<Information from previous tool calls>

Instructions:
You are a seasoned software engineer who answers questions about best practices on AWS. ALWAYS answer the questions in Swedish.

Tools Available:
You have the following tools available. Answer with a TOOL_CALL when you need to use a tool.
- Search the web (parameters: question)
- AWS Documentation MCP (parameters: service)

Definitions:
A TOOL_CALL should always be returned as JSON following the format:
{"tool_id": "id", "parameters": {"example_param": "abc"}}

Question:
USER: How much memory can I configure a Lambda to have?
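
Faced with this question, the LLM likely needs the documentation tool, so instead of a final answer it might return a tool call like the following (an illustrative response, with a made-up tool_id for the AWS Documentation MCP tool):

{"tool_id": "aws_documentation_mcp", "parameters": {"service": "lambda"}}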

This prompt is reused, appended to, and sometimes redacted, and sent to the LLM repeatedly until the goal is reached. Ask a follow-up question? The history is extended, the context grows with any new information gathered through tools, and the whole thing is sent to the LLM again to generate the next answer.

And to be clear, the LLM itself doesn't do any of this work. All the prompt extension and the actual execution of tools comes from the agentic framework.

As an example, if the LLM responds with a TOOL_CALL, the agentic framework parses it, uses plain software engineering to map it to an actual piece of code, runs that code with the parameters the LLM asked for, and passes the result back into the next prompt sent to the LLM.

This loop and its bookkeeping are what an agentic framework solves for you.
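
To make the loop concrete, here is a minimal, framework-free sketch in Python. Everything in it is a hypothetical stand-in: call_llm would wrap your model API of choice, and the TOOL_CALL convention follows the pseudo prompt above.

import json

# Hypothetical stand-ins: call_llm wraps your model API, and the tool
# functions would contain real integrations in an actual agent.
def call_llm(prompt: str) -> str: ...
def search_web(question: str) -> str: ...
def aws_documentation_mcp(service: str) -> str: ...

TOOLS = {"search_web": search_web, "aws_documentation_mcp": aws_documentation_mcp}

def run_agent(prompt: str, max_turns: int = 10) -> str:
    for _ in range(max_turns):
        reply = call_llm(prompt)

        # The LLM asked for a tool: parse the TOOL_CALL JSON, run the
        # matching Python function, and feed the result back into the
        # context section of the prompt for the next round trip.
        if reply.strip().startswith('{"tool_id"'):
            call = json.loads(reply)
            result = TOOLS[call["tool_id"]](**call["parameters"])
            prompt += f"\n\nContext:\n{result}"
            continue

        # No tool call means the LLM produced a final answer.
        return reply
    return "Stopped: too many tool calls without a final answer."

This is, of course, a toy: real frameworks add retries, structured tool schemas, streaming, and memory, but the core loop is exactly this.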

Let's Get Building

Given the concepts above, the industry is slowly standardizing on how to achieve the steps needed for an effective agentic loop. Most frameworks approach this similarly, and nowadays it's usually the developer experience or the choice of language that makes the difference.

What's Needed to Get Going

Runtime

When we actually want to deploy our agent, AWS's answer is Amazon Bedrock AgentCore, which lets you quickly set up the infrastructure you need to run an agent on the web. This includes the runtime for the agent itself (container-based), as well as the infrastructure behind the tools an agent needs, such as memory, search, and code execution.

Check out their page: https://aws.amazon.com/bedrock/agentcore/ for more details on what the runtime offers.

Framework

AWS has also released Strands, which is an open-source Python framework for agents. Strands is extremely quick to get started with and follows a model-agnostic approach, and can be run anywhere—which is a nice surprise coming from AWS.

You can find all details about Strands at: https://strandsagents.com/

What Are We Building?

Now that we have both a framework (Strands) and a deployment platform (AgentCore), let's build something practical: an agent that can search the web and query AWS documentation. By the end, we'll deploy it to AWS so other services can use it.

Requirements

https://strandsagents.com/latest/documentation/docs/ includes a straightforward getting-started guide. To follow the example, you'll need a recent Python installation, the uv package manager, and AWS credentials with access to Amazon Bedrock models.

First Steps

Let's get started (the steps are written from a Mac perspective):

  1. Create a new project folder

  2. Using your terminal, go to the folder and run uv init to set up a new project

  3. Create and activate a virtual environment via uv venv followed by source .venv/bin/activate

  4. Add the Strands library via uv add strands-agents

  5. In the terminal, set your AWS credentials via environment variables

We should now have everything we need to test out their example.

Copy-paste this code into the main.py file (taken from the Strands docs):

from strands import Agent

# Create an agent with default settings
agent = Agent()

# Ask the agent a question
agent("Tell me about agentic AI in one sentence")

Now let's run it via python main.py. If all works as expected, the agent should print a one-sentence description of agentic AI to your terminal.

Perfect! We are now up and running and can start adding functionality.

We want the agent to be able to search the web if it doesn't know the answer. Strands comes with a bunch of community-built tools which you can just install. Let's add the tool built for the AgentCore browser. (All community-built tools can be found at https://github.com/strands-agents/tools)

Run uv add 'strands-agents-tools[agent_core_browser]' to install the package, then add the tool to our agent:

from strands import Agent
from strands_tools.browser import AgentCoreBrowser

browser_tool = AgentCoreBrowser(
    region="eu-west-1"
) # Initialize the browser in a supported region

agent = Agent(tools=[browser_tool.browser]) # Add the browser tool to the agent

# Ask the agent to search the internet
agent("Use the browser to get the title of the latest AWS News Blog post.")

When we run it, we can now see that the agent is using the tool to search for the latest blog post from AWS.

What About MCP Servers?

Since MCP servers are a powerful way to give your agents more abilities, let's try adding one.

AWS has a lot of ready-to-use MCP servers. Let's pick one of their remote servers: https://github.com/awslabs/mcp/tree/main/src/aws-knowledge-mcp-server is a good example.

Add the MCP library via uv add mcp, then update the code to include the MCP server:

from strands import Agent
from strands_tools.browser import AgentCoreBrowser
from mcp.client.streamable_http import streamablehttp_client
from strands.tools.mcp.mcp_client import MCPClient

aws_knowledge_mcp = MCPClient(
    lambda: streamablehttp_client("https://knowledge-mcp.global.api.aws")
)
browser_tool = AgentCoreBrowser(region="eu-west-1")

with aws_knowledge_mcp:
    # Get the tools available from the MCP server
    aws_knowledge_tools = aws_knowledge_mcp.list_tools_sync()
    # Combine the browser tool with the MCP tools
    tools = [browser_tool.browser] + aws_knowledge_tools

    agent = Agent(tools=tools)
    agent("What tools do you have available?")

After running the agent again, you can see that it now has more tools enabled, which lets it answer more questions and perform more of the tasks you want it to handle.

We now have an agent that can both search the internet and use MCP servers, built with Strands. In many ways, it really is this simple to create the backend for your own ChatGPT clone; enabling this kind of LLM-powered functionality has become almost trivial.
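
To see how little separates this from a chat backend, here is a sketch of a terminal chat loop around the same agent (reusing aws_knowledge_mcp and browser_tool from the code above; as far as I can tell, a Strands Agent keeps its conversation history between calls, so follow-up questions work):

# A minimal terminal chat loop around the same agent. Type "exit" to quit.
with aws_knowledge_mcp:
    tools = [browser_tool.browser] + aws_knowledge_mcp.list_tools_sync()
    agent = Agent(tools=tools)

    while True:
        user_input = input("\nYou: ")
        if user_input.strip().lower() in ("exit", "quit"):
            break
        # The agent prints its answer as it streams (Strands' default
        # behavior) and remembers the conversation for follow-ups.
        agent(user_input)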

But how can I share this with my colleagues and friends? Let's move on to deploying this.

Deployment

At this point, we have a working agent that runs locally. You could stop here and integrate this agent directly into your existing applications—just instantiate it in your API endpoints, Lambda functions, or any other service where you need AI-powered automation.

But since we are exploring Amazon Bedrock AgentCore, let's check out their runtime, which lets you get your agent running on AWS very quickly.

We'll follow their SDK guide, which makes this very quick and easy: https://strandsagents.com/latest/documentation/docs/user-guide/deploy/deploy_to_bedrock_agentcore/#option-a-sdk-integration

Amazon Bedrock AgentCore SDK

Start by adding the SDK library via uv add bedrock-agentcore, then update the code based on their example:

from strands import Agent
from strands_tools.browser import AgentCoreBrowser
from mcp.client.streamable_http import streamablehttp_client
from strands.tools.mcp.mcp_client import MCPClient
from bedrock_agentcore import BedrockAgentCoreApp

app = BedrockAgentCoreApp()
aws_knowledge_mcp = MCPClient(
    lambda: streamablehttp_client("https://knowledge-mcp.global.api.aws")
)

browser_tool = AgentCoreBrowser(region="eu-west-1")

# Mark this function as the entrypoint for the agent
@app.entrypoint
async def agent_invocation(payload):
    """Handler for agent invocation"""

    user_message = payload.get(
        "prompt", "No prompt found in input, please guide customer to create a JSON payload with prompt key",
    )
    # Create an agent with MCP tools
    with aws_knowledge_mcp:
        # Get the tools from the MCP server
        aws_knowledge_tools = aws_knowledge_mcp.list_tools_sync()
        # Combine the browser tool with the MCP tools
        tools = [browser_tool.browser] + aws_knowledge_tools

        agent = Agent(tools=tools)
        stream = agent.stream_async(user_message)
        async for event in stream:
            print(event)
            yield (event)

if __name__ == "__main__":
    app.run()

You can now run this locally with uv run main.py, which will start a web server that hosts the agent behind an endpoint. When it's running, you can test it via, for example, curl using the default endpoints exposed by AgentCore.

curl -X POST http://localhost:8080/invocations \
-H "Content-Type: application/json" \
-d '{"prompt": "Hello world!"}'

Since we opted for the streaming response pattern, tokens arrive as they're generated, which is a useful way of making the agent feel alive when, for example, building a chat interface.
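
If you'd rather test from Python than curl, here is a small sketch using the requests library (uv add requests) against the same local endpoint, consuming the event stream line by line:

import requests

# Assumes the local server from `uv run main.py` is still running
response = requests.post(
    "http://localhost:8080/invocations",
    json={"prompt": "Hello world!"},
    stream=True,  # read the event stream as it arrives instead of buffering
)

for line in response.iter_lines():
    if line:
        text = line.decode("utf-8")
        # Server-sent events prefix each data payload with "data: "
        print(text.removeprefix("data: "))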

Prepare for Deployment

The AgentCore SDK comes with a CLI for easy deployment and testing without the need to configure any infrastructure.

Run uv run agentcore configure --entrypoint main.py, which will walk you through a few steps and create a .bedrock_agentcore.yaml file containing the details of your deployment.

To try out the agent via AgentCore locally, you can then run uv run agentcore launch --local. This builds the Docker image that will be used when deployed and lets you verify everything works as expected. Once again, use curl against localhost to try out the endpoints, but this time the requests go through the Docker container that will be used for the actual deployment.

Let's Deploy It

When you are ready to actually deploy this to AWS, just run uv run agentcore launch. Based on the specifications in .bedrock_agentcore.yaml, it will now create the required roles, build and push the container image to AWS, and make the agent available to be invoked from the AWS ecosystem.

Deploying Updates: Added a new tool or updated your agent logic? Just run uv run agentcore launch again. The CLI handles the rebuild, pushes the new container image, and updates your deployed agent—no manual infrastructure changes needed. Your agent ARN stays the same, so any existing integrations continue working seamlessly.

You can find the full code at: https://github.com/elva-labs/strands-agentcore-blog-example

But How Do I Actually Use It?

If you are familiar with the AWS ecosystem, especially services such as AWS Lambda, a deployed agent works in much the same way: you can invoke it via the regular AWS SDK and use the response in any manner you see fit. Check out https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-invoke-agent.html for details.

Here is an example of how you could invoke your deployed agent using the boto3 SDK (the ARN, session ID, and prompt below are placeholders you'd replace with your own values):

import boto3
import json
import uuid

# Initialize the Bedrock AgentCore client
agent_core_client = boto3.client('bedrock-agentcore')

# Placeholders: use the runtime ARN printed by `agentcore launch`; session
# IDs must be at least 33 characters, so a UUID string (36 chars) works
agent_arn = "arn:aws:bedrock-agentcore:<region>:<account-id>:runtime/<agent-id>"
session_id = str(uuid.uuid4())
prompt = "How much memory can I configure a Lambda to have?"

# Prepare the payload
payload = json.dumps({"prompt": prompt}).encode()

# Invoke the agent
response = agent_core_client.invoke_agent_runtime(
    agentRuntimeArn=agent_arn,
    runtimeSessionId=session_id,
    payload=payload
)

# Process and print the response
if "text/event-stream" in response.get("contentType", ""):
    # Handle streaming response
    content = []
    for line in response["response"].iter_lines(chunk_size=10):
        if line:
            line = line.decode("utf-8")
            if line.startswith("data: "):
                line = line[6:]
                print(line)
                content.append(line)
    print("\nComplete response:", "\n".join(content))

elif response.get("contentType") == "application/json":
    # Handle standard JSON response
    content = []
    for chunk in response.get("response", []):
        content.append(chunk.decode('utf-8'))
    print(json.loads(''.join(content)))

else:
    # Print raw response for other content types
    print(response)

Final Thoughts

AWS has, in my opinion, made the correct strategic move here toward a fully open-source agent framework that is very quick to get started with and has out-of-the-box support for the AgentCore runtime.

It also seems like Amazon Bedrock AgentCore is their bet on a simpler deployment experience than the rest of AWS offers. As an experienced power user of AWS, I'm very comfortable with their ecosystem, since it lets me use advanced patterns to solve almost any issue at any scale. But for many people entering the AI scene, much of this is unknown territory. With just a simple CLI and a few commands to get running, AWS is actually becoming a challenger in the consumer space when it comes to agents, since the steep learning curve AWS traditionally comes with won't work for this new wave of users.

Overall, I think the combination of AgentCore and Strands is a really nice concept that's getting close to becoming something I'd personally actually use in production. It still lacks some developer experience compared to more popular and larger frameworks, and is still a bit clunky in how you integrate it with your services. The AWS-specific flavor makes it a bit harder to integrate out of the box with, for example, the AI SDK, which powers a lot of chat frontends today.

All that said, I'm very excited by this step and looking forward to following the progress AWS will be making in this space over the next year.

What's Next?

Ready to take this further? Here are some next steps:

Explore More MCP Servers:

  • Browse the awslabs/mcp repository (https://github.com/awslabs/mcp) for more ready-to-use servers

Production Considerations:

  • Set up CloudWatch monitoring for your deployed agent via AgentCore Observability

  • Review the AgentCore security best practices

  • Remember: AgentCore pricing is consumption-based, so you only pay for what you use

What's the one repetitive task in your workflow you'd automate first with an agent? I'd love to hear what you're building!


If you enjoyed this post, want to know more about me, working at Elva, or just want to reach out, you can find me on LinkedIn.


Elva is a serverless-first consulting company that can help you transform or begin your AWS journey for the future.