Amazon Bedrock

Configure Amazon Bedrock as an LLM provider in agentgateway.

Before you begin

  1. Set up an agentgateway proxy.
  2. Make sure that your AWS credentials have access to the Bedrock models that you want to use. Alternatively, you can use an Amazon Bedrock API key.

Set up access to Amazon Bedrock

  1. Store your credentials to access the AWS Bedrock API. Choose one of the following options.

    Option 1: AWS credentials

    1. Log in to the AWS console and store your access credentials as environment variables.

      export AWS_ACCESS_KEY_ID="<aws-access-key-id>"
      export AWS_SECRET_ACCESS_KEY="<aws-secret-access-key>"
      export AWS_SESSION_TOKEN="<aws-session-token>"
    2. Create a Kubernetes secret with your AWS credentials. The session token is optional and needed only for temporary credentials.

      kubectl create secret generic bedrock-secret \
        -n agentgateway-system \
        --from-literal=accessKey="$AWS_ACCESS_KEY_ID" \
        --from-literal=secretKey="$AWS_SECRET_ACCESS_KEY" \
        --from-literal=sessionToken="$AWS_SESSION_TOKEN" \
        --type=Opaque \
        --dry-run=client -o yaml | kubectl apply -f -
    Option 2: Bedrock API key

    1. Save the API key in an environment variable.

      export BEDROCK_API_KEY="<api-key>"
    2. Create a Kubernetes secret to store your Amazon Bedrock API key.

      kubectl apply -f- <<EOF
      apiVersion: v1
      kind: Secret
      metadata:
        name: bedrock-secret
        namespace: agentgateway-system
      type: Opaque
      stringData:
        Authorization: $BEDROCK_API_KEY
      EOF

  2. Create an AgentgatewayBackend resource to configure your LLM provider. Make sure to reference the secret that holds your credentials to access the LLM.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: bedrock
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          bedrock:
            model: "amazon.nova-micro-v1:0"
            region: "us-east-1"
      policies:
        auth:
          secretRef:
            name: bedrock-secret
    EOF

    Review the following settings to understand this configuration. For more information, see the API reference.

    ai.provider.bedrock: Defines the LLM provider that you want to use. The example uses Amazon Bedrock.
    bedrock.model: The model to use to generate responses. In this example, you use the amazon.nova-micro-v1:0 model. Keep in mind that some models support cross-region inference. These models begin with a us. prefix, such as us.anthropic.claude-sonnet-4-20250514-v1:0. For more models, see the AWS Bedrock docs.
    bedrock.region: The AWS region where your Bedrock model is deployed. Multiple regions are not supported.
    policies.auth: The credentials to use to access the Amazon Bedrock API. The example refers to the secret that you previously created. To use IRSA, omit the auth settings.
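The cross-region inference naming convention described above can be checked programmatically. The following is a minimal, hypothetical sketch (the helper is not part of agentgateway; the table above documents the us. prefix, while eu. and apac. are assumed analogues for other geographies):

```python
# Hypothetical helper (not an agentgateway API): detect Bedrock
# cross-region inference profile IDs by their geography prefix.
# "us." is documented above; "eu." and "apac." are assumptions.
CROSS_REGION_PREFIXES = ("us.", "eu.", "apac.")

def is_cross_region_model(model_id: str) -> bool:
    """Return True if the model ID looks like a cross-region inference profile."""
    return model_id.startswith(CROSS_REGION_PREFIXES)

print(is_cross_region_model("us.anthropic.claude-sonnet-4-20250514-v1:0"))  # True
print(is_cross_region_model("amazon.nova-micro-v1:0"))                      # False
```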
  3. Create an HTTPRoute resource to route requests through your agentgateway proxy to the Bedrock AgentgatewayBackend. Agentgateway automatically rewrites the request path to the appropriate chat completion endpoint of the LLM provider, based on the provider that you set up in the AgentgatewayBackend resource. The default Bedrock route is /model/${MODEL}/converse, such as /model/amazon.nova-micro-v1:0/converse. The following examples show three route options: the default route, an OpenAI-compatible /v1/chat/completions path, and a custom /bedrock path prefix.

    Option 1: Default route

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: bedrock
      namespace: agentgateway-system
    spec:
      parentRefs:
        - name: agentgateway-proxy
          namespace: agentgateway-system
      rules:
      - backendRefs:
        - name: bedrock
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    EOF
    Option 2: OpenAI-compatible path (/v1/chat/completions)

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: bedrock
      namespace: agentgateway-system
    spec:
      parentRefs:
        - name: agentgateway-proxy
          namespace: agentgateway-system
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
        backendRefs:
        - name: bedrock
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    EOF
    Option 3: Custom path prefix (/bedrock)

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: bedrock
      namespace: agentgateway-system
    spec:
      parentRefs:
        - name: agentgateway-proxy
          namespace: agentgateway-system
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /bedrock
        backendRefs:
        - name: bedrock
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    EOF
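The default route rewriting described in the step above can be mirrored by a small client-side sketch. The helper names below are hypothetical; the request body shape matches the curl examples in the next step, where the model field is left empty because the AgentgatewayBackend selects the model:

```python
# Hypothetical client-side helpers (not an agentgateway API): build the
# default Bedrock converse path and a minimal request body.
def converse_path(model: str) -> str:
    # agentgateway rewrites requests to /model/${MODEL}/converse by default.
    return f"/model/{model}/converse"

def converse_body(prompt: str) -> dict:
    # The model is selected by the AgentgatewayBackend, so it is left empty here.
    return {"model": "", "messages": [{"role": "user", "content": prompt}]}

print(converse_path("amazon.nova-micro-v1:0"))
# /model/amazon.nova-micro-v1:0/converse
```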
  4. Send a request to the LLM provider API along the route that you previously created, such as /model/amazon.nova-micro-v1:0/converse, /v1/chat/completions, or /bedrock, depending on your route configuration. Verify that the request succeeds and that you get back a response from the chat completion API.

    Default route:

    Cloud Provider LoadBalancer:

    curl "$INGRESS_GW_ADDRESS/model/amazon.nova-micro-v1:0/converse" -H content-type:application/json -d '{
        "model": "",
        "messages": [
          {
            "role": "user",
            "content": "You are a cloud native solutions architect, skilled in explaining complex technical concepts such as API Gateway, microservices, LLM operations, kubernetes, and advanced networking patterns. Write me a 20-word pitch on why I should use an AI gateway in my Kubernetes cluster."
          }
        ]
      }' | jq

    Localhost:

    curl "localhost:8080/model/amazon.nova-micro-v1:0/converse" -H content-type:application/json -d '{
        "model": "",
        "messages": [
          {
            "role": "user",
            "content": "You are a cloud native solutions architect, skilled in explaining complex technical concepts such as API Gateway, microservices, LLM operations, kubernetes, and advanced networking patterns. Write me a 20-word pitch on why I should use an AI gateway in my Kubernetes cluster."
          }
        ]
      }' | jq

    OpenAI-compatible path (/v1/chat/completions):

    Cloud Provider LoadBalancer:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
        "model": "",
        "messages": [
          {
            "role": "user",
            "content": "You are a cloud native solutions architect, skilled in explaining complex technical concepts such as API Gateway, microservices, LLM operations, kubernetes, and advanced networking patterns. Write me a 20-word pitch on why I should use an AI gateway in my Kubernetes cluster."
          }
        ]
      }' | jq

    Localhost:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
        "model": "",
        "messages": [
          {
            "role": "user",
            "content": "You are a cloud native solutions architect, skilled in explaining complex technical concepts such as API Gateway, microservices, LLM operations, kubernetes, and advanced networking patterns. Write me a 20-word pitch on why I should use an AI gateway in my Kubernetes cluster."
          }
        ]
      }' | jq

    Custom path prefix (/bedrock):

    Cloud Provider LoadBalancer:

    curl "$INGRESS_GW_ADDRESS/bedrock" -H content-type:application/json -d '{
        "model": "",
        "messages": [
          {
            "role": "user",
            "content": "You are a cloud native solutions architect, skilled in explaining complex technical concepts such as API Gateway, microservices, LLM operations, kubernetes, and advanced networking patterns. Write me a 20-word pitch on why I should use an AI gateway in my Kubernetes cluster."
          }
        ]
      }' | jq

    Localhost:

    curl "localhost:8080/bedrock" -H content-type:application/json -d '{
        "model": "",
        "messages": [
          {
            "role": "user",
            "content": "You are a cloud native solutions architect, skilled in explaining complex technical concepts such as API Gateway, microservices, LLM operations, kubernetes, and advanced networking patterns. Write me a 20-word pitch on why I should use an AI gateway in my Kubernetes cluster."
          }
        ]
      }' | jq

    Example output:

    {
      "metrics": {
        "latencyMs": 2097
      },
      "output": {
        "message": {
          "content": [
            {
              "text": "\nAn AI gateway in your Kubernetes cluster can enhance performance, scalability, and security while simplifying complex operations. It provides a centralized entry point for AI workloads, automates deployment and management, and ensures high availability."
            }
          ],
          "role": "assistant"
        }
      },
      "stopReason": "end_turn",
      "usage": {
        "inputTokens": 60,
        "outputTokens": 47,
        "totalTokens": 107
      }
    }
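The converse response shown above can also be unpacked programmatically. A minimal sketch, assuming the JSON shape from the example output (the response text is abbreviated here, and the helper name is illustrative):

```python
import json

# Abbreviated converse response, mirroring the example output above.
response = json.loads("""
{
  "output": {
    "message": {
      "content": [{"text": "An AI gateway in your Kubernetes cluster..."}],
      "role": "assistant"
    }
  },
  "stopReason": "end_turn",
  "usage": {"inputTokens": 60, "outputTokens": 47, "totalTokens": 107}
}
""")

def converse_text(resp: dict) -> str:
    # Concatenate all text blocks in the assistant message.
    return "".join(block.get("text", "") for block in resp["output"]["message"]["content"])

print(converse_text(response))
print(response["usage"]["totalTokens"])  # 107
```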

Prompt caching

Prompt caching is a performance and cost-optimization feature that allows the model to “remember” frequently used parts of your prompt, such as long system instructions, reference documents, or tool definitions. This way, the model does not need to reprocess these parts every time you send a new prompt.

For example, assume you have a 50-page manual and want to ask your model different questions about it. Instead of re-reading the manual for each question, the model can read it once and save it in its internal cache. Then, the model can answer subsequent questions more quickly and at a lower cost.

Prompt caching is configured by using the backend.ai.promptCaching fields in the AgentgatewayPolicy resource.

ℹ️
Prompt caching is supported for Bedrock Claude 3+ and Nova models.
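Whether a prompt is large enough to reach the minimum token count can be estimated client-side. A rough sketch, assuming the common heuristic of roughly 4 characters per token (both the helper and the heuristic are illustrative, not part of agentgateway or Bedrock; real token counts depend on the model's tokenizer):

```python
# Rough heuristic: ~4 characters per token. This is an assumption,
# not a real tokenizer; actual counts vary by model.
CHARS_PER_TOKEN = 4

def meets_cache_minimum(text: str, min_tokens: int = 1024) -> bool:
    """Estimate whether a prompt is long enough for a Bedrock caching checkpoint."""
    return len(text) // CHARS_PER_TOKEN >= min_tokens

short_prompt = "Summarize this page."
long_prompt = "manual text " * 1000  # stands in for a long reference document

print(meets_cache_minimum(short_prompt))  # False
print(meets_cache_minimum(long_prompt))   # True
```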
  1. Create an AgentgatewayPolicy resource with your prompt cache settings. The following example enables caching for system prompts and conversation messages, but disables it for tool definitions. Bedrock requires a minimum token count before caching takes effect, also referred to as a caching checkpoint; by default, at least 1024 tokens are required for caching to be effective. For more information, see the API reference.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayPolicy
    metadata:
      name: bedrock-caching-policy
      namespace: agentgateway-system
    spec:
      targetRefs:
        - group: gateway.networking.k8s.io
          kind: HTTPRoute
          name: bedrock
      backend:
        ai:
          promptCaching:
            cacheSystem: true
            cacheMessages: true
            cacheTools: false
            minTokens: 1024
    EOF
  2. Port-forward the agentgateway proxy on port 15000.

    kubectl port-forward deploy/agentgateway-proxy -n agentgateway-system 15000
  3. Get the caching configuration and verify that you see the cache settings.

    curl -s http://localhost:15000/config_dump | jq '.policies[] |
      select(.name.name == "bedrock-caching-policy" and
             .policy.backend.aI.promptCaching != null)'

    Example output:

    {
       "key": "backend/agentgateway-system/bedrock-caching-policy:ai:agentgateway-system/bedrock",
       "name": {
         "kind": "AgentgatewayPolicy",
         "name": "bedrock-caching-policy",
         "namespace": "agentgateway-system"
       },
       "target": {
         "route": {
           "name": "bedrock",
           "namespace": "agentgateway-system",
           "kind": "HTTPRoute"
         }
       },
       "policy": {
         "backend": {
           "aI": {
             "defaults": {},
             "overrides": {},
             "promptCaching": {
               "cacheSystem": true,
               "cacheMessages": true,
               "cacheTools": false,
               "minTokens": 1024
             }
           }
         }
       }
    }
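The same verification can be scripted instead of eyeballing the jq output. A minimal sketch, assuming the policy shape from the config_dump output above (the dict and helper are illustrative; note the "aI" key casing used by the config dump):

```python
# Validate promptCaching settings in an agentgateway config dump entry.
# The dict below mirrors the example config_dump output above.
policy = {
    "name": {"kind": "AgentgatewayPolicy", "name": "bedrock-caching-policy"},
    "policy": {
        "backend": {
            "aI": {
                "promptCaching": {
                    "cacheSystem": True,
                    "cacheMessages": True,
                    "cacheTools": False,
                    "minTokens": 1024,
                }
            }
        }
    },
}

def prompt_caching(policy: dict):
    # Walk the nested structure; note the "aI" key casing in the dump.
    return policy.get("policy", {}).get("backend", {}).get("aI", {}).get("promptCaching")

caching = prompt_caching(policy)
print(caching["minTokens"])  # 1024
```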
    
