Overview
The AI Agent is the most sophisticated component of the demo. It allows users to ask natural language questions about permissions and get intelligent, contextual answers by combining Claude’s reasoning with real-time AVP authorization checks.
Why an AI Agent?
Traditional authorization UIs require technical knowledge:
- Understanding user IDs and resource names
- Knowing which actions are available
- Interpreting authorization decisions
An AI agent makes authorization accessible:
- “Can Alice read the Q4 report?” ✅
- “Compare Bob’s and Carol’s access to HR documents” ✅
- “Why can’t Alice delete the sales dashboard?” ✅
Architecture Pattern: Secure Proxy
The agent follows a secure proxy pattern to protect the Anthropic API key:
```text
┌─────────┐            ┌─────────────┐            ┌────────────┐
│ Browser │───HTTPS───▶│   Lambda    │───HTTPS───▶│ Anthropic  │
│         │            │   Agent     │            │    API     │
└─────────┘            └──────┬──────┘            └────────────┘
                              │
                              │ is_authorized()
                              ▼
                       ┌─────────────┐
                       │     AVP     │
                       │ Policy Store│
                       └─────────────┘
```
**Key Security Property**: The Anthropic API key never leaves the Lambda function. The browser only sends conversation messages.
Never expose API keys in frontend code. Always use a backend proxy pattern for third-party API calls.
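As a sketch of the proxy boundary, the outbound request can be built entirely server-side so the key only ever comes from the Lambda environment. The helper name `build_anthropic_request` and its parameters are illustrative, not part of the demo code:

```python
import json
import urllib.request

def build_anthropic_request(messages, api_key, model="claude-haiku-4-5-20251001"):
    """Build the outbound Anthropic request server-side; the key never reaches the browser."""
    payload = json.dumps({
        "model": model,
        "max_tokens": 1000,
        "messages": messages,
    }).encode()
    return urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # Injected from the Lambda environment, never from the client
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )
```

The browser talks only to the Lambda's HTTPS endpoint; the key is attached inside this function.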
Lambda Configuration
The AgentFunction has special configuration:
```yaml
AgentFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: avp-agent
    CodeUri: lambda/
    Handler: agent.lambda_handler
    Description: "AI agent that queries AVP - secure proxy to the Anthropic API"
    Timeout: 60  # Longer timeout for AI processing
    Environment:
      Variables:
        POLICY_STORE_ID: !Ref PolicyStoreId
        ANTHROPIC_API_KEY: !Ref AnthropicApiKey  # Secure API key
```
**Key Differences from Other Functions**:

- **Timeout**: 60 seconds (vs 30) to accommodate multiple AI calls and tool executions
- **Environment**: includes `ANTHROPIC_API_KEY` from a CloudFormation `NoEcho` parameter
- **IAM**: the same `verifiedpermissions:IsAuthorized` permission as `CheckAccessFunction`
Request/Response Flow
The browser sends conversation messages:
```json
{
  "messages": [
    {
      "role": "user",
      "content": "Can Alice read the Q4 Report?"
    }
  ]
}
```
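Inside the Lambda, this body arrives wrapped in an API Gateway proxy event and must be parsed before it can be forwarded. A minimal hedged sketch (the `parse_messages` helper is illustrative, not from the demo code):

```python
import json

def parse_messages(event):
    """Extract the conversation messages from an API Gateway proxy event."""
    body = json.loads(event.get("body") or "{}")
    messages = body.get("messages", [])
    if not isinstance(messages, list) or not messages:
        raise ValueError("Request must include a non-empty 'messages' array")
    return messages
```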
For multi-turn conversations, include the full message history:
```json
{
  "messages": [
    {
      "role": "user",
      "content": "Can Alice read the Q4 Report?"
    },
    {
      "role": "assistant",
      "content": "Yes, Alice can read it because..."
    },
    {
      "role": "user",
      "content": "What about Bob?"
    }
  ]
}
```
The Lambda responds with the agent’s final answer plus the full conversation history:

```json
{
  "response": "Sí, Alice puede leer el Q4 Report porque...",
  "messages": [
    // Full conversation history including tool calls
  ]
}
```
The `messages` array includes internal tool calls for transparency and debugging.
The Agentic Loop
What is an Agentic Loop?
An agentic loop allows the AI to:
1. **Reason** about the user’s question
2. **Decide** if it needs to call a tool
3. **Execute** the tool
4. **Incorporate** results into its reasoning
5. **Repeat** until it has enough information
6. **Respond** to the user
This is more powerful than a simple function call because the AI decides when and how to use tools.
Implementation
The `run_agent` function (`agent.py:94`) implements the agentic loop:
```python
def run_agent(messages):
    """
    Runs the agentic loop:
    1. Call Claude with the available tools
    2. If Claude wants to use a tool, execute it and feed back the result
    3. Repeat until end_turn
    """
    current_messages = list(messages)
    for _ in range(10):  # max 10 iterations
        # Call the Anthropic API
        payload = json.dumps({
            "model": "claude-haiku-4-5-20251001",
            "max_tokens": 1000,
            "system": system,
            "tools": tools,
            "messages": current_messages
        }).encode()
        req = urllib.request.Request(
            "https://api.anthropic.com/v1/messages",
            data=payload,
            headers={
                "Content-Type": "application/json",
                "x-api-key": ANTHROPIC_API_KEY,
                "anthropic-version": "2023-06-01",
            },
            method="POST"
        )
        with urllib.request.urlopen(req) as resp:
            data = json.loads(resp.read())

        current_messages.append({"role": "assistant", "content": data["content"]})

        # Check the stop reason
        if data["stop_reason"] == "end_turn":
            # The AI is done - extract the response text
            text = " ".join(b["text"] for b in data["content"] if b["type"] == "text")
            return {"response": text, "messages": current_messages}

        if data["stop_reason"] == "tool_use":
            # The AI wants to use a tool
            tool_results = []
            for block in data["content"]:
                if block["type"] != "tool_use":
                    continue
                inp = block["input"]
                logger.info(f"Tool call: {block['name']}({inp})")
                result = check_avp_access(inp["user"], inp["action"], inp["resource"])
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block["id"],
                    "content": json.dumps(result)
                })
            current_messages.append({"role": "user", "content": tool_results})
            # The loop continues - the AI will process the tool results

    # Fallback message ("I could not complete the query.")
    return {"response": "No pude completar la consulta.", "messages": current_messages}
```
Loop Breakdown
**Iteration 1: Initial Question**

1. Input: “Can Alice read the Q4 Report?”
2. Claude reasoning: “I need to check AVP”
3. Stop reason: `tool_use`
4. Action: call `check_avp_access("alice", "Read", "Q4-Report-2024")`

**Iteration 2: Process Tool Result**

1. Input: tool result (ALLOW/DENY)
2. Claude reasoning: “Got the result, now I can answer”
3. Stop reason: `end_turn`
4. Action: return a natural language response
The loop can iterate up to 10 times, allowing Claude to make multiple tool calls for complex questions.
Anthropic API Integration
Model Selection
The agent uses Claude Haiku for fast, cost-effective responses:
```python
"model": "claude-haiku-4-5-20251001"
```
Why Haiku?
- Fast response times (1-2 seconds)
- Low cost ($0.25 per 1M input tokens)
- Sufficient reasoning for authorization questions
- Excellent tool use capabilities

For more complex reasoning, you could upgrade to:

- `claude-sonnet-4-5-20250514`: better reasoning, moderate cost
- `claude-opus-4-20250514`: best reasoning, higher cost
The request follows Anthropic’s Messages API format:
```python
payload = {
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 1000,
    "system": system,      # System prompt
    "tools": tools,        # Tool definitions
    "messages": messages   # Conversation history
}
```
Using urllib Instead of SDK
The code uses Python’s built-in urllib instead of the Anthropic SDK:
```python
req = urllib.request.Request(
    "https://api.anthropic.com/v1/messages",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "x-api-key": ANTHROPIC_API_KEY,
        "anthropic-version": "2023-06-01",
    },
    method="POST"
)
with urllib.request.urlopen(req) as resp:
    data = json.loads(resp.read())
```
Why not use the SDK?
- Avoids adding dependencies to the Lambda package
- Reduces cold start time
- Smaller deployment package
- Simple HTTP calls are sufficient for this use case
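One thing the SDK would otherwise handle is HTTP error reporting. A hedged sketch of mapping `urllib.error.HTTPError` (e.g. 429 for rate limits, 401 for a bad key) into the agent’s dict-shaped results; the `describe_api_error` helper is illustrative, not part of the demo code:

```python
import urllib.error

def describe_api_error(err):
    """Translate an error from the Anthropic endpoint into a result dict."""
    if isinstance(err, urllib.error.HTTPError):
        # HTTPError carries the status code and reason from the response
        return {"error": f"Anthropic API returned {err.code}: {err.reason}"}
    return {"error": str(err)}
```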
Tools are defined in JSON Schema format (`agent.py:102`):
```python
tools = [{
    "name": "check_avp_access",
    "description": (
        "Verifica en AWS Verified Permissions si un usuario puede ejecutar "
        "una acción sobre un recurso. "
        "Usuarios: alice (Analyst/Finance), bob (Admin/Finance), carol (Auditor/HR). "
        "Acciones: Read, Edit, Delete. "
        "Recursos: Q4-Report-2024, HR-Payroll-2024, Sales-Dashboard."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "user": {"type": "string", "description": "alice, bob, o carol"},
            "action": {"type": "string", "description": "Read, Edit, o Delete"},
            "resource": {"type": "string", "description": "Q4-Report-2024, HR-Payroll-2024, o Sales-Dashboard"}
        },
        "required": ["user", "action", "resource"]
    }
}]
```
**Key Components**:

- **Name**: identifies the tool (must be unique)
- **Description**: tells Claude when and how to use the tool. This is critical: Claude uses it to decide whether to call the tool.
- **Input Schema**: defines the parameters using JSON Schema
The description is crucial for tool use. Be specific about what the tool does and include examples of valid inputs.
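Since Claude fills in the parameters itself, it is also worth checking tool input against the schema’s `required` list before executing it. A minimal sketch (`validate_tool_input` is an illustrative helper, not in the demo):

```python
def validate_tool_input(inp, schema):
    """Return the list of required keys missing from a tool_use input block."""
    return [key for key in schema.get("required", []) if key not in inp]
```

If the returned list is non-empty, the agent can send an error back as the tool result instead of crashing mid-loop.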
When Claude wants to use the tool, the agent executes it:
```python
if data["stop_reason"] == "tool_use":
    tool_results = []
    for block in data["content"]:
        if block["type"] != "tool_use":
            continue
        inp = block["input"]
        logger.info(f"Tool call: {block['name']}({inp})")
        result = check_avp_access(inp["user"], inp["action"], inp["resource"])
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": json.dumps(result)
        })
    current_messages.append({"role": "user", "content": tool_results})
```
The tool implementation (`agent.py:41`) is similar to the main `CheckAccessFunction`, but returns a dict instead of an HTTP response:
```python
def check_avp_access(user_id, action_id, resource_id):
    if user_id not in DEMO_USERS:
        return {"error": f"Usuario '{user_id}' no existe"}
    if resource_id not in DEMO_RESOURCES:
        return {"error": f"Recurso '{resource_id}' no existe"}

    user = DEMO_USERS[user_id]
    resource = DEMO_RESOURCES[resource_id]
    try:
        response = avp_client.is_authorized(
            policyStoreId=POLICY_STORE_ID,
            principal={"entityType": "FinancialApp::User", "entityId": user_id},
            action={"actionType": "FinancialApp::Action", "actionId": action_id},
            resource={"entityType": "FinancialApp::Document", "entityId": resource_id},
            entities={"entityList": [...]}
        )
        decision = response["decision"]
        return {
            "decision": decision,
            "allowed": decision == "ALLOW",
            "user": {**user, "id": user_id},
            "action": action_id,
            "resource": resource_id,
            "resource_info": resource,
            "message": (
                f"✅ ACCESO PERMITIDO: {user['name']} puede {action_id} en {resource_id}"
                if decision == "ALLOW" else
                f"🚫 ACCESO DENEGADO: {user['name']} no tiene permisos para {action_id} en {resource_id}"
            )
        }
    except Exception as e:
        return {"error": str(e)}
```
System Prompt
The system prompt guides Claude’s behavior (`agent.py:122`):
```python
system = (
    "Eres un agente de seguridad experto en AWS Verified Permissions. "
    "Responde preguntas sobre permisos usando la herramienta check_avp_access. "
    "NUNCA asumas el resultado — siempre verifica con la herramienta. "
    "Si preguntan por múltiples usuarios o recursos, verifica cada combinación. "
    "Explica brevemente por qué AVP tomó esa decisión (RBAC/ABAC/Cedar). "
    "Sé conciso. Responde siempre en español."
)
```
**Prompt Design Principles**:

1. **Role Definition**: “Eres un agente de seguridad experto”
2. **Tool Guidance**: “Responde preguntas… usando la herramienta”
3. **Safety Constraint**: “NUNCA asumas el resultado”
4. **Completeness**: “Si preguntan por múltiples… verifica cada combinación”
5. **Explanation**: “Explica brevemente por qué”
6. **Style**: “Sé conciso. Responde siempre en español”
Good system prompts are specific about behavior, include constraints, and set expectations for output format.
Example Interactions
Simple Query
**User**: “Can Alice read the Q4 Report?”

**Agent Process**:

1. Call the Anthropic API with the user’s question
2. Claude decides to use `check_avp_access`
3. Execute: `check_avp_access("alice", "Read", "Q4-Report-2024")`
4. AVP returns: `ALLOW`
5. Claude synthesizes the response

**Response**: “Sí, Alice puede leer el Q4 Report porque es del departamento Finance, igual que el documento, y su nivel de clearance (2) es suficiente para documentos confidenciales.” (“Yes, Alice can read the Q4 Report because she is in the Finance department, like the document, and her clearance level (2) is sufficient for confidential documents.”)
Complex Multi-Step Query
**User**: “Compare Alice’s and Bob’s access to all Finance documents”

**Agent Process**:

1. Claude identifies the need for multiple checks
2. Calls:
   - `check_avp_access("alice", "Read", "Q4-Report-2024")`
   - `check_avp_access("alice", "Edit", "Q4-Report-2024")`
   - `check_avp_access("alice", "Delete", "Q4-Report-2024")`
   - `check_avp_access("bob", "Read", "Q4-Report-2024")`
   - `check_avp_access("bob", "Edit", "Q4-Report-2024")`
   - `check_avp_access("bob", "Delete", "Q4-Report-2024")`
3. Synthesizes the results

**Response**: “Alice puede leer el Q4 Report pero no editarlo ni borrarlo. Bob, como Admin del departamento Finance, tiene acceso completo: puede leer, editar y borrar el documento.” (“Alice can read the Q4 Report but cannot edit or delete it. Bob, as an Admin in the Finance department, has full access: he can read, edit, and delete the document.”)
Error Handling
**User**: “Can Charlie read the Q4 Report?”

**Agent Process**:

1. Call `check_avp_access("charlie", "Read", "Q4-Report-2024")`
2. Tool returns: `{"error": "Usuario 'charlie' no existe"}`
3. Claude handles the error gracefully

**Response**: “El usuario ‘charlie’ no existe en el sistema. Los usuarios disponibles son: alice (Analyst), bob (Admin), y carol (Auditor).” (“The user ‘charlie’ does not exist in the system. The available users are: alice (Analyst), bob (Admin), and carol (Auditor).”)
Claude can make multiple tool calls in a single response:
```python
for block in data["content"]:
    if block["type"] == "tool_use":
        ...  # Multiple tools can be called in one response
```
This reduces the number of round trips to Anthropic.
Token Usage
Typical token usage:

- Simple query: ~200 input tokens, ~100 output tokens
- Complex query: ~500 input tokens, ~300 output tokens

At Haiku pricing:

- Input: $0.25 per 1M tokens
- Output: $1.25 per 1M tokens

Cost per query: roughly $0.0001 to $0.001 (very cheap)
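As a sanity check, those per-query figures follow directly from the token counts and the listed rates (the `query_cost` helper below is illustrative, not part of the demo):

```python
def query_cost(input_tokens, output_tokens, input_rate=0.25, output_rate=1.25):
    """Cost in dollars, given per-1M-token rates."""
    return input_tokens * input_rate / 1e6 + output_tokens * output_rate / 1e6

simple = query_cost(200, 100)   # ≈ $0.000175
complex_ = query_cost(500, 300)  # ≈ $0.0005
```

Both values land comfortably inside the $0.0001 to $0.001 range quoted above.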
Timeout Considerations
The 60-second timeout accommodates:
- Up to 10 loop iterations
- ~3-5 seconds per Anthropic API call
- Tool execution time (~50ms per AVP call)
- Network latency
In practice, most queries complete in 5-10 seconds.
Security Considerations
API Key Protection
✅ **Good**: API key in a Lambda environment variable

```python
ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]
```

❌ **Bad**: API key in frontend code

```javascript
const API_KEY = "sk-ant-..."; // Never do this!
```
The agent validates inputs before calling AVP:
```python
if user_id not in DEMO_USERS:
    return {"error": f"Usuario '{user_id}' no existe"}
if resource_id not in DEMO_RESOURCES:
    return {"error": f"Recurso '{resource_id}' no existe"}
```
Rate Limiting
Consider adding rate limiting for production:
```python
import time

# Simple in-memory per-user rate limit (resets on Lambda cold start)
user_requests = {}

class RateLimitError(Exception):
    pass

def check_rate_limit(user_id, limit=10, window=60):
    now = time.time()
    requests = user_requests.get(user_id, [])
    # Keep only the requests inside the sliding window
    requests = [r for r in requests if r > now - window]
    if len(requests) >= limit:
        raise RateLimitError("Too many requests")
    requests.append(now)
    user_requests[user_id] = requests
```
Prompt Injection
The system prompt includes safety guidance:
```python
"NUNCA asumas el resultado — siempre verifica con la herramienta."
```
This reduces the risk of Claude making up authorization decisions instead of verifying them.
Debugging the Agent
The agent logs every tool call:
```python
logger.info(f"Tool call: {block['name']}({inp})")
```
Check CloudWatch Logs:
```bash
aws logs tail /aws/lambda/avp-agent --follow
```
Return Full Message History
The response includes the full conversation:
```json
{
  "response": "...",
  "messages": [
    {"role": "user", "content": "Can Alice read Q4?"},
    {"role": "assistant", "content": [
      {"type": "tool_use", "name": "check_avp_access", ...}
    ]},
    {"role": "user", "content": [
      {"type": "tool_result", "content": "..."}
    ]},
    {"role": "assistant", "content": "Yes, Alice can..."}
  ]
}
```
Use this to debug the agentic loop.
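For example, a small helper can pull every tool call out of the returned history to see exactly what the agent checked. This sketch (`extract_tool_calls` is an illustrative name) only assumes the message shape shown above:

```python
def extract_tool_calls(messages):
    """Return (name, input) pairs for every tool_use block in a conversation."""
    calls = []
    for msg in messages:
        content = msg.get("content")
        # Tool calls only appear in assistant turns with list-shaped content
        if msg.get("role") != "assistant" or not isinstance(content, list):
            continue
        for block in content:
            if block.get("type") == "tool_use":
                calls.append((block["name"], block["input"]))
    return calls
```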
Test Locally
You can test the agent logic without deploying:
```python
if __name__ == "__main__":
    # Mock environment
    os.environ["POLICY_STORE_ID"] = "ps-test"
    os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

    # Mock event
    event = {
        "body": json.dumps({
            "messages": [{"role": "user", "content": "Test question"}]
        })
    }
    result = lambda_handler(event, None)
    print(result)
```
Extending the Agent
You could add tools for:
```python
tools = [
    {"name": "check_avp_access", ...},
    {"name": "list_user_permissions", ...},
    {"name": "explain_policy", ...},
    {"name": "suggest_policy_changes", ...}
]
```
Multi-Language Support
Modify the system prompt so the agent mirrors the user’s language instead of always answering in Spanish:

```python
"Responde en el idioma que el usuario utilice."  # "Respond in whichever language the user uses."
```
Streaming Responses
For longer responses, use Anthropic’s streaming API: set `"stream": true` in the request body, and the endpoint returns server-sent events instead of a single JSON document:

```python
payload = json.dumps({
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 1000,
    "stream": True,  # Respond as a server-sent event stream
    "messages": current_messages
}).encode()
```
Then use API Gateway WebSockets to stream to the frontend.
Comparison: Agent vs Traditional UI
| Aspect | Traditional UI | AI Agent |
|---|---|---|
| Ease of Use | Requires understanding of users/resources | Natural language questions |
| Flexibility | Fixed set of queries | Open-ended questions |
| Explanation | Raw ALLOW/DENY | Contextual explanations |
| Discovery | User must know what to ask | Agent can suggest related queries |
| Complexity | Simple queries only | Can handle multi-step reasoning |
| Cost | Minimal (just Lambda) | Small AI API cost (~$0.0001/query) |
| Latency | 50-100ms | 2-5 seconds |
Best Practices
1. **Always Verify**: never let Claude assume; always call tools
2. **Comprehensive System Prompts**: be specific about behavior and constraints
3. **Good Tool Descriptions**: include examples and valid values
4. **Error Handling**: handle tool errors gracefully
5. **Logging**: log all tool calls for debugging
6. **Rate Limiting**: prevent abuse in production
7. **Timeout Tuning**: adjust based on query complexity
8. **Cost Monitoring**: track Anthropic API usage
Next Steps
- **Architecture Overview**: return to the architecture overview
- **Quick Start**: deploy and test the agent