Overview
The AI Agent is the most sophisticated component of the demo. It allows users to ask natural language questions about permissions and get intelligent, contextual answers by combining Claude’s reasoning with real-time AVP authorization checks.
Why an AI Agent?
Traditional authorization UIs require technical knowledge:
- Understanding user IDs and resource names
- Knowing which actions are available
- Interpreting authorization decisions
An AI agent makes authorization accessible:
- “Can Alice read the Q4 report?” ✅
- “Compare Bob’s and Carol’s access to HR documents” ✅
- “Why can’t Alice delete the sales dashboard?” ✅
Architecture Pattern: Secure Proxy
The agent follows a secure proxy pattern to protect the Anthropic API key:
```text
┌─────────┐            ┌─────────────┐            ┌────────────┐
│ Browser │───HTTPS───▶│   Lambda    │───HTTPS───▶│ Anthropic  │
│         │            │   Agent     │            │    API     │
└─────────┘            └──────┬──────┘            └────────────┘
                              │
                              │ is_authorized()
                              ▼
                       ┌─────────────┐
                       │     AVP     │
                       │ Policy Store│
                       └─────────────┘
```
**Key Security Property**: The Anthropic API key never leaves the Lambda function. The browser only sends conversation messages.
Never expose API keys in frontend code. Always use a backend proxy pattern for third-party API calls.
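As a sketch of the proxy boundary, the outbound request can be built entirely server-side so the key only ever comes from the Lambda environment. The helper name `build_anthropic_request` and its parameters are illustrative, not part of the demo code:

```python
import json
import urllib.request

def build_anthropic_request(messages, api_key, model="claude-haiku-4-5-20251001"):
    """Build the outbound Anthropic request server-side; the key never reaches the browser."""
    payload = json.dumps({
        "model": model,
        "max_tokens": 1000,
        "messages": messages,
    }).encode()
    return urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # Injected from the Lambda environment, never from the client
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )
```

The browser talks only to the Lambda's HTTPS endpoint; the key is attached inside this function.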
Lambda Configuration
The AgentFunction has special configuration:
```yaml
AgentFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: avp-agent
    CodeUri: lambda/
    Handler: agent.lambda_handler
    Description: "AI agent that queries AVP - secure proxy to the Anthropic API"
    Timeout: 60  # Longer timeout for AI processing
    Environment:
      Variables:
        POLICY_STORE_ID: !Ref PolicyStoreId
        ANTHROPIC_API_KEY: !Ref AnthropicApiKey  # Secure API key
```
**Key Differences from Other Functions**:

- **Timeout**: 60 seconds (vs 30) to accommodate multiple AI calls and tool executions
- **Environment**: includes `ANTHROPIC_API_KEY` from a CloudFormation `NoEcho` parameter
- **IAM**: the same `verifiedpermissions:IsAuthorized` permission as `CheckAccessFunction`
Request/Response Flow
The browser sends conversation messages:
```json
{
  "messages": [
    {
      "role": "user",
      "content": "Can Alice read the Q4 Report?"
    }
  ]
}
```
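Inside the Lambda, this body arrives wrapped in an API Gateway proxy event and must be parsed before it can be forwarded. A minimal hedged sketch (the `parse_messages` helper is illustrative, not from the demo code):

```python
import json

def parse_messages(event):
    """Extract the conversation messages from an API Gateway proxy event."""
    body = json.loads(event.get("body") or "{}")
    messages = body.get("messages", [])
    if not isinstance(messages, list) or not messages:
        raise ValueError("Request must include a non-empty 'messages' array")
    return messages
```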
For multi-turn conversations, include the full message history:
```json
{
  "messages": [
    {
      "role": "user",
      "content": "Can Alice read the Q4 Report?"
    },
    {
      "role": "assistant",
      "content": "Yes, Alice can read it because..."
    },
    {
      "role": "user",
      "content": "What about Bob?"
    }
  ]
}
```
The Lambda responds with the agent’s final answer plus the full conversation history:

```json
{
  "response": "Sí, Alice puede leer el Q4 Report porque...",
  "messages": [
    // Full conversation history including tool calls
  ]
}
```
The `messages` array includes internal tool calls for transparency and debugging.
The Agentic Loop
What is an Agentic Loop?
An agentic loop allows the AI to:
1. **Reason** about the user’s question
2. **Decide** if it needs to call a tool
3. **Execute** the tool
4. **Incorporate** results into its reasoning
5. **Repeat** until it has enough information
6. **Respond** to the user
This is more powerful than a simple function call because the AI decides when and how to use tools.
Implementation
The `run_agent` function (`agent.py:94`) implements the agentic loop:
```python
def run_agent(messages):
    """
    Runs the agentic loop:
    1. Call Claude with the available tools
    2. If Claude wants to use a tool, execute it and feed back the result
    3. Repeat until end_turn
    """
    current_messages = list(messages)
    for _ in range(10):  # max 10 iterations
        # Call the Anthropic API
        payload = json.dumps({
            "model": "claude-haiku-4-5-20251001",
            "max_tokens": 1000,
            "system": system,
            "tools": tools,
            "messages": current_messages
        }).encode()
        req = urllib.request.Request(
            "https://api.anthropic.com/v1/messages",
            data=payload,
            headers={
                "Content-Type": "application/json",
                "x-api-key": ANTHROPIC_API_KEY,
                "anthropic-version": "2023-06-01",
            },
            method="POST"
        )
        with urllib.request.urlopen(req) as resp:
            data = json.loads(resp.read())

        current_messages.append({"role": "assistant", "content": data["content"]})

        # Check the stop reason
        if data["stop_reason"] == "end_turn":
            # The AI is done - extract the response text
            text = " ".join(b["text"] for b in data["content"] if b["type"] == "text")
            return {"response": text, "messages": current_messages}

        if data["stop_reason"] == "tool_use":
            # The AI wants to use a tool
            tool_results = []
            for block in data["content"]:
                if block["type"] != "tool_use":
                    continue
                inp = block["input"]
                logger.info(f"Tool call: {block['name']}({inp})")
                result = check_avp_access(inp["user"], inp["action"], inp["resource"])
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block["id"],
                    "content": json.dumps(result)
                })
            current_messages.append({"role": "user", "content": tool_results})
            # The loop continues - the AI will process the tool results

    # Fallback message ("I could not complete the query.")
    return {"response": "No pude completar la consulta.", "messages": current_messages}
```
Loop Breakdown
**Iteration 1: Initial Question**

1. Input: “Can Alice read the Q4 Report?”
2. Claude reasoning: “I need to check AVP”
3. Stop reason: `tool_use`
4. Action: call `check_avp_access("alice", "Read", "Q4-Report-2024")`

**Iteration 2: Process Tool Result**

1. Input: tool result (ALLOW/DENY)
2. Claude reasoning: “Got the result, now I can answer”
3. Stop reason: `end_turn`
4. Action: return a natural language response
The loop can iterate up to 10 times, allowing Claude to make multiple tool calls for complex questions.
Anthropic API Integration
Model Selection
The agent uses Claude Haiku for fast, cost-effective responses:
```python
"model": "claude-haiku-4-5-20251001"
```
Why Haiku?
- Fast response times (1-2 seconds)
- Low cost ($0.25 per 1M input tokens)
- Sufficient reasoning for authorization questions
- Excellent tool use capabilities

For more complex reasoning, you could upgrade to:

- `claude-sonnet-4-5-20250514`: better reasoning, moderate cost
- `claude-opus-4-20250514`: best reasoning, higher cost
The request follows Anthropic’s Messages API format:
```python
payload = {
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 1000,
    "system": system,      # System prompt
    "tools": tools,        # Tool definitions
    "messages": messages   # Conversation history
}
```
Using urllib Instead of SDK
The code uses Python’s built-in urllib instead of the Anthropic SDK:
```python
req = urllib.request.Request(
    "https://api.anthropic.com/v1/messages",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "x-api-key": ANTHROPIC_API_KEY,
        "anthropic-version": "2023-06-01",
    },
    method="POST"
)
with urllib.request.urlopen(req) as resp:
    data = json.loads(resp.read())
```
Why not use the SDK?
- Avoids adding dependencies to the Lambda package
- Reduces cold start time
- Smaller deployment package
- Simple HTTP calls are sufficient for this use case
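One thing the SDK would otherwise handle is HTTP error reporting. A hedged sketch of mapping `urllib.error.HTTPError` (e.g. 429 for rate limits, 401 for a bad key) into the agent’s dict-shaped results; the `describe_api_error` helper is illustrative, not part of the demo code:

```python
import urllib.error

def describe_api_error(err):
    """Translate an error from the Anthropic endpoint into a result dict."""
    if isinstance(err, urllib.error.HTTPError):
        # HTTPError carries the status code and reason from the response
        return {"error": f"Anthropic API returned {err.code}: {err.reason}"}
    return {"error": str(err)}
```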
Tools are defined in JSON Schema format (`agent.py:102`):
```python
tools = [{
    "name": "check_avp_access",
    "description": (
        "Verifica en AWS Verified Permissions si un usuario puede ejecutar "
        "una acción sobre un recurso. "
        "Usuarios: alice (Analyst/Finance), bob (Admin/Finance), carol (Auditor/HR). "
        "Acciones: Read, Edit, Delete. "
        "Recursos: Q4-Report-2024, HR-Payroll-2024, Sales-Dashboard."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "user": {"type": "string", "description": "alice, bob, o carol"},
            "action": {"type": "string", "description": "Read, Edit, o Delete"},
            "resource": {"type": "string", "description": "Q4-Report-2024, HR-Payroll-2024, o Sales-Dashboard"}
        },
        "required": ["user", "action", "resource"]
    }
}]
```
**Key Components**:

- **Name**: identifies the tool (must be unique)
- **Description**: tells Claude when and how to use the tool. This is critical: Claude uses it to decide whether to call the tool.
- **Input Schema**: defines the parameters using JSON Schema
The description is crucial for tool use. Be specific about what the tool does and include examples of valid inputs.
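Since Claude fills in the parameters itself, it is also worth checking tool input against the schema’s `required` list before executing it. A minimal sketch (`validate_tool_input` is an illustrative helper, not in the demo):

```python
def validate_tool_input(inp, schema):
    """Return the list of required keys missing from a tool_use input block."""
    return [key for key in schema.get("required", []) if key not in inp]
```

If the returned list is non-empty, the agent can send an error back as the tool result instead of crashing mid-loop.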
When Claude wants to use the tool, the agent executes it:
```python
if data["stop_reason"] == "tool_use":
    tool_results = []
    for block in data["content"]:
        if block["type"] != "tool_use":
            continue
        inp = block["input"]
        logger.info(f"Tool call: {block['name']}({inp})")
        result = check_avp_access(inp["user"], inp["action"], inp["resource"])
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": json.dumps(result)
        })
    current_messages.append({"role": "user", "content": tool_results})
```
The tool implementation (`agent.py:41`) is similar to the main `CheckAccessFunction`, but returns a dict instead of an HTTP response:
```python
def check_avp_access(user_id, action_id, resource_id):
    if user_id not in DEMO_USERS:
        return {"error": f"Usuario '{user_id}' no existe"}
    if resource_id not in DEMO_RESOURCES:
        return {"error": f"Recurso '{resource_id}' no existe"}

    user = DEMO_USERS[user_id]
    resource = DEMO_RESOURCES[resource_id]
    try:
        response = avp_client.is_authorized(
            policyStoreId=POLICY_STORE_ID,
            principal={"entityType": "FinancialApp::User", "entityId": user_id},
            action={"actionType": "FinancialApp::Action", "actionId": action_id},
            resource={"entityType": "FinancialApp::Document", "entityId": resource_id},
            entities={"entityList": [...]}
        )
        decision = response["decision"]
        return {
            "decision": decision,
            "allowed": decision == "ALLOW",
            "user": {**user, "id": user_id},
            "action": action_id,
            "resource": resource_id,
            "resource_info": resource,
            "message": (
                f"✅ ACCESO PERMITIDO: {user['name']} puede {action_id} en {resource_id}"
                if decision == "ALLOW" else
                f"🚫 ACCESO DENEGADO: {user['name']} no tiene permisos para {action_id} en {resource_id}"
            )
        }
    except Exception as e:
        return {"error": str(e)}
```
System Prompt
The system prompt guides Claude’s behavior (`agent.py:122`):
```python
system = (
    "Eres un agente de seguridad experto en AWS Verified Permissions. "
    "Responde preguntas sobre permisos usando la herramienta check_avp_access. "
    "NUNCA asumas el resultado — siempre verifica con la herramienta. "
    "Si preguntan por múltiples usuarios o recursos, verifica cada combinación. "
    "Explica brevemente por qué AVP tomó esa decisión (RBAC/ABAC/Cedar). "
    "Sé conciso. Responde siempre en español."
)
```
**Prompt Design Principles**:

1. **Role Definition**: “Eres un agente de seguridad experto”
2. **Tool Guidance**: “Responde preguntas… usando la herramienta”
3. **Safety Constraint**: “NUNCA asumas el resultado”
4. **Completeness**: “Si preguntan por múltiples… verifica cada combinación”
5. **Explanation**: “Explica brevemente por qué”
6. **Style**: “Sé conciso. Responde siempre en español”
Good system prompts are specific about behavior, include constraints, and set expectations for output format.
Example Interactions
Simple Query
**User**: “Can Alice read the Q4 Report?”

**Agent Process**:

1. Call the Anthropic API with the user’s question
2. Claude decides to use `check_avp_access`
3. Execute: `check_avp_access("alice", "Read", "Q4-Report-2024")`
4. AVP returns: `ALLOW`
5. Claude synthesizes the response

**Response**: “Sí, Alice puede leer el Q4 Report porque es del departamento Finance, igual que el documento, y su nivel de clearance (2) es suficiente para documentos confidenciales.” (“Yes, Alice can read the Q4 Report because she is in the Finance department, like the document, and her clearance level (2) is sufficient for confidential documents.”)
Complex Multi-Step Query
**User**: “Compare Alice’s and Bob’s access to all Finance documents”

**Agent Process**:

1. Claude identifies the need for multiple checks
2. Calls:
   - `check_avp_access("alice", "Read", "Q4-Report-2024")`
   - `check_avp_access("alice", "Edit", "Q4-Report-2024")`
   - `check_avp_access("alice", "Delete", "Q4-Report-2024")`
   - `check_avp_access("bob", "Read", "Q4-Report-2024")`
   - `check_avp_access("bob", "Edit", "Q4-Report-2024")`
   - `check_avp_access("bob", "Delete", "Q4-Report-2024")`
3. Synthesizes the results

**Response**: “Alice puede leer el Q4 Report pero no editarlo ni borrarlo. Bob, como Admin del departamento Finance, tiene acceso completo: puede leer, editar y borrar el documento.” (“Alice can read the Q4 Report but cannot edit or delete it. Bob, as an Admin in the Finance department, has full access: he can read, edit, and delete the document.”)
Error Handling
**User**: “Can Charlie read the Q4 Report?”

**Agent Process**:

1. Call `check_avp_access("charlie", "Read", "Q4-Report-2024")`
2. Tool returns: `{"error": "Usuario 'charlie' no existe"}`
3. Claude handles the error gracefully

**Response**: “El usuario ‘charlie’ no existe en el sistema. Los usuarios disponibles son: alice (Analyst), bob (Admin), y carol (Auditor).” (“The user ‘charlie’ does not exist in the system. The available users are: alice (Analyst), bob (Admin), and carol (Auditor).”)
Claude can make multiple tool calls in a single response:
```python
for block in data["content"]:
    if block["type"] == "tool_use":
        ...  # Multiple tools can be called in one response
```
This reduces the number of round trips to Anthropic.
Token Usage
Typical token usage:

- Simple query: ~200 input tokens, ~100 output tokens
- Complex query: ~500 input tokens, ~300 output tokens

At Haiku pricing:

- Input: $0.25 per 1M tokens
- Output: $1.25 per 1M tokens

Cost per query: roughly $0.0001 to $0.001 (very cheap)
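As a sanity check, those per-query figures follow directly from the token counts and the listed rates (the `query_cost` helper below is illustrative, not part of the demo):

```python
def query_cost(input_tokens, output_tokens, input_rate=0.25, output_rate=1.25):
    """Cost in dollars, given per-1M-token rates."""
    return input_tokens * input_rate / 1e6 + output_tokens * output_rate / 1e6

simple = query_cost(200, 100)   # ≈ $0.000175
complex_ = query_cost(500, 300)  # ≈ $0.0005
```

Both values land comfortably inside the $0.0001 to $0.001 range quoted above.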
Timeout Considerations
The 60-second timeout accommodates:
- Up to 10 loop iterations
- ~3-5 seconds per Anthropic API call
- Tool execution time (~50ms per AVP call)
- Network latency
In practice, most queries complete in 5-10 seconds.
Security Considerations
API Key Protection
✅ **Good**: API key in a Lambda environment variable

```python
ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]
```

❌ **Bad**: API key in frontend code

```javascript
const API_KEY = "sk-ant-..."; // Never do this!
```
The agent validates inputs before calling AVP:
```python
if user_id not in DEMO_USERS:
    return {"error": f"Usuario '{user_id}' no existe"}
if resource_id not in DEMO_RESOURCES:
    return {"error": f"Recurso '{resource_id}' no existe"}
```
Rate Limiting
Consider adding rate limiting for production:
```python
import time

# Simple in-memory per-user rate limit (resets on Lambda cold start)
user_requests = {}

class RateLimitError(Exception):
    pass

def check_rate_limit(user_id, limit=10, window=60):
    now = time.time()
    requests = user_requests.get(user_id, [])
    # Keep only the requests inside the sliding window
    requests = [r for r in requests if r > now - window]
    if len(requests) >= limit:
        raise RateLimitError("Too many requests")
    requests.append(now)
    user_requests[user_id] = requests
```
Prompt Injection
The system prompt includes safety guidance:
```python
"NUNCA asumas el resultado — siempre verifica con la herramienta."
```
This reduces the risk of Claude making up authorization decisions instead of verifying them.
Debugging the Agent
The agent logs every tool call:
```python
logger.info(f"Tool call: {block['name']}({inp})")
```
Check CloudWatch Logs:
```bash
aws logs tail /aws/lambda/avp-agent --follow
```
Return Full Message History
The response includes the full conversation:
```json
{
  "response": "...",
  "messages": [
    {"role": "user", "content": "Can Alice read Q4?"},
    {"role": "assistant", "content": [
      {"type": "tool_use", "name": "check_avp_access", ...}
    ]},
    {"role": "user", "content": [
      {"type": "tool_result", "content": "..."}
    ]},
    {"role": "assistant", "content": "Yes, Alice can..."}
  ]
}
```
Use this to debug the agentic loop.
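For example, a small helper can pull every tool call out of the returned history to see exactly what the agent checked. This sketch (`extract_tool_calls` is an illustrative name) only assumes the message shape shown above:

```python
def extract_tool_calls(messages):
    """Return (name, input) pairs for every tool_use block in a conversation."""
    calls = []
    for msg in messages:
        content = msg.get("content")
        # Tool calls only appear in assistant turns with list-shaped content
        if msg.get("role") != "assistant" or not isinstance(content, list):
            continue
        for block in content:
            if block.get("type") == "tool_use":
                calls.append((block["name"], block["input"]))
    return calls
```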
Test Locally
You can test the agent logic without deploying:
```python
if __name__ == "__main__":
    # Mock environment
    os.environ["POLICY_STORE_ID"] = "ps-test"
    os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

    # Mock event
    event = {
        "body": json.dumps({
            "messages": [{"role": "user", "content": "Test question"}]
        })
    }
    result = lambda_handler(event, None)
    print(result)
```
Extending the Agent
You could add tools for:
```python
tools = [
    {"name": "check_avp_access", ...},
    {"name": "list_user_permissions", ...},
    {"name": "explain_policy", ...},
    {"name": "suggest_policy_changes", ...}
]
```
Multi-Language Support
Modify the system prompt so the agent mirrors the user’s language instead of always answering in Spanish:

```python
"Responde en el idioma que el usuario utilice."  # "Respond in whichever language the user uses."
```
Streaming Responses
For longer responses, use Anthropic’s streaming API: set `"stream": true` in the request body, and the endpoint returns server-sent events instead of a single JSON document:

```python
payload = json.dumps({
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 1000,
    "stream": True,  # Respond as a server-sent event stream
    "messages": current_messages
}).encode()
```
Then use API Gateway WebSockets to stream to the frontend.
Comparison: Agent vs Traditional UI
| Aspect | Traditional UI | AI Agent |
|---|---|---|
| Ease of Use | Requires understanding of users/resources | Natural language questions |
| Flexibility | Fixed set of queries | Open-ended questions |
| Explanation | Raw ALLOW/DENY | Contextual explanations |
| Discovery | User must know what to ask | Agent can suggest related queries |
| Complexity | Simple queries only | Can handle multi-step reasoning |
| Cost | Minimal (just Lambda) | Small AI API cost (~$0.0001/query) |
| Latency | 50-100ms | 2-5 seconds |
Best Practices
1. **Always Verify**: never let Claude assume; always call tools
2. **Comprehensive System Prompts**: be specific about behavior and constraints
3. **Good Tool Descriptions**: include examples and valid values
4. **Error Handling**: handle tool errors gracefully
5. **Logging**: log all tool calls for debugging
6. **Rate Limiting**: prevent abuse in production
7. **Timeout Tuning**: adjust based on query complexity
8. **Cost Monitoring**: track Anthropic API usage
Next Steps
- **Architecture Overview**: return to the architecture overview
- **Quick Start**: deploy and test the agent