Sonore Phone Agent uses a dynamic instruction system that allows each tenant to customize how the AI agent behaves during calls. Instructions control the agent’s personality, capabilities, and conversation flow.
Overview
The instruction system has two main components:
- Greeting: The initial message spoken when the call starts
- Instructions: System prompt that defines the agent’s behavior
Both are loaded dynamically from the database and cached for performance.
Data Model
Tenant Prompt State
Each tenant has a state document that points to active prompts:
{
  "_id": "acme-corp",
  "active": {
    "greeting_id": "65a1b2c3d4e5f6a7b8c9d0e1",
    "instruction_id": "65a1b2c3d4e5f6a7b8c9d0e2"
  },
  "updated_at": "2024-01-15T10:30:00Z"
}
Prompt Texts
Prompt content is stored separately for reusability:
{
  "_id": "65a1b2c3d4e5f6a7b8c9d0e2",
  "tenant_id": "acme-corp",
  "prompt_text": "You are a helpful customer service agent for Acme Corporation...",
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:30:00Z"
}
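To make the pointer indirection concrete, the sketch below resolves a state document into its prompt texts using plain in-memory dictionaries in place of MongoDB. The `resolve_active_texts` helper, the dictionary names, and the sample greeting text are illustrative assumptions, not part of the service.

```python
from typing import Any, Dict, Optional

# In-memory stand-ins for the two collections (shapes follow the documents above).
STATE_DOCS: Dict[str, Dict[str, Any]] = {
    "acme-corp": {
        "_id": "acme-corp",
        "active": {
            "greeting_id": "65a1b2c3d4e5f6a7b8c9d0e1",
            "instruction_id": "65a1b2c3d4e5f6a7b8c9d0e2",
        },
    }
}
PROMPT_TEXTS: Dict[str, Dict[str, str]] = {
    "65a1b2c3d4e5f6a7b8c9d0e1": {
        "tenant_id": "acme-corp",
        "prompt_text": "Thank you for calling Acme.",  # sample text, assumed
    },
    "65a1b2c3d4e5f6a7b8c9d0e2": {
        "tenant_id": "acme-corp",
        "prompt_text": "You are a helpful customer service agent for Acme Corporation...",
    },
}


def resolve_active_texts(tenant_id: str) -> Dict[str, Optional[str]]:
    """Hypothetical helper: follow the state pointers to the prompt texts."""
    resolved: Dict[str, Optional[str]] = {"greeting": None, "instruction": None}
    state = STATE_DOCS.get(tenant_id)
    if state is None:
        return resolved
    for key, pointer in (("greeting", "greeting_id"), ("instruction", "instruction_id")):
        doc = PROMPT_TEXTS.get(state["active"].get(pointer))
        # Accept a text only if it belongs to the same tenant as the pointer.
        if doc is not None and doc["tenant_id"] == tenant_id:
            resolved[key] = doc["prompt_text"]
    return resolved
```

Because the state document only holds IDs, switching a tenant to a different prompt means changing one pointer; the prompt text documents themselves stay untouched.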
Active Instructions Model
The resolved instructions are represented as:
class ActiveInstructions:
    tenant_id: str
    greeting_text: str
    instruction_text: str
    greeting_id: str | None
    instruction_id: str | None
    updated_at: datetime | None
Instruction Loading
The InstructionReader service handles loading and caching instructions.
Service Initialization
# From: src/apps/calls/app/instructions_service.py:49-67
class InstructionReader:
    def __init__(
        self,
        client=mongo_client,
        *,
        state_collection: str = "tenant-prompt-state",
        texts_collection: str = "prompt-texts",
        base_greeting: str = BASELINE_GREETING,
        ttl_seconds: float = 3600,
    ) -> None:
        if client is None:
            raise ValueError("InstructionReader: mongo client is None")
        self.client = client
        self.state_collection = state_collection
        self.texts_collection = texts_collection
        self.base_greeting = base_greeting
        self._ttl_seconds = ttl_seconds
        self._cache: dict[str, CacheEntry] = {}
        self._locks: dict[str, asyncio.Lock] = {}
Loading Process
Instructions are loaded through a multi-step process:
- Check Cache: First check if valid cached instructions exist for the tenant
- Acquire Lock: If cache miss, acquire a per-tenant lock to prevent duplicate loads
- Load Pointers: Fetch the active prompt pointers from tenant-prompt-state
- Fetch Texts: Load the actual prompt texts from the prompt-texts collection
- Apply Fallbacks: Use baseline greeting if custom greeting is missing
- Update Cache: Store the resolved instructions in cache with TTL
Full Loading Implementation
# From: src/apps/calls/app/instructions_service.py:96-256
async def get_prompt_by_tenant(self, tenant_id: str) -> ActiveInstructions:
    # Check cache first
    try:
        cached = self._get_cached(tenant_id)
        if cached is not None:
            return cached
    except CacheEntryExpiredError:
        logger.info("Cache entry expired for tenant_id=%s", tenant_id)

    async with self._lock_for_tenant(tenant_id):
        # Re-check cache inside lock
        try:
            cached = self._get_cached(tenant_id)
            if cached is not None:
                return cached
        except CacheEntryExpiredError:
            logger.info("Cache entry expired for tenant_id=%s", tenant_id)

        try:
            prompt_pointer = await self.get_active_prompts_by_tenant(tenant_id)
        except (PyMongoError, ServerSelectionTimeoutError) as e:
            raise InstructionsDBError(
                tenant_id=tenant_id,
                reason="state_read_failed",
                operation="get_active_prompts_by_tenant",
                cause=e,
            ) from e

        if prompt_pointer is None:
            raise TenantNotConfiguredError(
                tenant_id=tenant_id,
                reason="state_doc_missing_or_invalid",
                context={"collection": self.state_collection},
            )

        # Instruction pointer is mandatory for prod correctness
        if not prompt_pointer.instruction_id:
            raise TenantNotConfiguredError(
                tenant_id=tenant_id,
                reason="instruction_pointer_missing",
                context={"collection": self.state_collection},
            )

        active_instructions = ActiveInstructions(
            tenant_id=tenant_id,
            greeting_text="",
            instruction_text="",
            updated_at=None,
        )

        # Fetch greeting (soft-fail)
        greeting: PromptText | None = None
        if prompt_pointer.greeting_id:
            try:
                greeting = await self.get_prompt_text_by_id(
                    prompt_id=prompt_pointer.greeting_id,
                    tenant_id=tenant_id,
                )
            except (PyMongoError, ServerSelectionTimeoutError) as e:
                raise InstructionsDBError(
                    tenant_id=tenant_id,
                    reason="greeting_read_failed",
                    operation="get_prompt_text_by_id",
                    context={"prompt_id": prompt_pointer.greeting_id},
                    cause=e,
                ) from e

        # Fetch instruction (hard-fail if missing/invalid/empty)
        instruction_prompt: PromptText | None = None
        try:
            instruction_prompt = await self.get_prompt_text_by_id(
                prompt_id=prompt_pointer.instruction_id,
                tenant_id=tenant_id,
            )
        except (PyMongoError, ServerSelectionTimeoutError) as e:
            raise InstructionsDBError(
                tenant_id=tenant_id,
                reason="instruction_read_failed",
                operation="get_prompt_text_by_id",
                context={"prompt_id": prompt_pointer.instruction_id},
                cause=e,
            ) from e

        if (
            instruction_prompt is None
            or not (instruction_prompt.prompt_text or "").strip()
        ):
            raise InstructionsMissingError(
                tenant_id=tenant_id,
                reason=(
                    "instruction_doc_not_found"
                    if instruction_prompt is None
                    else "instruction_text_empty"
                ),
                greeting_id=prompt_pointer.greeting_id,
                instruction_id=prompt_pointer.instruction_id,
                context={"texts_collection": self.texts_collection},
            )

        # Apply greeting (soft-fail -> baseline)
        if greeting is None or not (greeting.prompt_text or "").strip():
            if prompt_pointer.greeting_id:
                logger.warning(
                    "Greeting missing/empty; using baseline greeting. tenant_id=%s prompt_id=%s",
                    tenant_id,
                    prompt_pointer.greeting_id,
                )
            else:
                logger.warning(
                    "Greeting pointer missing; using baseline greeting. tenant_id=%s",
                    tenant_id,
                )
            active_instructions.greeting_text = self.base_greeting
            active_instructions.greeting_id = None
        else:
            active_instructions.greeting_text = greeting.prompt_text
            active_instructions.greeting_id = (
                str(greeting.id) if greeting.id is not None else None
            )

        # Apply instruction (already validated non-empty)
        active_instructions.instruction_text = instruction_prompt.prompt_text
        active_instructions.instruction_id = (
            str(instruction_prompt.id)
            if instruction_prompt.id is not None
            else None
        )

        # Set cache value
        self._set_cache(tenant_id, active_instructions)
        return active_instructions
Caching Strategy
The instruction system uses an in-memory cache with TTL to balance performance and freshness.
Cache Structure
# From: src/apps/calls/app/instructions_service.py:29-32
@dataclass(frozen=True, slots=True)
class CacheEntry:
    value: ActiveInstructions
    expires_at: float
Cache Operations
# From: src/apps/calls/app/instructions_service.py:69-94
def _lock_for_tenant(self, tenant_id: str) -> asyncio.Lock:
    lock = self._locks.get(tenant_id)
    if lock is None:
        lock = asyncio.Lock()
        self._locks[tenant_id] = lock
    return lock

def invalidate_cache_for_tenant(self, tenant_id: str) -> None:
    self._cache.pop(tenant_id, None)

def _get_cached(self, tenant_id: str) -> ActiveInstructions | None:
    entry = self._cache.get(tenant_id)
    if entry is None:
        logger.debug("instructions_cache_miss tenant_id=%s", tenant_id)
        return None
    if time.monotonic() >= entry.expires_at:
        self._cache.pop(tenant_id, None)
        logger.debug("instructions_cache_expired tenant_id=%s", tenant_id)
        raise CacheEntryExpiredError(tenant_id=tenant_id)
    logger.debug("instructions_cache_hit tenant_id=%s", tenant_id)
    return entry.value

def _set_cache(self, tenant_id: str, instructions: ActiveInstructions) -> None:
    expires_at = time.monotonic() + self._ttl_seconds
    self._cache[tenant_id] = CacheEntry(value=instructions, expires_at=expires_at)
The default cache TTL is 1 hour (3600 seconds). Instructions are cached per-tenant with independent expiration.
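Per-tenant expiration can be illustrated with a small standalone cache. This is a simplified sketch, not the service code: `TTLCache`, `Entry`, and `FakeClock` are hypothetical names, the clock is injectable so expiry can be shown without sleeping, and an expired entry simply returns `None` here instead of raising `CacheEntryExpiredError`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Entry:
    value: str
    expires_at: float


class FakeClock:
    """Manually advanced clock (stand-in for time.monotonic in this demo)."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class TTLCache:
    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Entry] = {}

    def set(self, tenant_id: str, value: str) -> None:
        # Each entry carries its own deadline, so tenants expire independently.
        self._entries[tenant_id] = Entry(value, self._clock() + self._ttl)

    def get(self, tenant_id: str) -> Optional[str]:
        entry = self._entries.get(tenant_id)
        if entry is None:
            return None
        if self._clock() >= entry.expires_at:
            self._entries.pop(tenant_id, None)  # drop the stale entry
            return None
        return entry.value


clock = FakeClock()
cache = TTLCache(ttl_seconds=10, clock=clock)
cache.set("tenant-a", "A")  # deadline at t=10
clock.advance(6)
cache.set("tenant-b", "B")  # deadline at t=16
clock.advance(5)            # now t=11: tenant-a is expired, tenant-b is still valid
```

Using a monotonic clock, as the service does, keeps deadlines immune to wall-clock adjustments.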
Double-Checked Locking
The cache uses double-checked locking so that concurrent calls for the same tenant trigger only one database load:
- Check cache without lock (fast path)
- If miss, acquire per-tenant lock
- Re-check cache with lock held (another coroutine may have loaded the entry in the meantime)
- If still miss, load from database
- Update cache and release lock
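The steps above can be sketched with a self-contained asyncio example. `OnceLoader` and its simulated `_load_from_db` are hypothetical stand-ins for `InstructionReader` and the MongoDB reads; the example only demonstrates that several concurrent requests for one tenant result in a single load.

```python
import asyncio
from typing import Dict


class OnceLoader:
    def __init__(self) -> None:
        self._cache: Dict[str, str] = {}
        self._locks: Dict[str, asyncio.Lock] = {}
        self.db_loads = 0  # counts simulated database reads

    def _lock_for(self, tenant_id: str) -> asyncio.Lock:
        lock = self._locks.get(tenant_id)
        if lock is None:
            lock = asyncio.Lock()
            self._locks[tenant_id] = lock
        return lock

    async def _load_from_db(self, tenant_id: str) -> str:
        self.db_loads += 1
        await asyncio.sleep(0.01)  # simulated I/O; lets the other tasks run
        return f"instructions for {tenant_id}"

    async def get(self, tenant_id: str) -> str:
        cached = self._cache.get(tenant_id)  # first check, no lock
        if cached is not None:
            return cached
        async with self._lock_for(tenant_id):
            cached = self._cache.get(tenant_id)  # second check, lock held
            if cached is not None:
                return cached
            value = await self._load_from_db(tenant_id)
            self._cache[tenant_id] = value
            return value


async def demo() -> int:
    loader = OnceLoader()
    # Five concurrent requests for the same tenant.
    await asyncio.gather(*(loader.get("acme-corp") for _ in range(5)))
    return loader.db_loads
```

Because the lock is per tenant, a slow load for one tenant does not block cache misses for other tenants.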
Error Handling
The instruction system distinguishes three failure modes, each with its own exception type.
Error Categories
TenantNotConfiguredError
Raised when the tenant has no state document, or the state document has no instruction pointer:
if prompt_pointer is None:
    raise TenantNotConfiguredError(
        tenant_id=tenant_id,
        reason="state_doc_missing_or_invalid",
        context={"collection": self.state_collection},
    )
Resolution: Call is rejected. Ensure tenant has a document in tenant-prompt-state.
InstructionsMissingError
Raised when the instruction document is missing or its text is empty:
if (
    instruction_prompt is None
    or not (instruction_prompt.prompt_text or "").strip()
):
    raise InstructionsMissingError(
        tenant_id=tenant_id,
        reason=(
            "instruction_doc_not_found"
            if instruction_prompt is None
            else "instruction_text_empty"
        ),
        greeting_id=prompt_pointer.greeting_id,
        instruction_id=prompt_pointer.instruction_id,
        context={"texts_collection": self.texts_collection},
    )
Resolution: Call is rejected. Ensure the instruction document exists and has non-empty text.
InstructionsDBError
Raised when database operations fail:
try:
    prompt_pointer = await self.get_active_prompts_by_tenant(tenant_id)
except (PyMongoError, ServerSelectionTimeoutError) as e:
    raise InstructionsDBError(
        tenant_id=tenant_id,
        reason="state_read_failed",
        operation="get_active_prompts_by_tenant",
        cause=e,
    ) from e
Resolution: Call may use fallback prompts or be rejected, depending on configuration.
Fallback Mechanism
When the database is unavailable, the system can use fallback prompts:
# From: src/apps/calls/api/v1/endpoints/openai_webhook.py:390-407
except InstructionsDBError as e:
    log_event(
        logging.ERROR, "instructions_db_error", call_id, error=e.to_log_dict()
    )
    await metrics_store.record_instructions_db_error(
        call_id=call_id, tenant_id=tenant_id
    )
    # Proceed with baseline fallback prompts (keep call service available during DB outage)
    used_fallback = True
    instructions = ActiveInstructions(
        tenant_id=tenant_id,
        greeting_text=DOWNTIME_GREETING,
        instruction_text=DOWNTIME_PROMPT,
        greeting_id=None,
        instruction_id=None,
        updated_at=None,
    )
Fallback prompts are generic and may not reflect tenant-specific requirements. Monitor fallback usage in production.
Soft vs Hard Failures
Greeting (Soft Failure):
- If the greeting is missing or empty, use the baseline greeting
- Call continues normally
- Warning logged
Instruction (Hard Failure):
- If the instruction is missing or empty, reject the call
- InstructionsMissingError raised
- Call cannot proceed without instructions
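The asymmetry between the two failure modes can be reduced to a few lines. The following is a minimal sketch with hypothetical names (`resolve_texts`, `InstructionsMissing`); it mirrors the emptiness checks in the loader but omits logging and database access.

```python
from typing import Optional, Tuple


class InstructionsMissing(Exception):
    """Stand-in for InstructionsMissingError in this sketch."""


def resolve_texts(
    greeting: Optional[str],
    instruction: Optional[str],
    baseline_greeting: str,
) -> Tuple[str, str]:
    # Hard failure: without a non-empty instruction the call cannot proceed.
    if not (instruction or "").strip():
        raise InstructionsMissing("instruction text missing or empty")
    # Soft failure: a missing or blank greeting is replaced by the baseline.
    if not (greeting or "").strip():
        greeting = baseline_greeting
    return greeting, instruction
```

For example, `resolve_texts(None, "Be helpful.", "Hello!")` returns `("Hello!", "Be helpful.")`, while a blank instruction raises regardless of the greeting.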
Retry Logic
Database operations use retry logic for transient failures:
# From: src/apps/calls/app/instructions_service.py:258-312
@retry(
    max_attempts=3,
    delay=0.25,
    exceptions=(
        PyMongoError,
        ServerSelectionTimeoutError,
        TimeoutError,
        ConnectionError,
    ),
    return_none_on_fail=True,
    retry_on_none=False,
)
async def get_prompt_text_by_id(
    self, prompt_id: str, tenant_id: str
) -> PromptText | None:
    """
    Retries ONLY on DB/network exceptions.
    - Invalid ObjectId: returns None immediately (no retry triggered).
    - Validation error: returns None immediately (no retry triggered).
    - Not found: returns None immediately (no retry, since retry_on_none=False).
    """
    oid = to_object_id(prompt_id)
    if oid is None:
        logger.error(
            "Invalid prompt_id (not an ObjectId): tenant_id=%s prompt_id=%s",
            tenant_id,
            prompt_id,
        )
        return None

    try:
        doc = await fetch_from_mongodb(
            query={"_id": oid, "tenant_id": tenant_id},
            client=self.client,
            collection=self.texts_collection,
        )
    except (PyMongoError, ServerSelectionTimeoutError):
        raise
    except Exception as e:
        # Let retry treat this as transient by raising
        raise PyMongoError(str(e)) from e

    if not doc:
        return None

    try:
        return PromptText.model_validate(doc)
    except ValidationError as e:
        logger.error(
            "Invalid prompt document shape for tenant_id=%s prompt_id=%s error=%s",
            tenant_id,
            prompt_id,
            e,
        )
        return None
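The `retry` decorator itself is not shown in this document. The sketch below is one plausible reading of its parameters, written as an assumption for illustration only: the actual decorator in the codebase may differ in backoff, logging, or how it treats `None`.

```python
import asyncio
import functools


def retry(*, max_attempts=3, delay=0.25, exceptions=(Exception,),
          return_none_on_fail=False, retry_on_none=False):
    """Assumed semantics for an async retry decorator (illustrative only)."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(1, max_attempts + 1):
                try:
                    result = await func(*args, **kwargs)
                except exceptions as exc:
                    last_exc = exc  # listed exception: eligible for another attempt
                else:
                    if result is not None or not retry_on_none:
                        return result  # a value, or a None that must not be retried
                    last_exc = None
                if attempt < max_attempts:
                    await asyncio.sleep(delay)
            if last_exc is not None and not return_none_on_fail:
                raise last_exc
            return None  # attempts exhausted

        return wrapper

    return decorator


attempts = {"flaky": 0, "missing": 0}


@retry(max_attempts=3, delay=0, exceptions=(ConnectionError,), return_none_on_fail=True)
async def flaky_read() -> str:
    attempts["flaky"] += 1
    if attempts["flaky"] < 3:
        raise ConnectionError("transient network failure")
    return "prompt text"


@retry(max_attempts=3, delay=0, exceptions=(ConnectionError,), return_none_on_fail=True)
async def missing_read() -> None:
    attempts["missing"] += 1
    return None  # "not found": returned at once because retry_on_none=False
```

Under these assumed semantics, `flaky_read` succeeds on its third attempt, while `missing_read` runs exactly once. Note that with `return_none_on_fail=True` an exhausted retry also yields `None`, so a caller of such a decorator would not be able to tell a persistent outage from a missing document by the return value alone.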
Usage in Call Session
Instructions are loaded during webhook processing and passed to the call session:
# From: src/apps/calls/api/v1/endpoints/openai_webhook.py:340-407
try:
    instruction_reader: InstructionReader = request.app.state.instruction_reader
    instructions = await instruction_reader.get_prompt_by_tenant(tenant_id)
except TenantNotConfiguredError as e:
    log_event(logging.ERROR, "tenant_not_configured", str(e))
    await metrics_store.record_reject_tenant_not_configured(
        call_id=call_id, tenant_id=tenant_id
    )
    try:
        await openai_calls_service.reject_call(
            call_id, idempotency_key=f"reject_tenant_not_configured_{webhook_id}"
        )
    except Exception as reject_e:
        log_event(
            logging.ERROR, "tenant_not_configured_reject_failed", str(reject_e)
        )
    finally:
        await _release_pending_capacity_state(request, call_id)
    return JSONResponse(
        status_code=status.HTTP_200_OK,
        content={"ok": True, "rejected": "tenant_not_configured"},
    )
The instructions are then sent to the AI during session initialization:
# Instructions are passed to CallSession
session = CallSession(
    call_id=call_id,
    db_client=self.db_client,
    caller_number=caller_number,
    tenant_id=tenant_id,
    instructions=instructions,  # ActiveInstructions object
    cfg=cfg,
    tools_build=tools_build,
    tool_executor=tool_executor,
    metrics_store=self.metrics_store,
)
Best Practices
Instruction Writing
- Be Specific: Clearly define the agent’s role, capabilities, and limitations
- Include Context: Provide relevant business context and brand voice guidelines
- Define Boundaries: Specify what the agent should and shouldn’t do
- Test Variations: Test instructions with different call scenarios
Greeting Design
- Keep greetings concise (1-2 sentences)
- Include essential information only
- Match the brand’s tone and language
- Consider multilingual requirements
Cache Management
- Set appropriate TTL based on update frequency
- Invalidate cache when updating instructions
- Monitor cache hit rates for performance tuning
Error Handling
- Always provide fallback instructions for critical tenants
- Monitor InstructionsDBError occurrences
- Set up alerts for instruction loading failures
Updating Instructions
Update Process
- Create New Prompt Text: Insert a new document in prompt-texts with the updated content
- Update State Pointer: Update the active field in tenant-prompt-state to reference the new prompt
- Invalidate Cache: Optionally call invalidate_cache_for_tenant to apply the change immediately to new calls; otherwise the cached instructions remain in use until the TTL expires (up to one hour by default)
- Monitor: Monitor new calls to verify the updated instructions are working as expected
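The update sequence can be sketched end to end with in-memory dictionaries standing in for the two collections and the cache. Everything here is an illustrative assumption: `publish_instruction` is a hypothetical helper, and a UUID hex string replaces the MongoDB ObjectId the real collections use.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict

# In-memory stand-ins for prompt-texts, tenant-prompt-state, and the reader cache.
prompt_texts: Dict[str, Dict[str, Any]] = {
    "i1": {"tenant_id": "acme-corp", "prompt_text": "v1 instructions"},
}
tenant_state: Dict[str, Dict[str, Any]] = {
    "acme-corp": {"active": {"greeting_id": "g1", "instruction_id": "i1"}},
}
cache: Dict[str, str] = {"acme-corp": "v1 instructions"}


def publish_instruction(tenant_id: str, new_text: str) -> str:
    """Hypothetical helper following the three write steps in order."""
    now = datetime.now(timezone.utc).isoformat()

    # 1. Create a new prompt text; the old document is left in place.
    new_id = uuid.uuid4().hex
    prompt_texts[new_id] = {
        "tenant_id": tenant_id,
        "prompt_text": new_text,
        "created_at": now,
        "updated_at": now,
    }

    # 2. Point the tenant state at the new document.
    tenant_state[tenant_id]["active"]["instruction_id"] = new_id
    tenant_state[tenant_id]["updated_at"] = now

    # 3. Drop the cached entry so the next call reloads from the store.
    cache.pop(tenant_id, None)
    return new_id
```

Since the previous prompt document is never modified, rolling back amounts to pointing `instruction_id` at the old ID again and invalidating the cache once more.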
Gradual Rollout
For major instruction changes:
- Test new instructions with a subset of calls
- Monitor metrics and call quality
- Gradually increase adoption
- Keep old instructions available for rollback
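One way to select a stable subset of calls is deterministic hash bucketing. The loader shown in this document resolves instructions per tenant, not per call, so the following is purely a hypothetical sketch of how such a split could be computed; `in_rollout` and `pick_instruction_id` are assumed names.

```python
import hashlib


def in_rollout(call_id: str, percent: int) -> bool:
    """Map a call ID to a stable bucket in [0, 100) and compare to the rollout share."""
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def pick_instruction_id(call_id: str, old_id: str, new_id: str, percent: int) -> str:
    # The same call ID always lands in the same bucket, so the choice is repeatable.
    return new_id if in_rollout(call_id, percent) else old_id
```

Raising `percent` step by step widens the share of calls that receive the new instructions, and setting it back to 0 restores the old ones everywhere.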
Monitoring
Key Metrics
- Cache Hit Rate: Percentage of instruction loads served from cache
- Cache Expiry Rate: Frequency of cache TTL expiration
- Load Failures: Count of instruction loading errors by type
- Fallback Usage: Frequency of fallback prompt usage
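These metrics can be derived from simple event counters. The sketch below assumes hypothetical event names (`cache_hit`, `cache_miss`, `cache_expired`, `fallback_used`) and an `InstructionMetrics` class that is not part of the service; the real `metrics_store` API is not documented here.

```python
from collections import Counter


class InstructionMetrics:
    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, event: str) -> None:
        self.counts[event] += 1

    def cache_hit_rate(self) -> float:
        """Share of lookups served from cache; expired entries count as misses."""
        hits = self.counts["cache_hit"]
        total = hits + self.counts["cache_miss"] + self.counts["cache_expired"]
        return hits / total if total else 0.0


metrics = InstructionMetrics()
for event in ("cache_hit", "cache_hit", "cache_hit", "cache_miss", "fallback_used"):
    metrics.record(event)
```

With three hits and one miss recorded, `metrics.cache_hit_rate()` evaluates to 0.75; the `fallback_used` counter is tracked separately and does not affect the rate.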
Logging
All instruction operations are logged:
logger.debug("instructions_cache_miss tenant_id=%s", tenant_id)
logger.debug("instructions_cache_hit tenant_id=%s", tenant_id)
logger.warning(
    "Greeting missing/empty; using baseline greeting. tenant_id=%s",
    tenant_id,
)