Customer Service Chatbots

How a global SaaS company secured a high-volume support bot without sacrificing deflection rates.

Challenge

The support team used an AI chatbot to resolve tier-1 requests:
  • Billing and subscription questions
  • Account recovery and access issues
  • Product troubleshooting

Critical Requirements

  • Prevent jailbreaks that expose internal policies
  • Avoid sensitive data exposure in public channels
  • Maintain response quality across millions of chats

Solution

KoreShield scanned each incoming message and applied policy enforcement before any model call. Risky requests were routed to human agents.

```typescript
import { Koreshield } from "koreshield-sdk";

const koreshield = new Koreshield({
  apiKey: process.env.KORESHIELD_API_KEY,
  sensitivity: "medium",
});

// Scan the message before any model call; hand off to a human if a threat is found.
async function secureSupportChat(userId: string, message: string) {
  const scan = await koreshield.scan({
    content: message,
    userId,
    metadata: { channel: "support", tier: "public" },
  });

  if (scan.threat_detected) {
    // Skip the bot entirely and return the detected threat type as the handoff reason.
    return {
      handoff: true,
      reason: scan.threat_type,
    };
  }

  // respondWithBot is the application's own bot call; it is not defined in this snippet.
  return await respondWithBot(message);
}
```

Threat Model

The team focused on common public-channel risks:

Jailbreaks

Requests for internal policies or system prompts

Impersonation

Account reset attempts without verification

Prompt Injection

Malicious content in attachments or chat history

Data Disclosure

Requests to disclose private customer data

Operational Controls

1. PII Masking
   Redacted emails, addresses, and account IDs from logs and outputs

2. Policy Scopes
   Separate rules for billing, security, and product support

3. Safe Response Templates
   Constrained output for high-risk topics

4. Escalation
   Automatic human handoff for blocked or ambiguous intents
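
As an illustration of the PII masking control, here is a minimal sketch of regex-based redaction applied to text before it is logged or returned. The `maskPii` helper, the pattern list, and the `ACC-` account ID format are all assumptions made for this example; the case study does not describe how masking was implemented. Street addresses are left out because they rarely fit a simple pattern and usually need a dedicated detector.

```typescript
// Hypothetical patterns. The account ID format (ACC- followed by digits) is an
// assumption for illustration, not the format used in the case study.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "ACCOUNT_ID", pattern: /\bACC-\d{6,}\b/g },
];

// Replace every match with a bracketed placeholder such as [EMAIL].
function maskPii(text: string): string {
  return PII_PATTERNS.reduce(
    (masked, { label, pattern }) => masked.replace(pattern, `[${label}]`),
    text,
  );
}

// Example: mask a message before writing it to a log.
const logLine = maskPii("Contact jane.doe@example.com about ACC-1234567");
```

Applying the same function to both log lines and bot output keeps the two paths consistent, so a value masked in one place cannot leak through the other.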

Routing and Escalation

Low-risk intents stayed with the bot for faster resolution
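
One way to picture this routing rule is a small decision function that keeps a conversation with the bot only when nothing was flagged and the intent is known to be low risk. The `routeConversation` function, the `RoutingInput` shape, and the intent names below are hypothetical; the source states only that low-risk intents stayed with the bot and that blocked or ambiguous intents were escalated.

```typescript
type Route = "bot" | "human";

interface RoutingInput {
  threatDetected: boolean; // e.g. the threat_detected flag from a scan result
  intent: string | null;   // null when the intent could not be determined
}

// Hypothetical set of intents treated as low risk.
const LOW_RISK_INTENTS = new Set(["billing_question", "product_troubleshooting"]);

function routeConversation(input: RoutingInput): Route {
  // Blocked requests always go to a human agent.
  if (input.threatDetected) return "human";
  // Ambiguous intents are escalated rather than guessed at.
  if (input.intent === null) return "human";
  // Only intents on the low-risk list stay with the bot.
  return LOW_RISK_INTENTS.has(input.intent) ? "bot" : "human";
}
```

Defaulting unknown intents to a human agent errs on the side of safety; a team that cares more about deflection could widen the low-risk set after reviewing escalations.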

Quality and Safety Loop

Weekly reviews of blocked requests helped adjust sensitivity and reduce false positives. The team also maintained a small allowlist of brand-safe phrases for common billing questions.

Best Practice: Regular review cycles help balance security with user experience. Start conservative and tune based on real-world data.
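
The review loop can be sketched as two small helpers: one that computes the false-positive rate from a week of human-reviewed blocks, and one that checks a message against a phrase allowlist. The `ReviewedBlock` type, the example phrases, and the exact-match-after-normalization rule are assumptions for illustration; the case study does not say how the allowlist was matched or applied.

```typescript
interface ReviewedBlock {
  message: string;
  wasFalsePositive: boolean; // set by a human reviewer during the weekly review
}

// Share of reviewed blocks that turned out to be legitimate requests.
function falsePositiveRate(reviews: ReviewedBlock[]): number {
  if (reviews.length === 0) return 0;
  const falsePositives = reviews.filter((r) => r.wasFalsePositive).length;
  return falsePositives / reviews.length;
}

// Hypothetical allowlist of brand-safe billing phrases.
const ALLOWLIST = new Set(["cancel my subscription", "update my payment method"]);

// Lowercase, trim, and collapse whitespace so trivial variations still match.
function normalize(message: string): string {
  return message.trim().toLowerCase().replace(/\s+/g, " ");
}

function isAllowlisted(message: string): boolean {
  return ALLOWLIST.has(normalize(message));
}
```

A rate that stays high from week to week would suggest lowering sensitivity or extending the allowlist, while a rate near zero leaves room to tighten policies.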

Results

Reduced Incidents

Eliminated prompt-injection incidents in public channels

Better Handoffs

Improved agent handoff accuracy for sensitive issues

High Deflection

Maintained deflection rates without compromising safety

Implementation Tips

  • Begin with stricter policies and relax them based on false-positive analysis. It’s easier to loosen controls than to recover from a security incident.
  • Track the percentage of conversations escalated to humans. Sudden spikes may indicate policy issues or new attack patterns.
  • Have human agents flag incorrect bot responses to improve both the model and security policies over time.
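
To make the escalation-tracking tip concrete, the sketch below computes a daily escalation rate and flags a day whose rate is well above the recent average. The `DailyStats` shape, the function names, and the 1.5x threshold are hypothetical choices for this example, not values from the case study.

```typescript
interface DailyStats {
  conversations: number; // total conversations handled that day
  escalated: number;     // conversations handed off to a human agent
}

function escalationRate(stats: DailyStats): number {
  return stats.conversations === 0 ? 0 : stats.escalated / stats.conversations;
}

// Flags today as a spike when its rate exceeds the average rate of the
// history window by the given multiplier (1.5 is an arbitrary default).
function isEscalationSpike(
  history: DailyStats[],
  today: DailyStats,
  multiplier = 1.5,
): boolean {
  if (history.length === 0) return false; // no baseline to compare against
  const baseline =
    history.reduce((sum, day) => sum + escalationRate(day), 0) / history.length;
  return escalationRate(today) > baseline * multiplier;
}
```

A flagged day is a prompt for review, not a verdict: the cause could be an overly strict policy change or a new attack pattern, and telling them apart still requires reading the escalated conversations.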

Security

Threat detection features

Rate Limiting

Configure usage limits

Monitoring

Set up alerts
