Overview

The ChatService handles AI-powered chat message processing for the Gima chatbot. It integrates with GROQ AI models and provides rate limiting, request validation, message sanitization, and streaming text generation with tool support.

Key Features

  • AI Integration: Uses GROQ models (e.g., llama-3.1-8b-instant) for chat generation
  • Rate Limiting: IP-based rate limiting to prevent abuse
  • Request Validation: Zod schema validation for incoming messages
  • Message Sanitization: Context injection and tool result summarization
  • Streaming Support: Real-time streaming responses with AI SDK
  • Tool Execution: Supports chat tools for querying backend data
  • History Management: Maintains conversation context (last 8 messages)

Class Definition

export class ChatService {
  constructor(dependencies?: Partial<ChatServiceDependencies>)
  async processMessage(rawBody: unknown, clientIP: string | null): Promise<StreamTextResult>
}

Constructor

Parameters

dependencies
Partial<ChatServiceDependencies>
Dependency injection for testing and customization

Example

import { ChatService } from '@/app/lib/services/chat-service';

// Default configuration
const chatService = new ChatService();

// With custom dependencies (for testing)
const testService = new ChatService({
  logger: mockLogger,
  rateLimiter: mockRateLimiter,
  modelProvider: mockModelProvider,
});

Methods

processMessage()

Processes a chat message request through the AI pipeline with rate limiting, validation, and streaming.

async processMessage(
  rawBody: unknown,
  clientIP: string | null
): Promise<StreamTextResult>

Parameters

rawBody
unknown
required
Raw JSON body from the chat request. Must conform to chatRequestSchema:
{
  messages: CoreMessage[],  // Conversation history
  model: string              // Model ID (e.g., 'llama-3.1-8b-instant')
}
clientIP
string | null
required
Client’s IP address for rate limiting. Pass null to skip rate limiting.
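
For reference, a hedged sketch of what chatRequestSchema could look like in Zod. The actual schema lives in the project's schemas module (see Schemas below) and may differ; the message shape here is simplified to string content.

import { z } from 'zod';

// Simplified sketch of chatRequestSchema; real CoreMessage content
// can be richer than a plain string.
const chatRequestSchema = z.object({
  messages: z
    .array(
      z.object({
        role: z.enum(['system', 'user', 'assistant', 'tool']),
        content: z.string(),
      })
    )
    .min(1),
  model: z.string(), // e.g. 'llama-3.1-8b-instant'
});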

Returns

result
StreamTextResult
AI SDK streaming result object with:
  • textStream: Readable stream of generated text
  • usage: Token usage information
  • finishReason: Completion status
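
Both usage and finishReason are promises that resolve once generation completes, so they can be read after draining textStream:

// Read metadata after the stream completes.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
console.log(await result.usage);        // token usage for the call
console.log(await result.finishReason); // e.g. 'stop' or 'tool-calls'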

Throws

RateLimitError
Error
Thrown when the client exceeds the rate limit. Carries retryAfter, the number of seconds to wait before retrying.
ValidationError
Error
Thrown when the request body fails validation. Carries the Zod error details.

Processing Pipeline

  1. Rate Limiting: Checks IP-based limits (if clientIP provided)
  2. Validation: Validates request against Zod schema
  3. Sanitization: Cleans messages and injects tool context
  4. History Truncation: Keeps last 8 messages to stay within token limits
  5. AI Generation: Streams response with tool support (max 5 steps)
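
The sketch below shows how these stages might compose inside processMessage. It is illustrative only: streamText, stepCountIs, and groq are real AI SDK imports, while sanitizeMessages, the rateLimiter shape, and the error constructors are assumed names based on this page.

import { streamText, stepCountIs } from 'ai';
import { groq } from '@ai-sdk/groq';
import { SYSTEM_PROMPT } from '@/app/config';

// Illustrative pipeline; helper names marked below are assumptions.
async function processMessage(rawBody: unknown, clientIP: string | null) {
  // 1. Rate limiting (skipped when clientIP is null)
  if (clientIP && !rateLimiter.checkLimit(clientIP)) {
    throw new RateLimitError(rateLimiter.getRetryAfter(clientIP)); // assumed constructor
  }

  // 2. Validation against the Zod schema
  const parsed = chatRequestSchema.safeParse(rawBody);
  if (!parsed.success) {
    throw new ValidationError(parsed.error.issues); // assumed constructor
  }

  // 3. Sanitization: clean messages and inject tool context (assumed helper)
  const sanitized = sanitizeMessages(parsed.data.messages);

  // 4. History truncation: keep only the most recent messages
  const recent = sanitized.slice(-MAX_HISTORY_MESSAGES);

  // 5. Streaming generation with tool support, capped at 5 steps
  return streamText({
    model: groq(parsed.data.model),
    system: SYSTEM_PROMPT,
    messages: recent,
    tools: chatTools, // from '@/app/lib/ai/tools/chat-tools'; export name assumed
    stopWhen: stepCountIs(5),
  });
}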

Example

import { ChatService, RateLimitError, ValidationError } from '@/app/lib/services/chat-service';

const chatService = new ChatService();

try {
  const result = await chatService.processMessage(
    {
      messages: [
        { role: 'user', content: '¿Cuáles son los mantenimientos pendientes?' } // "What are the pending maintenance tasks?"
      ],
      model: 'llama-3.1-8b-instant'
    },
    '192.168.1.1'
  );

  // Stream the response
  for await (const chunk of result.textStream) {
    console.log(chunk);
  }
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log(`Rate limited. Retry after ${error.retryAfter}s`);
  } else if (error instanceof ValidationError) {
    console.log('Invalid request:', error.details);
  }
}

Error Handling

RateLimitError

Thrown when a client exceeds the configured rate limit.

class RateLimitError extends Error {
  retryAfter: number;  // Seconds until next allowed request
}

ValidationError

Thrown when the request body fails Zod validation.

class ValidationError extends Error {
  details: ZodIssue[];  // Detailed validation errors
}
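
When surfacing these errors over HTTP, the documented fields map directly onto response headers and bodies. A hedged sketch for ValidationError (RateLimitError is handled in the route example below):

if (error instanceof ValidationError) {
  return new Response(
    JSON.stringify({ error: 'Invalid request', issues: error.details }),
    { status: 400, headers: { 'Content-Type': 'application/json' } }
  );
}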

Configuration

System Prompt

Defined in @/app/config as SYSTEM_PROMPT. Configures the AI assistant’s behavior and role.

Message History Limit

const MAX_HISTORY_MESSAGES = 8;

Limits conversation history to prevent exceeding GROQ’s 6000 TPM (tokens per minute) limit.

Tool Integration

Chat tools are defined in @/app/lib/ai/tools/chat-tools and include:
  • Asset queries
  • Maintenance schedule lookups
  • Spare parts inventory
  • Report generation
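
A hedged sketch of how one of these tools might be declared with the AI SDK's tool() helper. The tool name, input schema, and backend call are illustrative; note the schema field is named inputSchema in AI SDK v5 but parameters in v4.

import { tool } from 'ai';
import { z } from 'zod';

// Illustrative tool; the real definitions live in
// '@/app/lib/ai/tools/chat-tools' and may differ.
const getPendingMaintenance = tool({
  description: 'Look up pending maintenance schedules',
  inputSchema: z.object({
    assetId: z.string().optional(), // hypothetical filter parameter
  }),
  execute: async ({ assetId }) => {
    // Assumed backend call; see Backend API Service below.
    return backendApi.getMaintenanceSchedules({ assetId, status: 'pending' });
  },
});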

Step Limit

stopWhen: stepCountIs(5)

Caps multi-step tool execution at 5 steps.

Usage in API Routes

// app/api/chat/route.ts
import { ChatService, RateLimitError } from '@/app/lib/services/chat-service';
import { headers } from 'next/headers';

export async function POST(request: Request) {
  const chatService = new ChatService();
  const body = await request.json();
  
  // Extract client IP
  const headersList = await headers(); // async in Next.js 15; await is harmless in 14
  const clientIP = headersList.get('x-forwarded-for')?.split(',')[0]?.trim() || null;

  try {
    const result = await chatService.processMessage(body, clientIP);
    return result.toDataStreamResponse();
  } catch (error) {
    if (error instanceof RateLimitError) {
      return new Response('Too many requests', {
        status: 429,
        headers: { 'Retry-After': String(error.retryAfter) }
      });
    }
    throw error;
  }
}

Testing

The service supports dependency injection for easy testing:

import { ChatService } from '@/app/lib/services/chat-service';
import { vi } from 'vitest';

const mockLogger = {
  info: vi.fn(),
  error: vi.fn(),
};

const mockRateLimiter = {
  checkLimit: vi.fn(() => true),
  getRetryAfter: vi.fn(() => 0),
};

const testService = new ChatService({
  logger: mockLogger,
  rateLimiter: mockRateLimiter,
});

// Run tests...
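
Building on those mocks, a sketch of a rate-limit test. It assumes that checkLimit returning false makes processMessage reject with RateLimitError, per the pipeline described above:

import { describe, it, expect, vi } from 'vitest';
import { ChatService, RateLimitError } from '@/app/lib/services/chat-service';

describe('ChatService rate limiting', () => {
  it('rejects when the rate limit is exhausted', async () => {
    const service = new ChatService({
      logger: { info: vi.fn(), error: vi.fn() },
      rateLimiter: {
        checkLimit: vi.fn(() => false), // simulate an exhausted limit
        getRetryAfter: vi.fn(() => 30),
      },
    });

    await expect(
      service.processMessage(
        { messages: [{ role: 'user', content: 'hi' }], model: 'llama-3.1-8b-instant' },
        '192.168.1.1'
      )
    ).rejects.toBeInstanceOf(RateLimitError);
  });
});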

Related

  • Backend API Service: HTTP client for Laravel backend data
  • Chat Tools: Available tools for chat interactions
  • Rate Limiter: Rate limiting implementation
  • Schemas: Request validation schemas
