The SvaraAI backend is a TypeScript-based Express.js server that orchestrates AI services and manages conversation data. It serves as the bridge between the frontend and external AI APIs.

Project structure

Backend/
├── controllers/          # Request handlers
│   └── saveEntry.ts     # Save conversation logic
├── routes/              # API route definitions
│   ├── entries.ts       # Entry management routes
│   ├── gemini.ts        # Gemini AI integration
│   └── hume.ts          # Hume AI batch analysis
├── utils/               # Utility functions
│   └── humeClient.ts    # Hume API authentication
├── data/                # File-based storage
│   └── entries.json     # Saved conversations
├── server.ts            # Main application entry point
├── package.json
└── tsconfig.json

Server setup

The main server configuration is defined in server.ts:
import express from 'express';
import dotenv from 'dotenv';
import cors from 'cors';
import path from 'path';

import entriesRouter from './routes/entries';
import geminiRouter from './routes/gemini';
import humeRouter from './routes/hume';

dotenv.config({ path: path.resolve(__dirname, '.env') });

const app = express();
const PORT = process.env.PORT || 5000;

// Middleware
app.use(cors());
app.use(express.json());

// Routes
app.use('/api', entriesRouter);
app.use('/api/gemini', geminiRouter);
app.use('/api/hume', humeRouter);

app.listen(PORT, () => {
  console.log(`Server running on PORT ${PORT}`);
}).on('error', (err: Error) => {
  console.error('Server failed to start:', err);
  process.exit(1);
});
The backend runs on port 5000 by default. You can customize this by setting the PORT environment variable.

API routes

SvaraAI exposes three main API route groups:

Entries route

Endpoint: POST /api/save-entry
Saves conversation data to the local file system.
import express from "express";
import { saveEntry } from "../controllers/saveEntry";

const router = express.Router();

router.post("/save-entry", saveEntry);

export default router;
The request body contains the raw messages received from Hume:
{
  "messages": [
    {
      "type": "user_message" | "assistant_message",
      "message": {
        "content": "Message text"
      },
      "models": {
        "prosody": {
          "scores": {
            "joy": 0.85,
            "sadness": 0.12,
            // ... other emotions
          }
        }
      },
      "receivedAt": "2026-03-04T10:30:00Z"
    }
  ]
}
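For reference, the request shape above can be expressed as TypeScript types. These interfaces are illustrative (the names are not taken from the codebase); field names follow the example JSON:

```typescript
// Illustrative types for the /api/save-entry request body.
interface ProsodyScores {
  [emotion: string]: number;
}

interface HumeMessage {
  type: 'user_message' | 'assistant_message';
  message: { content: string };
  models?: { prosody?: { scores: ProsodyScores } };
  receivedAt: string; // ISO 8601 timestamp
}

interface SaveEntryRequest {
  messages: HumeMessage[];
}

// Example instance matching the JSON shown above
const example: SaveEntryRequest = {
  messages: [{
    type: 'user_message',
    message: { content: 'Message text' },
    models: { prosody: { scores: { joy: 0.85, sadness: 0.12 } } },
    receivedAt: '2026-03-04T10:30:00Z',
  }],
};
```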
The saveEntry controller (controllers/saveEntry.ts):
  1. Filters for user and assistant messages only
  2. Extracts the top 3 emotions from prosody scores
  3. Checks for duplicate messages to prevent redundant storage
  4. Appends new entry to entries.json
  5. Maintains a maximum of 20 entries (rotating out oldest)
The core filtering and mapping step looks like this:
const processedMessages = messages
  .filter((msg) => msg.type === "user_message" || msg.type === "assistant_message")
  .map((msg) => {
    const emotions = msg.models?.prosody?.scores 
      ? getTop3Emotions(msg.models.prosody.scores)
      : [];

    return {
      type: msg.type,
      content: msg.message.content,
      emotions,
      timestamp: msg.receivedAt
    };
  });
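The getTop3Emotions helper referenced above is not shown in this guide. A minimal sketch, assuming it simply sorts the prosody scores and keeps the three highest, could look like:

```typescript
// Hypothetical sketch of the getTop3Emotions helper used in saveEntry.ts
function getTop3Emotions(
  scores: Record<string, number>
): { emotion: string; score: number }[] {
  return Object.entries(scores)
    .sort(([, a], [, b]) => b - a) // highest score first
    .slice(0, 3)
    .map(([emotion, score]) => ({ emotion, score }));
}

const top = getTop3Emotions({
  joy: 0.85,
  sadness: 0.12,
  contentment: 0.72,
  anger: 0.05,
});
// top contains joy, contentment, sadness in descending order
```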
Conversations are stored in data/entries.json:
[
  {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "timestamp": 1709553000000,
    "messages": [
      {
        "type": "user_message",
        "content": "I'm feeling great today!",
        "emotions": [
          { "emotion": "joy", "score": 0.85 },
          { "emotion": "contentment", "score": 0.72 },
          { "emotion": "excitement", "score": 0.68 }
        ],
        "timestamp": "2026-03-04T10:30:00Z"
      }
    ]
  }
]
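The 20-entry rotation described in step 5 could be implemented roughly like this (a sketch, not the exact controller code):

```typescript
const MAX_ENTRIES = 20;

// Append a new entry, dropping the oldest once the cap is exceeded.
function appendWithRotation<T>(entries: T[], entry: T): T[] {
  return [...entries, entry].slice(-MAX_ENTRIES);
}

// With 20 existing entries (0..19), appending entry 20 drops entry 0.
const rotated = appendWithRotation(
  Array.from({ length: 20 }, (_, i) => i),
  20
);
```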

Gemini route

Endpoint: POST /api/gemini
Generates emotional insights from conversation transcripts using Google Gemini 2.0 Flash.
router.post('/', async (req: Request, res: Response): Promise<void> => {
  const { transcript, emoData } = req.body;

  // Validate inputs
  if (!transcript) {
    res.status(400).json({ error: 'Transcript is required' });
    return;
  }

  const apiKey = process.env.GEMINI_API_KEY;
  const rawPrompt = process.env.GEMINI_PROMPT;

  if (!apiKey || !rawPrompt) {
    res.status(500).json({ error: 'Server is missing Gemini configuration' });
    return;
  }

  // Format top 3 emotions
  const formattedEmoData = Object.entries(emoData ?? {})
    .sort(([, a], [, b]) => (b as number) - (a as number))
    .slice(0, 3)
    .map(([emotion, score]) => `${emotion}: ${((score as number) * 100).toFixed(1)}%`)
    .join('\n');

  // Replace placeholders in prompt
  const prompt = rawPrompt
    .replace('{{transcript}}', transcript)
    .replace('{{emoData}}', formattedEmoData);

  // Call Gemini API
  const response = await fetch(
    `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=${apiKey}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        contents: [{
          parts: [{ text: prompt }]
        }],
        generationConfig: {
          temperature: 0.3,
          topK: 20,
          topP: 0.8,
          maxOutputTokens: 100,
        }
      })
    }
  );

  const data = await response.json();
  const text = data?.candidates?.[0]?.content?.parts?.[0]?.text;

  if (!text) {
    res.status(502).json({ error: 'Unexpected response from Gemini' });
    return;
  }

  res.json({
    response: text,
    emotions: emoData
  });
});
Example request body:
{
  "transcript": "User: I had a great day\nAssistant: That's wonderful!\n...",
  "emoData": {
    "joy": 0.85,
    "contentment": 0.72,
    "excitement": 0.68,
    // ... other emotions
  }
}
The Gemini API is configured for consistent, concise analysis:
  • Model: gemini-2.0-flash (fast, efficient)
  • Temperature: 0.3 (focused, less creative)
  • Top-K: 20 (limits token sampling)
  • Top-P: 0.8 (nucleus sampling)
  • Max tokens: 100 (short, concise insights)
The prompt is defined in the GEMINI_PROMPT environment variable and uses placeholders:
Analyze this conversation transcript and emotional data:

Transcript:
{{transcript}}

Top Emotions:
{{emoData}}

Provide a brief emotional analysis in 2-3 sentences.
The backend replaces {{transcript}} and {{emoData}} before sending to Gemini.
Example response:
{
  "response": "The conversation shows predominantly positive emotions with high joy and contentment. The user appears to be in a good mood and engaged positively throughout.",
  "emotions": {
    "joy": 0.85,
    "contentment": 0.72,
    // ... all emotions
  }
}
The Gemini API key must be set in the .env file as GEMINI_API_KEY. Without it, all requests to this endpoint will return a 500 error.

Hume route

Endpoint: POST /api/hume
Triggers batch audio analysis using Hume AI’s API (currently unused by the frontend but available for future features).
import express from 'express';
import { getHumeAccessToken } from '../utils/humeClient';

const router = express.Router();

router.post('/', async (req, res): Promise<any> => {
  const { audioUrl } = req.body;
  
  if (!audioUrl) {
    return res.status(400).json({ error: 'audioUrl is required' });
  }

  // Validate URL format
  try {
    new URL(audioUrl);
  } catch {
    return res.status(400).json({ error: 'Invalid URL format' });
  }

  // Get OAuth token and submit the batch analysis job
  try {
    const token = await getHumeAccessToken();

    const response = await fetch('https://api.hume.ai/v0/batch/analyze-url', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        url: audioUrl,
        models: {
          language: {},
          prosody: {}
        },
      }),
    });

    const result = await response.json();
    res.json(result);
  } catch (error) {
    console.error('[Hume Route Error]', error);
    res.status(500).json({ error: 'Unable to process your request at this time' });
  }
});
This endpoint is designed for batch processing of pre-recorded audio files. The frontend currently uses Hume’s real-time Voice SDK instead, but this endpoint can be useful for analyzing uploaded recordings.

Utilities

Hume authentication

The utils/humeClient.ts file handles OAuth authentication with Hume AI:
export const getHumeAccessToken = async (): Promise<string> => {
  const apiKey = process.env.VITE_HUME_API_KEY;
  const secretKey = process.env.HUME_SECRET_KEY;

  if (!apiKey || !secretKey) {
    throw new Error(
      'Missing required environment variables (VITE_HUME_API_KEY or HUME_SECRET_KEY)'
    );
  }

  const authString = `${apiKey}:${secretKey}`;
  const encoded = Buffer.from(authString).toString('base64');
  
  const res = await fetch('https://api.hume.ai/oauth2-cc/token', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      Authorization: `Basic ${encoded}`,
    },
    body: new URLSearchParams({ grant_type: 'client_credentials' }).toString(),
  });
  
  if (!res.ok) {
    throw new Error(`Hume token request failed with status ${res.status}`);
  }

  const data = await res.json();
  return data.access_token;
};
This uses OAuth 2.0 client credentials flow to obtain an access token for Hume’s batch API.
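The Basic credential in the Authorization header is simply the base64 encoding of apiKey:secretKey. With dummy keys (illustrative values, not real credentials):

```typescript
// Demonstration of the Basic auth encoding used above (dummy keys)
const encoded = Buffer.from('my-api-key:my-secret-key').toString('base64');

// Decoding recovers the original key:secret pair
const decoded = Buffer.from(encoded, 'base64').toString('utf8');
```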

Error handling

All routes implement consistent error handling:
try {
  // ... route logic
} catch (error) {
  console.error('[Route Error]', error);
  res.status(500).json({
    error: 'Unable to process your request at this time'
  });
}
Error messages sent to the client are generic to avoid leaking sensitive implementation details. Detailed errors are logged to the console for debugging.
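Rather than repeating try/catch in every route, this pattern can be centralized in an async wrapper. A minimal sketch (the codebase may not use one; the helper name and signature are illustrative):

```typescript
// Minimal async-handler sketch: catches rejected promises from route logic
// and reports them through a single error path.
type Handler = (req: unknown, res: unknown) => Promise<void>;

const asyncHandler =
  (fn: Handler, onError: (err: unknown) => void): Handler =>
  async (req, res) => {
    try {
      await fn(req, res);
    } catch (err) {
      console.error('[Route Error]', err);
      onError(err);
    }
  };

// Usage: a failing handler never throws past the wrapper.
let caught: unknown = null;
const wrapped = asyncHandler(
  async () => { throw new Error('boom'); },
  (err) => { caught = err; }
);
```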

Environment variables

The backend requires these environment variables:
Variable             Purpose                            Required
PORT                 Server port (default: 5000)        No
GEMINI_API_KEY       Google Gemini API authentication   Yes
GEMINI_PROMPT        Prompt template for insights       Yes
VITE_HUME_API_KEY    Hume AI API key                    For batch analysis
HUME_SECRET_KEY      Hume AI secret key                 For batch analysis

Example .env file:
PORT=5000
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_PROMPT="Analyze this conversation...{{transcript}}...{{emoData}}"
VITE_HUME_API_KEY=your_hume_api_key
HUME_SECRET_KEY=your_hume_secret_key

Development workflow

1. Install dependencies

   cd Backend
   npm install

2. Set up environment variables

   Create a .env file in the Backend/ directory with required variables.

3. Run development server

   npm run dev

   This uses ts-node-dev for hot reloading during development.

4. Build for production

   npm run build
   npm start
The build command compiles TypeScript to JavaScript in the dist/ folder.

Data persistence

SvaraAI currently uses file-based storage for simplicity:
  • File: Backend/data/entries.json
  • Format: JSON array of conversation entries
  • Rotation: Maximum 20 entries stored
  • Deduplication: Duplicate messages are filtered
File-based storage is not suitable for production use. For production deployments, migrate to a database like PostgreSQL, MongoDB, or Firebase.

Scaling considerations

For production deployment, consider:
  1. Database migration: Replace file storage with a proper database
  2. Rate limiting: Implement rate limiting on AI API endpoints to prevent abuse
  3. Caching: Cache Gemini responses for identical transcripts to reduce API costs
  4. Authentication: Add user authentication to protect API routes
  5. Monitoring: Implement logging and error tracking (e.g., Sentry, LogRocket)
  6. Load balancing: Use a reverse proxy like Nginx for horizontal scaling
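For item 3, a minimal in-memory cache keyed by transcript text could look like this (illustrative only; a production cache would want TTLs, size bounds, and a stable key such as a transcript hash):

```typescript
// Naive in-memory cache for Gemini insights, keyed by transcript text.
const insightCache = new Map<string, string>();

async function getInsight(
  transcript: string,
  callGemini: (t: string) => Promise<string>
): Promise<string> {
  const hit = insightCache.get(transcript);
  if (hit !== undefined) return hit; // cache hit: skip the API call

  const insight = await callGemini(transcript);
  insightCache.set(transcript, insight);
  return insight;
}

// Usage with a stubbed Gemini call that counts invocations
let calls = 0;
const stub = async (t: string) => { calls++; return `insight for: ${t}`; };
```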

Next steps

Frontend architecture

Learn how the frontend consumes these API endpoints

API reference

Explore detailed API endpoint documentation
