Overview

The Gemini AI API provides intelligent chat assistance for restaurant operations and audio transcription capabilities. It uses Google’s Gemini 2.5 Flash model to power conversational ordering, reservation booking, and voice-to-text features.

Chat Assistant

sendMessageToGemini

Sends a message to the Gemini AI assistant with conversation history.
export const sendMessageToGemini = async (
  history: ChatMessage[],
  actionLock: 'order' | 'reservation' | null = null
): Promise<{ text: string }>
Parameters:
  • history (ChatMessage[], required) - Array of conversation messages (user and bot)
  • actionLock ('order' | 'reservation' | null, default null) - Locks the conversation context:
    • 'order' - Assistant focuses only on completing an order
    • 'reservation' - Assistant focuses only on completing a reservation
    • null - Normal conversation (default)
Returns:
  • result (object) - Resolves to { text: string } with the AI response text
AI Behavior:
  • Filters out previous JSON responses from history (prevents confusion)
  • Uses dynamic system instructions based on current business state
  • Adapts to business hours (different behavior when closed)
  • Generates structured JSON for orders and reservations when confirmed
import { sendMessageToGemini, MessageSender } from './services/geminiService';
// ChatMessage is assumed to be exported from the same service module
import type { ChatMessage } from './services/geminiService';

const history: ChatMessage[] = [
  { sender: MessageSender.BOT, text: "Hola! ¿Quieres hacer un pedido o una reserva?" },
  { sender: MessageSender.USER, text: "Quiero hacer un pedido" },
  { sender: MessageSender.BOT, text: "¡Perfecto! ¿Qué te gustaría pedir?" },
  { sender: MessageSender.USER, text: "Una pizza muzzarella" }
];

const response = await sendMessageToGemini(history);
console.log('AI:', response.text);
With Action Lock:
// Lock conversation to order completion only
const response = await sendMessageToGemini(history, 'order');
// AI will refuse to discuss reservations until order is complete

Audio Transcription

transcribeAudio

Transcribes audio to text using Gemini’s multimodal capabilities.
export const transcribeAudio = async (
  base64Audio: string,
  mimeType: string
): Promise<string>
Parameters:
  • base64Audio (string, required) - Base64-encoded audio data (without data URL prefix)
  • mimeType (string, required) - Audio MIME type (e.g., "audio/webm", "audio/mp3", "audio/wav")
Returns:
  • transcription (string) - Transcribed text from the audio
Supported Audio Formats:
  • WebM (audio/webm)
  • MP3 (audio/mp3)
  • WAV (audio/wav)
  • M4A (audio/m4a)
  • OGG (audio/ogg)
import { transcribeAudio } from './services/geminiService';

// Assuming you have audio from MediaRecorder
const audioBlob = new Blob([audioData], { type: 'audio/webm' });
const reader = new FileReader();

reader.onloadend = async () => {
  const base64 = (reader.result as string).split(',')[1]; // Remove the data URL prefix
  
  try {
    const transcription = await transcribeAudio(base64, 'audio/webm');
    console.log('User said:', transcription);
    // Use transcription as text input to chat
  } catch (error) {
    console.error('Transcription failed:', error);
  }
};

reader.readAsDataURL(audioBlob);
The transcription prompt instructs Gemini to return only the transcribed text without any additional commentary or formatting.

Netlify Functions

The AI functionality is also exposed via serverless Netlify functions for client-side applications.

POST /api/gemini

Netlify function for chat interactions.
POST /.netlify/functions/gemini
Content-Type: application/json

{
  "history": [
    { "role": "user", "parts": [{ "text": "Hola" }] },
    { "role": "model", "parts": [{ "text": "Hola!" }] }
  ],
  "systemInstruction": "You are a helpful assistant..."
}
Parameters:
  • history (array, required) - Array of Gemini-formatted messages with role and parts
  • systemInstruction (string, required) - System instruction prompt for the AI
Returns:
  • response (object) - { "text": string } with the AI response text
Status Codes:
  • 200 - Success
  • 400 - Missing history in request
  • 405 - Method not allowed (only POST accepted)
  • 500 - AI generation error
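The chat function can be called from the client much like the transcribe function shown below. This sketch converts app-level ChatMessage history into the Gemini role/parts format the function expects; `toGeminiHistory` and `chatViaFunction` are hypothetical helper names, and the ChatMessage shape is restated locally to keep the snippet self-contained.

```typescript
type ChatMessage = { sender: 'user' | 'bot'; text: string };

// Hypothetical helper: map app-level messages to the Gemini
// role/parts format ('bot' messages become role 'model').
const toGeminiHistory = (history: ChatMessage[]) =>
  history.map((m) => ({
    role: m.sender === 'user' ? ('user' as const) : ('model' as const),
    parts: [{ text: m.text }],
  }));

// Hypothetical wrapper around the Netlify chat function.
async function chatViaFunction(history: ChatMessage[], systemInstruction: string) {
  const res = await fetch('/.netlify/functions/gemini', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ history: toGeminiHistory(history), systemInstruction }),
  });
  if (!res.ok) throw new Error(`Chat function failed: ${res.status}`);
  const { text } = await res.json();
  return text as string;
}
```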

POST /api/transcribe

Netlify function for audio transcription.
POST /.netlify/functions/transcribe
Content-Type: application/json

{
  "audio": "base64EncodedAudioData",
  "mimeType": "audio/webm"
}
Parameters:
  • audio (string, required) - Base64-encoded audio data
  • mimeType (string, required) - Audio MIME type
Returns:
  • response (object) - { "text": string } with the transcribed text
Status Codes:
  • 200 - Success
  • 400 - Missing audio data or mimeType
  • 405 - Method not allowed
  • 500 - Transcription error
const response = await fetch('/.netlify/functions/transcribe', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    audio: base64AudioData,
    mimeType: 'audio/webm'
  })
});

const { text } = await response.json();
console.log('Transcription:', text);

AI System Instructions

The AI assistant uses dynamic system instructions that adapt to:

Business State Context

  • Current date/time - Understands “today”, “tomorrow”, relative times
  • Open/Closed status - Different behavior based on business hours
  • Menu availability - Real-time menu from product cache
  • Table availability - Real-time reservation and order data
  • Reservation settings - Minimum booking times, durations, etc.

Conversation Modes

Normal Mode (No Lock)

When Business is Open:
  • Offers both ordering and reservation services
  • Provides menu information
  • Shows business hours
  • Guides user through either flow
When Business is Closed:
  • Informs user business is closed for orders
  • Still accepts reservations for future dates
  • Shows next opening time

Order Lock Mode

  • Single focus: Complete the current order
  • Refuses to start reservations
  • Collects: products, quantities, customer info, delivery address, payment method
  • Payment rules:
    • Delivery orders: Only “Transferencia” (Transfer)
    • Pickup orders: “Efectivo” (Cash) or “Credito” (Credit)
  • Confirms order details before generating JSON
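The payment rules above can be expressed as a small client-side guard. This is a minimal sketch; `allowedPaymentMethods` and `isValidPayment` are hypothetical names, not part of the service.

```typescript
type OrderType = 'delivery' | 'pickup';

// Hypothetical helper mirroring the payment rules above:
// delivery orders pay by transfer only; pickup orders pay cash or credit.
const allowedPaymentMethods = (type: OrderType): string[] =>
  type === 'delivery' ? ['Transferencia'] : ['Efectivo', 'Credito'];

const isValidPayment = (type: OrderType, method: string): boolean =>
  allowedPaymentMethods(type).includes(method);
```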

Reservation Lock Mode

  • Single focus: Complete the current reservation
  • Refuses to take orders
  • Collects: name, phone, guests, date, time
  • Validates against capacity and availability
  • Checks minimum booking time requirements
  • Prevents past reservations
  • Confirms details before generating JSON
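The "prevents past reservations" rule can also be checked client-side before the confirmation round-trip. A minimal sketch, assuming the "YYYY-MM-DD" and "HH:mm" formats used in the reservation JSON below; `isPastReservation` is a hypothetical helper with an injectable clock for testing.

```typescript
// Hypothetical guard mirroring the "prevents past reservations" rule.
// date is "YYYY-MM-DD", time is "HH:mm"; `now` defaults to the current time.
const isPastReservation = (
  date: string,
  time: string,
  now: Date = new Date()
): boolean => new Date(`${date}T${time}:00`) < now;
```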

JSON Response Formats

When the user confirms an order or reservation, the AI generates structured JSON:

Order JSON

{
  "intent": "ORDER",
  "customer": {
    "name": "Juan Pérez",
    "phone": "+5491123456789",
    "address": "Av. Corrientes 1234, Buenos Aires"
  },
  "items": [
    {
      "name": "Pizza Muzzarella",
      "quantity": 2,
      "price": 9200
    },
    {
      "name": "Pizza Napolitana",
      "quantity": 1,
      "price": 9700
    }
  ],
  "total": 28100,
  "type": "delivery",
  "paymentMethod": "Transferencia"
}

Reservation JSON

{
  "intent": "RESERVATION",
  "customerName": "María González",
  "customerPhone": "+5491145678901",
  "guests": 4,
  "date": "2024-03-10",
  "time": "20:30"
}
The application detects these JSON blocks in the AI response and processes them accordingly. JSON is wrapped in markdown code blocks: ```json ... ```
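Detecting and parsing these blocks can be sketched as follows; `extractIntent` is a hypothetical name, and the regex matches the fenced-JSON format described above.

```typescript
type Intent =
  | { intent: 'ORDER'; [key: string]: unknown }
  | { intent: 'RESERVATION'; [key: string]: unknown };

// Extract the first ```json ... ``` block from an AI response and
// parse it; returns null when the response is plain conversation.
const extractIntent = (responseText: string): Intent | null => {
  const match = responseText.match(/```json\s*([\s\S]*?)\s*```/);
  if (!match) return null;
  try {
    const parsed = JSON.parse(match[1]);
    return parsed.intent === 'ORDER' || parsed.intent === 'RESERVATION'
      ? (parsed as Intent)
      : null;
  } catch {
    return null; // Malformed JSON: treat as a normal message
  }
};
```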

AI Prompt Architecture

Menu Generation

const generateMenuForPrompt = (): string => {
  // Reads from product cache
  // Groups by category
  // Includes name, price, description
  // Returns JSON string
}

Schedule Formatting

const formatScheduleForPrompt = (): string => {
  // Reads business schedule
  // Formats as readable text
  // Example: "Lunes: de 18:00 a 23:00"
}

Table Information

const generateTablesForPrompt = (): string => {
  // Lists reservable tables with capacity
  // Calculates total capacity
  // Warns AI about capacity limits
}

Real-Time Availability

const generateAvailabilityForPrompt = (): string => {
  // Lists occupied/reserved tables
  // Shows time ranges
  // Instructs AI to avoid conflicts
}
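The four generators above are presumably concatenated into a single system instruction before each request. A minimal assembly sketch; `buildSystemInstruction` and the prompt wording are assumptions, not the actual prompt.

```typescript
// Hypothetical assembly of the system instruction from the four
// generators described above; the exact prompt wording is an assumption.
const buildSystemInstruction = (
  menu: string,
  schedule: string,
  tables: string,
  availability: string
): string =>
  [
    'Eres el asistente del restaurante.', // "You are the restaurant assistant."
    `Menu:\n${menu}`,
    `Horarios:\n${schedule}`,
    `Mesas:\n${tables}`,
    `Disponibilidad:\n${availability}`,
  ].join('\n\n');
```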

TypeScript Types

MessageSender Enum

enum MessageSender {
  USER = 'user',
  BOT = 'bot'
}

ChatMessage Interface

interface ChatMessage {
  sender: MessageSender;
  text: string;
}

Gemini API Request Format

interface GeminiMessage {
  role: 'user' | 'model';
  parts: Array<{ text: string }>;
}

interface GeminiRequest {
  model: 'gemini-2.5-flash';
  contents: GeminiMessage[];
  config: {
    systemInstruction: string;
  };
}

Audio Transcription Request

interface TranscriptionRequest {
  parts: [
    {
      inlineData: {
        mimeType: string;
        data: string; // base64
      };
    },
    {
      text: string; // Transcription prompt
    }
  ];
}

Configuration

Environment Variables

API_KEY=your_google_ai_api_key_here
The API_KEY must be set in environment variables (.env for development, Netlify environment settings for production).

Google GenAI Initialization

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY! });

Error Handling

Service Layer Errors

try {
  const response = await sendMessageToGemini(history);
  // Process response
} catch (error) {
  console.error("AI Error:", error instanceof Error ? error.message : error);
  // Fallback behavior
}

Netlify Function Errors

Chat Function:
{
  "error": "Failed to get response from assistant."
}
Transcribe Function:
{
  "error": "Failed to transcribe audio."
}
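When calling either function from the client, the documented status codes can be mapped to user-facing handling. A sketch with illustrative messages; `describeFunctionError` and its wording are not part of the API.

```typescript
// Hypothetical mapping of the documented status codes (400/405/500)
// to user-facing messages; wording is illustrative only.
const describeFunctionError = (status: number): string => {
  switch (status) {
    case 400:
      return 'Request was missing required fields.';
    case 405:
      return 'Wrong HTTP method: these functions only accept POST.';
    case 500:
      return 'The AI service failed: try again shortly.';
    default:
      return `Unexpected status ${status}.`;
  }
};
```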

Best Practices

Conversation Management

  1. Filter JSON from history - The service automatically filters previous JSON responses
  2. Use action locks - Lock context when user commits to an action
  3. Validate AI output - Always parse and validate generated JSON
  4. Handle errors gracefully - Provide fallback options if AI fails

Audio Transcription

  1. Optimize audio quality - Better audio = better transcription
  2. Limit audio length - Keep recordings under 1 minute for faster processing
  3. Handle silence - Check for empty transcriptions
  4. Use appropriate formats - WebM and MP3 work well

Performance

  1. Cache AI responses - For repeated questions
  2. Debounce requests - Avoid rapid-fire API calls
  3. Show loading states - AI responses can take 1-3 seconds
  4. Implement timeouts - Set reasonable timeout limits
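The timeout recommendation above can be implemented with Promise.race; `withTimeout` is a hypothetical wrapper, not part of the service.

```typescript
// Hypothetical timeout wrapper for AI calls: rejects if the wrapped
// promise does not settle within `ms` milliseconds.
const withTimeout = <T>(promise: Promise<T>, ms: number): Promise<T> =>
  Promise.race([
    promise,
    new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms)
    ),
  ]);

// Usage: const reply = await withTimeout(sendMessageToGemini(history), 10_000);
```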

Complete Example

Voice-Enabled Chat Assistant

import {
  sendMessageToGemini,
  transcribeAudio,
  MessageSender
} from './services/geminiService';
// ChatMessage is assumed to be exported from the same service module
import type { ChatMessage } from './services/geminiService';

class ChatAssistant {
  private history: ChatMessage[] = [];
  private actionLock: 'order' | 'reservation' | null = null;
  
  async initialize() {
    // Get initial greeting
    const greeting = await sendMessageToGemini([]);
    this.history.push({
      sender: MessageSender.BOT,
      text: greeting.text
    });
    return greeting.text;
  }
  
  async sendTextMessage(userText: string) {
    // Add user message
    this.history.push({
      sender: MessageSender.USER,
      text: userText
    });
    
    // Get AI response
    const response = await sendMessageToGemini(this.history, this.actionLock);
    
    // Add bot response
    this.history.push({
      sender: MessageSender.BOT,
      text: response.text
    });
    
    // Check for JSON intent
    const jsonMatch = response.text.match(/```json\s*([\s\S]*?)\s*```/);
    if (jsonMatch) {
      const intent = JSON.parse(jsonMatch[1]);
      return { type: 'intent', data: intent, text: response.text };
    }
    
    return { type: 'message', text: response.text };
  }
  
  async sendVoiceMessage(audioBlob: Blob) {
    // Convert audio to base64
    const base64 = await this.blobToBase64(audioBlob);
    const base64Data = base64.split(',')[1];
    
    // Transcribe
    const transcription = await transcribeAudio(
      base64Data,
      audioBlob.type
    );
    
    // Send as text message
    return this.sendTextMessage(transcription);
  }
  
  setActionLock(action: 'order' | 'reservation' | null) {
    this.actionLock = action;
  }
  
  reset() {
    this.history = [];
    this.actionLock = null;
  }
  
  private async blobToBase64(blob: Blob): Promise<string> {
    return new Promise((resolve, reject) => {
      const reader = new FileReader();
      reader.onloadend = () => resolve(reader.result as string);
      reader.onerror = reject;
      reader.readAsDataURL(blob);
    });
  }
}

// Usage
const chat = new ChatAssistant();

// Initialize
const greeting = await chat.initialize();
console.log('Bot:', greeting);

// Text message
const response1 = await chat.sendTextMessage("Quiero hacer un pedido");
console.log('Bot:', response1.text);

// Lock to order mode
chat.setActionLock('order');

// Voice message
const response2 = await chat.sendVoiceMessage(audioBlob);
if (response2.type === 'intent' && response2.data.intent === 'ORDER') {
  console.log('Order confirmed:', response2.data);
  // Process order...
}
