
Overview

The jøsh platform uses a combination of OpenAI and Mistral models for different AI tasks. Model selection is based on task requirements:
  • OpenAI GPT-4o-mini: Fast, cost-effective for structured extraction and vision tasks
  • Mistral Medium: Strong natural language understanding for scheduling negotiations

Model Configuration

All AI calls use the Vercel AI SDK with a unified gateway:
AI_GATEWAY_API_KEY=your_api_key_here
The AI gateway key works with both OpenAI and Mistral models through a single unified endpoint.
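Because every feature degrades gracefully when the key is absent (see Fallback Behavior below), the presence check is worth centralizing. A minimal sketch of such a guard — the helper name `hasAiGateway` is hypothetical; the codebase inlines the check per feature:

```typescript
// Hypothetical helper: true when the unified gateway key is configured.
// Each AI feature performs this check before calling a model.
function hasAiGateway(env: Record<string, string | undefined>): boolean {
  const key = env.AI_GATEWAY_API_KEY;
  return typeof key === "string" && key.length > 0;
}
```

In application code this would be called as `hasAiGateway(process.env)` at the top of each AI entry point.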

Model Usage by Feature

Profile Structuring

Model: openai/gpt-4o-mini
Purpose: Convert free-text onboarding responses into structured JSON with normalized badge phrases.
src/lib/profileStructuring.ts
const MODEL = "openai/gpt-4o-mini";

const { text } = await generateText({
  model: MODEL,
  prompt: `Convert this dating onboarding profile into structured JSON...`,
  maxOutputTokens: 700,
});
Token Budget: 700 output tokens
Why GPT-4o-mini?
  • Excellent at structured output formatting
  • Fast response times for real-time onboarding
  • Cost-effective for high-volume profile processing

Scheduling System

Model: mistral/mistral-medium
Purpose: Analyze natural language scheduling responses and generate date/time proposals.
src/lib/tpoScheduling.ts
const MODEL = "mistral/mistral-medium";

// Propose initial time slot
const { text } = await generateText({
  model: MODEL,
  prompt: `You are scheduling a first date for a dating app...`,
  maxOutputTokens: 30,
});

// Analyze scheduling response
const { text } = await generateText({
  model: MODEL,
  prompt: `You are analyzing a user's message in a dating app scheduling conversation...`,
  maxOutputTokens: 160,
});

// Suggest date spot
const { text } = await generateText({
  model: MODEL,
  prompt: `You are suggesting a first date venue for two people on a dating app...`,
  maxOutputTokens: 80,
});
Token Budgets:
  • Time proposal: 30 tokens
  • Response analysis: 160 tokens
  • Venue suggestion: 80 tokens
Why Mistral Medium?
  • Superior natural language understanding for ambiguous responses
  • Better at conversational context resolution
  • Strong reasoning for multi-turn scheduling negotiations
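The response-analysis call is expected to return JSON matching the shape of the scheduling-analysis fallback object shown under Fallback Behavior below. A hedged sketch of validating the model's reply against that shape — the type and function names here are hypothetical, not from the codebase:

```typescript
// Hypothetical: shape inferred from the scheduling-analysis fallback object.
type SchedulingAnalysis = {
  accepted: boolean;
  proposedAlternative: string | null;
  tooSoon: boolean;
  needsClarification: boolean;
  clarificationQuestion: string | null;
};

// Parse the model's JSON reply, returning null on malformed output so the
// caller can fall back to a conservative default.
function parseSchedulingAnalysis(text: string): SchedulingAnalysis | null {
  try {
    const parsed = JSON.parse(text);
    if (typeof parsed.accepted !== "boolean") return null;
    return {
      accepted: parsed.accepted,
      proposedAlternative: parsed.proposedAlternative ?? null,
      tooSoon: Boolean(parsed.tooSoon),
      needsClarification: Boolean(parsed.needsClarification),
      clarificationQuestion: parsed.clarificationQuestion ?? null,
    };
  } catch {
    return null;
  }
}
```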

Driver’s License Extraction

Model: openai/gpt-4o-mini
Purpose: Extract structured data from driver’s license images using vision capabilities.
src/lib/dlExtract.ts
const MODEL = "openai/gpt-4o-mini";

const { text } = await generateText({
  model: MODEL,
  messages: [
    {
      role: "system",
      content: `You extract structured data from US driver's license images...`,
    },
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Extract the full legal name in FIRST [MIDDLE] LAST order, plus date of birth and height from this driver's license.",
        },
        {
          type: "image",
          image: `data:image/jpeg;base64,${imageBase64}`,
        },
      ],
    },
  ],
  maxOutputTokens: 200,
});
Token Budget: 200 output tokens
Why GPT-4o-mini?
  • Native vision support for image analysis
  • Reliable OCR performance on ID documents
  • Fast enough for real-time onboarding

Photo AI Tags

Model: openai/gpt-4o-mini
Purpose: Extract appearance and style tags from user photos.
src/lib/tpoPhotoTags.ts
const MODEL = "openai/gpt-4o-mini";

const { text } = await generateText({
  model: MODEL,
  messages: [
    {
      role: "system",
      content: `Analyze this dating profile photo...`,
    },
    {
      role: "user",
      content: [
        { type: "text", text: "Analyze this photo" },
        { type: "image", image: `data:image/jpeg;base64,${imageBase64}` },
      ],
    },
  ],
  maxOutputTokens: 150,
});
Token Budget: 150 output tokens

Onboarding Answer Quality

Model: openai/gpt-4o-mini
Purpose: Evaluate whether user answers are comprehensive or need follow-up questions.
src/lib/tpoAnswerQuality.ts
const MODEL = "openai/gpt-4o-mini";

const { text } = await generateText({
  model: MODEL,
  prompt: `Evaluate this onboarding answer...`,
  maxOutputTokens: 100,
});
Token Budget: 100 output tokens

Onboarding Adlibs

Model: openai/gpt-4o-mini
Purpose: Generate personalized transition messages between onboarding questions.
src/lib/tpoOnboardingAdlib.ts
const MODEL = "openai/gpt-4o-mini";

const { text } = await generateText({
  model: MODEL,
  prompt: `Generate a brief, natural acknowledgment of the user's answer...`,
  maxOutputTokens: 30,
});
Token Budget: 30 output tokens

Fallback Behavior

All AI features degrade gracefully when the API key is missing: each checks for the key up front and returns a safe default instead of throwing. The defaults vary by feature:
// Profile structuring → empty schema
if (!process.env.AI_GATEWAY_API_KEY) {
  return {
    about: { /* all null/empty */ },
    preferences: { /* all null/empty */ }
  };
}

// Scheduling time proposal → default suggestion
if (!process.env.AI_GATEWAY_API_KEY) {
  return formatDateFriendly(addDays(today, 3)) + " at 7pm";
}

// Scheduling analysis → assume accepted
if (!process.env.AI_GATEWAY_API_KEY) {
  return {
    accepted: true,
    proposedAlternative: null,
    tooSoon: false,
    needsClarification: false,
    clarificationQuestion: null,
  };
}

// Photo tags → empty array
if (!process.env.AI_GATEWAY_API_KEY) {
  return [];
}
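The scheduling fallback above calls `addDays` and `formatDateFriendly`, which are not shown in this document. A minimal sketch of what such helpers might look like — these are assumed implementations, not the codebase's:

```typescript
// Hypothetical implementation of the addDays helper used by the scheduling fallback.
function addDays(date: Date, days: number): Date {
  const copy = new Date(date.getTime());
  copy.setDate(copy.getDate() + days);
  return copy;
}

// Hypothetical friendly formatter, e.g. "Friday, June 7".
function formatDateFriendly(date: Date): string {
  return date.toLocaleDateString("en-US", {
    weekday: "long",
    month: "long",
    day: "numeric",
  });
}
```

With helpers like these, `formatDateFriendly(addDays(today, 3)) + " at 7pm"` yields a proposal such as "Friday, June 7 at 7pm".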

Error Handling

All AI calls are wrapped in try/catch with appropriate fallbacks:
try {
  const { text } = await generateText({ /* ... */ });
  return parseResponse(text);
} catch (error) {
  console.error("[ai] Model call failed:", error);
  return SAFE_FALLBACK;
}
Errors don’t crash the application; they fall back to conservative defaults that let the user flow continue.

Timeout Handling

Some AI calls use explicit timeouts to prevent blocking:
src/app/api/tpo/webhook/route.ts
async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number
): Promise<T | null> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve(null), timeoutMs);
    promise
      .then((value) => {
        clearTimeout(timer);
        resolve(value);
      })
      .catch(() => {
        clearTimeout(timer);
        resolve(null);
      });
  });
}

// Usage: timeout after 20 seconds
const dlData = await withTimeout(
  extractDriversLicenseData(imageBase64),
  20000
);

if (dlData) {
  // Use extracted data
} else {
  console.warn("DL extraction timed out (non-blocking)");
}
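To see the contract concretely, the helper resolves to the value when the promise settles inside the budget and to null when it doesn't. The following demo reproduces `withTimeout` so the snippet runs standalone:

```typescript
// Reproduction of withTimeout, so this demo is self-contained.
async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number
): Promise<T | null> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve(null), timeoutMs);
    promise
      .then((value) => {
        clearTimeout(timer);
        resolve(value);
      })
      .catch(() => {
        clearTimeout(timer);
        resolve(null);
      });
  });
}

async function demo(): Promise<[string | null, string | null]> {
  // Settles well inside the 100ms budget → the value comes through.
  const fast = await withTimeout(Promise.resolve("ok"), 100);
  // Still pending when the 10ms budget expires → null, and the caller moves on.
  const slow = await withTimeout(
    new Promise<string>((r) => setTimeout(() => r("late"), 200)),
    10
  );
  return [fast, slow];
}
```

Note that the underlying promise is not cancelled on timeout; it continues running and its eventual result is simply discarded.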

Token Budget Summary

Feature              Model           Output Tokens   Input Type
Profile Structuring  GPT-4o-mini     700             Text
Time Proposal        Mistral Medium  30              Text
Response Analysis    Mistral Medium  160             Text
Venue Suggestion     Mistral Medium  80              Text
DL Extraction        GPT-4o-mini     200             Vision
Photo Tags           GPT-4o-mini     150             Vision
Answer Quality       GPT-4o-mini     100             Text
Onboarding Adlib     GPT-4o-mini     30              Text
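One way to keep these budgets consistent between this table and the code is a single constants map. This is a hypothetical consolidation; the codebase currently defines its budget per file:

```typescript
// Hypothetical central registry of output-token budgets from the table above.
const TOKEN_BUDGETS = {
  profileStructuring: 700,
  timeProposal: 30,
  responseAnalysis: 160,
  venueSuggestion: 80,
  dlExtraction: 200,
  photoTags: 150,
  answerQuality: 100,
  onboardingAdlib: 30,
} as const;
```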

Cost Optimization

Token Limits

All prompts specify maxOutputTokens to prevent runaway generation:
const { text } = await generateText({
  model: MODEL,
  prompt: "...",
  maxOutputTokens: 700, // Hard limit
});

Model Selection

Using GPT-4o-mini instead of GPT-4 for structured tasks reduces costs by ~90% while maintaining quality for well-defined extraction tasks.

Conditional Execution

Some features only run AI when needed:
src/app/api/tpo/webhook/route.ts
const answerWordCount = answer.trim().split(/\s+/).filter(Boolean).length;
const shouldAttemptAdlib =
  answerWordCount >= 4 &&  // Only for substantial answers
  !!previousQuestionId &&
  ADLIB_ENABLED_QUESTION_IDS.has(previousQuestionId);

const adlib = shouldAttemptAdlib
  ? await getOnboardingAdlib({ /* ... */ })
  : null;
Adlibs are skipped for short answers to save on API calls.
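The same gate can be factored into a pure helper, which makes the word-count threshold easy to test in isolation. The function name is hypothetical; the route inlines this logic:

```typescript
// Hypothetical extraction of the adlib gate shown above.
function shouldAttemptAdlib(
  answer: string,
  previousQuestionId: string | null,
  enabledQuestionIds: Set<string>
): boolean {
  const wordCount = answer.trim().split(/\s+/).filter(Boolean).length;
  return (
    wordCount >= 4 && // only for substantial answers
    previousQuestionId !== null &&
    enabledQuestionIds.has(previousQuestionId)
  );
}
```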

Response Parsing

All JSON responses handle markdown code fences:
function stripMarkdownCodeFence(raw: string): string {
  const trimmed = raw.trim();
  if (!trimmed.startsWith("```")) return trimmed;

  const lines = trimmed.split("\n");
  if (lines.length < 2) return trimmed;

  const contentLines = lines.slice(1);
  const lastLine = contentLines[contentLines.length - 1]?.trim();
  if (lastLine === "```") {
    contentLines.pop();
  }
  return contentLines.join("\n").trim();
}

// Then parse:
const cleaned = stripMarkdownCodeFence(text);
const parsed = JSON.parse(cleaned);
This handles both clean JSON and markdown-wrapped JSON responses.
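With the helper (reproduced here so the example runs standalone), both response styles parse to the same object:

```typescript
// Reproduction of stripMarkdownCodeFence, so this demo is self-contained.
function stripMarkdownCodeFence(raw: string): string {
  const trimmed = raw.trim();
  if (!trimmed.startsWith("```")) return trimmed;

  const lines = trimmed.split("\n");
  if (lines.length < 2) return trimmed;

  const contentLines = lines.slice(1);
  const lastLine = contentLines[contentLines.length - 1]?.trim();
  if (lastLine === "```") {
    contentLines.pop();
  }
  return contentLines.join("\n").trim();
}

// Clean JSON passes through untouched...
const clean = JSON.parse(stripMarkdownCodeFence('{"tags": ["casual"]}'));
// ...and a markdown-fenced reply is unwrapped before parsing.
const fenced = JSON.parse(
  stripMarkdownCodeFence('```json\n{"tags": ["casual"]}\n```')
);
```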

Best Practices

1. Always Specify Token Limits

// Good
const { text } = await generateText({
  model: MODEL,
  prompt: "...",
  maxOutputTokens: 100,
});

// Bad - no limit
const { text } = await generateText({
  model: MODEL,
  prompt: "...",
});

2. Use Appropriate Models

// Good - vision task uses GPT-4o-mini
const tags = await generateText({
  model: "openai/gpt-4o-mini",
  messages: [
    { role: "user", content: [{ type: "image", image: base64 }] },
  ],
});

// Bad - Mistral Medium doesn't support vision
const tags = await generateText({
  model: "mistral/mistral-medium",
  messages: [
    { role: "user", content: [{ type: "image", image: base64 }] },
  ],
});

3. Always Provide Fallbacks

// Good
try {
  return await aiCall();
} catch {
  return SAFE_DEFAULT;
}

// Bad - crash on failure
return await aiCall();

4. Log AI Failures

try {
  return await generateText({ /* ... */ });
} catch (error) {
  console.error("[feature] AI call failed:", error);
  return fallback;
}
Logging helps debug production issues without crashing the application.
