GTM Feedback uses semantic search with vector embeddings to automatically match customer feedback to existing feature requests. This allows the system to understand meaning rather than just matching keywords.

How semantic matching works

Semantic matching converts text into vector embeddings (numerical representations) and compares them using cosine similarity:
Customer feedback: "Users want CSV export"
         ↓ OpenAI text-embedding-3-small
    [0.234, -0.123, 0.456, ...] (384 dimensions)
         ↓ Vector similarity search
    Upstash Vector finds similar requests
         ↓ AI agent refinement
    Match with confidence score (0.0-1.0)

Vector embeddings

Embeddings are created using OpenAI’s text-embedding-3-small model with 384 dimensions:
// packages/ai/src/embeddings/index.ts
export async function createRequestEmbedding(
  title: string,
  description: string,
  apiKey: string,
): Promise<number[] | null> {
  const openai = createOpenAI({ apiKey });
  const text = `${title}\n\n${description}`;
  
  const { embedding } = await embed({
    model: openai.embeddingModel("text-embedding-3-small"),
    value: text,
    providerOptions: {
      openai: {
        dimensions: 384, // Match Upstash Vector index dimension
      },
    },
  });

  return embedding;
}
The title and description are concatenated and converted into a 384-dimensional vector. Similar requests will have vectors that are close together in this high-dimensional space.
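"Close together" is measured with cosine similarity. The production system delegates this comparison to Upstash Vector, but the math itself is simple; a standalone sketch for illustration:

```typescript
// Cosine similarity between two embedding vectors: 1.0 means identical
// direction, 0 means unrelated. Upstash Vector computes this server-side;
// this version exists only to show what "close together" means.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Two requests about the same feature produce vectors pointing in nearly the same direction, so their cosine similarity approaches 1.0.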

Storing embeddings

Embeddings are stored in Upstash Vector for fast similarity search:
// packages/ai/src/embeddings/index.ts
export async function storeRequestEmbedding(
  requestId: string,
  embedding: number[],
  url: string,
  token: string,
  metadata?: Record<string, unknown>,
): Promise<boolean> {
  const index = getVectorIndex(url, token);
  
  // Validate embedding dimension
  if (embedding.length !== 384) {
    console.error(
      `Invalid embedding dimension: expected 384, got ${embedding.length}`,
    );
    return false;
  }

  await index.upsert({
    id: requestId,
    vector: embedding,
    metadata: metadata || {},
  });

  return true;
}
Each request’s embedding is indexed by its ID, allowing fast retrieval by vector similarity.

Finding similar requests

Vector search finds the most similar requests using cosine similarity:
// packages/ai/src/embeddings/index.ts
export async function findSimilarRequests(
  embedding: number[],
  url: string,
  token: string,
  limit: number = 10,
  excludeIds: string[] = [],
): Promise<
  Array<{ id: string; score: number; metadata?: Record<string, unknown> }>
> {
  const index = getVectorIndex(url, token);

  const results = await index.query({
    vector: embedding,
    topK: limit + excludeIds.length,
    includeMetadata: true,
  });

  // Filter out excluded IDs and limit results
  return results
    .filter((result) => !excludeIds.includes(String(result.id)))
    .slice(0, limit)
    .map((result) => ({
      id: String(result.id),
      score: result.score,
      metadata: result.metadata,
    }));
}
The vector database returns similarity scores (0.0-1.0), where 1.0 is an exact match.
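Raw cosine similarity actually ranges from -1 to 1; Upstash Vector normalizes it into the 0.0-1.0 score shown here. A sketch of the mapping, based on Upstash's documented formula for the cosine metric (worth verifying against your index configuration):

```typescript
// Upstash's documented normalization for the cosine metric:
// score = (1 + cosine_similarity) / 2, mapping [-1, 1] onto [0, 1].
function toUpstashScore(cosine: number): number {
  return (1 + cosine) / 2;
}
```

This is why an unrelated (orthogonal) pair of vectors still scores 0.5 rather than 0 in query results.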

Three-tier confidence system

GTM Feedback uses a three-tier system to balance automation with human oversight:

High confidence

≥0.9 - Auto-add feedback to existing request without approval

Medium confidence

0.8 to <0.9 - Request human approval via Slack before matching

Low confidence

<0.8 - Create new feature request automatically
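The tier routing reduces to a small pure function. A sketch (the names are illustrative, not from the codebase):

```typescript
type MatchAction = "auto_add" | "request_approval" | "create_new_request";

const AUTO_MATCH_THRESHOLD = 0.9;
const APPROVAL_THRESHOLD = 0.8;

// Map a (possibly missing) top match to one of the three tiers.
// No match at all is treated the same as low confidence.
function routeByConfidence(topMatch?: { confidence: number }): MatchAction {
  if (!topMatch) return "create_new_request";
  if (topMatch.confidence >= AUTO_MATCH_THRESHOLD) return "auto_add";
  if (topMatch.confidence >= APPROVAL_THRESHOLD) return "request_approval";
  return "create_new_request";
}
```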

Implementation

The confidence thresholds are applied in the core workflow:
// apps/www/src/workflows/process-customer-entry/index.ts
export async function processCustomerFeedback(args: Args) {
  "use workflow";

  const { customerPain, userId, severity, accountId, opportunityId, links, slackUserId } =
    args;
  let requestId: string;

  // Step 1: Use search agent to match customer pain to existing requests
  const searchResult = await searchRequestsStep(customerPain);

  const AUTO_MATCH_THRESHOLD = 0.9;
  const APPROVAL_THRESHOLD = 0.8;

  // Step 2: Extract the top match (if any)
  const topMatch = searchResult.matches[0];
  const matchResult = topMatch
    ? {
        requestId: topMatch.requestId,
        confidence: topMatch.confidence,
        reason: topMatch.reason,
        title: topMatch.title,
      }
    : {
        confidence: 0,
        requestId: null,
        title: undefined,
        reason: undefined,
      };

  // Step 3: Handle three confidence tiers
  if (
    matchResult.requestId &&
    matchResult.confidence >= AUTO_MATCH_THRESHOLD
  ) {
    // Step 3a: Auto-add feedback (high confidence >= 0.9)
    requestId = matchResult.requestId;
    await createFeedbackForRequest({
      requestId,
      userId,
      severity,
      accountId,
      opportunityId,
      customerPain,
      links,
      metadata: {
        confidence: matchResult.confidence,
        reason: matchResult.reason,
        matchType: "existing_request",
      },
    });
    
    // Send confirmation DM
    const message = await generateSlackMessageStep(
      "high_confidence_match",
      { requestUrl, requestTitle },
    );
    await sendSlackDm({ slackUserId, message });
  } else if (
    matchResult.requestId &&
    matchResult.confidence >= APPROVAL_THRESHOLD
  ) {
    // Step 3b: Require approval (medium confidence 0.8 to <0.9)
    const token = `feedback_match_approval:${userId}:${Date.now()}`;
    const hook = createHook({ token });

    // Send approval request to Slack
    await sendFeedbackMatchApprovalDm({
      token,
      slackUserId,
      message: approvalMessage,
    });

    // Wait for user response (could be seconds or days)
    const approval = await hook;

    if (approval.approved) {
      // User approved - add feedback to matched request
      await createFeedbackForRequest({ ... });
    } else {
      // User declined - create new request
      const newRequest = await createRequestStep({ ... });
    }
  } else {
    // Step 3c: Create new request (low confidence < 0.8)
    const productAreas = await getAllProductAreas();
    const newRequest = await createRequestStep({
      customerPain,
      productAreas,
      metadata: {
        matchConfidence: matchResult.confidence,
        matchReason: matchResult.reason,
        matchType: "new_request",
      },
    });

    await createFeedbackForRequest({ ... });
  }
}

Why these thresholds?

The thresholds are based on empirical testing:
  • ≥0.9 - Matches at this level are almost always correct. Auto-adding saves time without introducing errors.
  • 0.8-0.9 - Matches are usually correct but benefit from human review. Approval via Slack takes seconds and prevents mistakes.
  • <0.8 - Matches are uncertain enough that creating a new request is safer than forcing a match.
You can adjust these thresholds based on your data. More conservative systems might use 0.95/0.85, while more aggressive systems might use 0.85/0.75.
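One way to make the thresholds tunable without a code change is to read them from the environment. A hypothetical sketch (the env var names are not part of the project):

```typescript
// Parse a threshold from an env var, falling back to the shipped default
// when the variable is unset or not a value in (0, 1].
function readThreshold(raw: string | undefined, fallback: number): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 && parsed <= 1 ? parsed : fallback;
}

const AUTO_MATCH_THRESHOLD = readThreshold(process.env.AUTO_MATCH_THRESHOLD, 0.9);
const APPROVAL_THRESHOLD = readThreshold(process.env.APPROVAL_THRESHOLD, 0.8);
```

Guarding the parse keeps a typo in the env var from silently disabling approvals.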

Matching algorithm

The search agent performs a two-stage matching process.

Stage 1: Vector similarity search

The search agent’s tool performs the vector search:
// packages/ai/src/agents/search/tools.ts
export const searchRequests = tool({
  description:
    "Search for request items matching a query. Uses semantic search.",
  inputSchema: z.object({
    query: z.string().describe("The search query"),
    limit: z.number().optional().default(50),
    excludeIds: z.array(z.string()).optional(),
  }),
  execute: async ({ query, limit, excludeIds }, { experimental_context }) => {
    const ctx = experimental_context as SearchToolContext;
    
    // Create embedding for semantic search
    const embedding = await ctx.createEmbedding(query);
    
    if (embedding) {
      // Perform vector similarity search
      const searchResults = await ctx.searchSimilar(
        embedding,
        limit,
        excludeIds,
      );
      
      // Fetch full request details
      const ids = searchResults.map((r) => r.id);
      const details = await ctx.fetchRequestsByIds(ids);
      
      // Merge scores with details
      return details.map((d) => ({
        ...d,
        score: searchResults.find((r) => r.id === d.id)?.score ?? 0,
      }));
    }
    
    // Fallback to recent open requests
    return await db.query.requests.findMany({
      where: (requests, { eq }) => eq(requests.status, "open"),
      orderBy: (requests, { desc }) => [desc(requests.createdAt)],
      limit: limit ?? 50,
    });
  },
});

Stage 2: AI agent refinement

The search agent analyzes vector search results and assigns confidence scores:
// packages/ai/src/agents/search/prompts.ts
export const SEARCH_INSTRUCTIONS = `
You are an AI agent that performs semantic search on customer feedback.

Your job is to find feature requests that match a search query.

Workflow:
1. If candidates are not provided, use the searchRequests tool to find matching items
2. Analyze the results and rank by relevance to the query
3. Return matches with requestId, title, description, confidence, and reason

Confidence scoring:
- 0.9-1.0: Query describes the exact same feature/issue
- 0.8-0.89: Strong match, same general feature with minor differences
- 0.7-0.79: Moderate match, related but not identical
- 0.5-0.69: Weak match but potentially relevant
- Below 0.5: Not a match, don't include

Return all matches with confidence >= 0.5, ordered by highest confidence first.
Be comprehensive but accurate - include potential matches rather than miss relevant ones.
`;
The agent considers:
  • Vector similarity scores from Stage 1
  • Semantic meaning of the query and request
  • Whether the query and request describe the same problem
  • Context and nuance that pure vector similarity might miss

Example matching flow

Input: "Users want to export their analytics data to CSV"

// Stage 1: Vector search
Vector search returns:
[
  { id: "req_123", title: "CSV Export Feature", score: 0.89 },
  { id: "req_456", title: "Data Export Options", score: 0.82 },
  { id: "req_789", title: "Analytics Dashboard", score: 0.65 },
]

// Stage 2: AI refinement
Agent analyzes and returns:
[
  {
    requestId: "req_123",
    title: "CSV Export Feature",
    confidence: 0.92, // Bumped up - exact match on CSV export
    reason: "Exact match - both describe CSV export for analytics",
  },
  {
    requestId: "req_456",
    title: "Data Export Options",
    confidence: 0.78, // Adjusted down - more general
    reason: "Related but broader - covers multiple export formats",
  },
  // req_789 excluded - not relevant enough
]

Batch embedding sync

For initial setup or re-indexing, you can sync all request embeddings in batch:
// packages/ai/src/embeddings/index.ts
export async function storeRequestEmbeddings(
  items: Array<{
    requestId: string;
    embedding: number[];
    metadata?: Record<string, unknown>;
  }>,
  url: string,
  token: string,
): Promise<Set<string>> {
  const index = getVectorIndex(url, token);
  const vectors = items.map((item) => ({
    id: item.requestId,
    vector: item.embedding,
    metadata: item.metadata || {},
  }));

  // Validate vector dimensions
  const invalidVectors = vectors.filter(
    (v) => !v.vector || v.vector.length !== 384,
  );
  if (invalidVectors.length > 0) {
    console.error(
      `Invalid vector dimensions: ${invalidVectors.length} vectors`,
    );
    return new Set();
  }

  // Upsert all vectors in parallel
  const results = await Promise.allSettled(
    vectors.map((vector) => index.upsert(vector)),
  );

  // Track which requestIds succeeded
  const successfulIds = new Set<string>();
  results.forEach((result, idx) => {
    if (result.status === "fulfilled") {
      successfulIds.add(vectors[idx].id);
    }
  });

  return successfulIds;
}
This is used by the sync-embeddings workflow to batch-process requests.
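Embedding creation can be batched too (see embedMany under Performance considerations below). A sync workflow might chunk requests before each embedding API call so it stays under the provider's batch limit; the helper here is illustrative, not from the package:

```typescript
// Split a list of items into fixed-size chunks, e.g. for batched
// embedding calls during re-indexing.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```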

Best practices

Re-index when:
  • Initial setup of the system
  • Switching embedding models or dimensions
  • Migrating to a new vector database
  • Bulk updates to request titles/descriptions
Normal updates (single request edits) are handled automatically.
The search agent returns multiple matches, but the workflow only uses the top match. You can modify process-customer-entry to:
  • Show top 3 matches for approval
  • Link related requests automatically
  • Use lower-ranked matches for “Related requests” suggestions
Monitor match accuracy and adjust thresholds:
  • If too many auto-matches are wrong, increase AUTO_MATCH_THRESHOLD to 0.95
  • If too many new requests are created, lower APPROVAL_THRESHOLD to 0.75
  • Track approval rates in Slack to find optimal values
If Upstash Vector is unavailable, the search tool falls back to recent open requests:
// Fallback: fetch recent open requests
const fallbackResults = await db.query.requests.findMany({
  where: (requests, { eq }) => eq(requests.status, "open"),
  orderBy: (requests, { desc }) => [desc(requests.createdAt)],
  limit,
});
// Agent will re-rank based on semantic similarity
The agent still performs semantic analysis, just without vector acceleration.

Performance considerations

Embedding creation
  • text-embedding-3-small: ~500ms per embedding (API round trip)
  • Batch embeddings when possible using embedMany
Vector search
  • Upstash Vector query: ~50ms for top-100 results
  • Scales to millions of vectors
Overall latency
  • Full matching pipeline: ~2-3 seconds
    • 500ms: Create embedding
    • 50ms: Vector search
    • 1-2s: AI agent analysis
Use the search agent’s streaming mode for real-time UX:
const stream = await agent.search.generate({
  query: customerPain,
  stream: true,
});

Next steps

AI agents

Learn about the search agent implementation

Workflows

See how matching is used in workflows

Upstash Vector docs

Read the Upstash Vector documentation

Architecture

Review the overall system architecture
