
Pinecone Vector Search Service

Client-side TypeScript service for Pinecone vector database operations. Provides RAG (Retrieval-Augmented Generation) context for AI generation. Location: src/services/pinecone.ts

Overview

The Pinecone service provides:
  • Knowledge Base Queries - Semantic search for brand guidelines, design patterns, writing rules
  • Document Indexing - Store and chunk documents for retrieval
  • Context Generation - Fetch relevant context for AI prompts
  • Database Management - Clear and rebuild knowledge base
All requests proxy through /api/pinecone/* endpoints, which handle embeddings and API keys.
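As a rough sketch of what the client side of that proxying might look like (the endpoint path and payload fields here are assumptions for illustration; the real contract lives in src/services/pinecone.ts and the API routes):

```typescript
// Hypothetical sketch of the proxy call shape. The browser sends only the
// raw query; the server resolves the embedding and talks to Pinecone with
// its own API key, so no secrets reach the client.
interface QueryRequest {
  query: string
  topK?: number
  filter?: { source?: string }
}

function buildQueryRequest(body: QueryRequest) {
  return {
    url: '/api/pinecone/query', // assumed endpoint under /api/pinecone/*
    method: 'POST' as const,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  }
}

// Usage sketch:
// const req = buildQueryRequest({ query: 'brand voice', topK: 3 })
// const res = await fetch(req.url, req)
```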

Knowledge Base Sources

The knowledge base contains:
| Source | Content | Use Case |
| --- | --- | --- |
| brand-manual | YBH brand guidelines, voice, colors, typography | PRF, hooks, social posts |
| anti-ai-writing-guide | Writing style rules, phrases to avoid | All text generation |
| brand-narrative | Brand story, sound bites, messaging | Social posts, hooks |
| infographic-design-database | Layout structures (28 types), content templates (10 types), visual style systems | Infographic specs |

queryKnowledgeBase()

Query the knowledge base for relevant context.
function queryKnowledgeBase(
  query: string,
  options?: {
    topK?: number
    filter?: { source?: string }
  }
): Promise<KnowledgeResult[]>

Parameters

  • query (string, required) - Search query (semantic, not keyword-based)
  • options.topK (number, default: 5) - Number of results to return (max: 20)
  • options.filter.source (string, optional) - Filter by source document (e.g., "brand-manual")

Returns

interface KnowledgeResult {
  id: string          // Chunk ID
  score: number       // Similarity score (0-1)
  text: string        // Chunk text
  source: string      // Source document
  section: string     // Section heading
}

Example

import { queryKnowledgeBase } from '@/services/pinecone'

const results = await queryKnowledgeBase(
  'brand voice and tone for IT leaders',
  { topK: 3 }
)

for (const result of results) {
  console.log(`[${result.source}] ${result.section}`)
  console.log(`Score: ${result.score.toFixed(2)}`)
  console.log(result.text)
  console.log('---')
}
Output:
[brand-manual] Voice
Score: 0.87
We don't sell. We unsell. Anti-spin, Anti-transactional, Pro-IT leader.
---
[anti-ai-writing-guide] Avoid
Score: 0.82
Don't use phrases like "In today's rapidly changing world" or "Let's dive in".
---

Specialized Query Functions

Pre-configured queries for common use cases.
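These helpers are presumably thin wrappers over queryKnowledgeBase with a fixed source filter and topK. The defaults below are inferred from the result counts and sources shown in the examples that follow, not taken from the actual implementation:

```typescript
// Assumed per-helper defaults, inferred from the documented examples
// (e.g. queryBrandGuidelines returns 3 results from brand-manual).
// Hypothetical, for illustration only.
interface QueryOptions {
  topK?: number
  filter?: { source?: string }
}

const presets: Record<string, QueryOptions> = {
  brand: { topK: 3, filter: { source: 'brand-manual' } },
  infographic: { topK: 5, filter: { source: 'infographic-design-database' } },
  writing: { topK: 3, filter: { source: 'anti-ai-writing-guide' } },
  narrative: { topK: 3, filter: { source: 'brand-narrative' } },
}

// Each helper is then roughly:
// queryBrandGuidelines(q) ~ queryKnowledgeBase(q, presets.brand)
function optionsFor(name: keyof typeof presets): QueryOptions {
  return presets[name]
}
```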

queryBrandGuidelines()

Query brand guidelines specifically.
function queryBrandGuidelines(query: string): Promise<KnowledgeResult[]>

Example

import { queryBrandGuidelines } from '@/services/pinecone'

const guidelines = await queryBrandGuidelines('color palette')
// Returns 3 results from brand-manual

queryInfographicDesign()

Query infographic design database.
function queryInfographicDesign(query: string): Promise<KnowledgeResult[]>

Example

import { queryInfographicDesign } from '@/services/pinecone'

const designs = await queryInfographicDesign('circular layout structures')
// Returns 5 results from infographic-design-database

queryWritingGuidelines()

Query writing style rules.
function queryWritingGuidelines(query: string): Promise<KnowledgeResult[]>

Example

import { queryWritingGuidelines } from '@/services/pinecone'

const rules = await queryWritingGuidelines('phrases to avoid')
// Returns 3 results from anti-ai-writing-guide

queryBrandNarrative()

Query brand story and messaging.
function queryBrandNarrative(query: string): Promise<KnowledgeResult[]>

Example

import { queryBrandNarrative } from '@/services/pinecone'

const narrative = await queryBrandNarrative('sound bites')
// Returns 3 results from brand-narrative

getGenerationContext()

Get combined context from multiple sources for AI generation.
function getGenerationContext(
  query: string,
  sources?: ('brand' | 'infographic' | 'writing' | 'narrative')[]
): Promise<string>

Parameters

  • sources (array, default: ['brand', 'writing']) - Sources to query:
      • 'brand' - Brand guidelines
      • 'infographic' - Design database
      • 'writing' - Writing rules
      • 'narrative' - Brand story

Returns

string  // Formatted context block for AI prompts

Example

import { getGenerationContext } from '@/services/pinecone'

const context = await getGenerationContext(
  'create LinkedIn post for IT leaders',
  ['brand', 'writing', 'narrative']
)

const prompt = `
Generate a LinkedIn post.

${context}

Topic: ${topicText}
`
Output:
## Knowledge Base Context

[Brand Manual - Voice]
We don't sell. We unsell. Anti-spin, Anti-transactional, Pro-IT leader.

---

[Anti Ai Writing Guide - Avoid]
Don't use phrases like "In today's rapidly changing world" or "Let's dive in".

---

[Brand Narrative - Headlines]
"Doing to IT what the iPhone did to the Blackberry"

indexDocuments()

Index new documents into the knowledge base.
function indexDocuments(
  documents: DocumentChunk[]
): Promise<{
  success: boolean
  indexed: number
  message: string
}>

Parameters

interface DocumentChunk {
  id: string          // Unique chunk ID (e.g., "brand-manual-0")
  text: string        // Chunk text (500-1000 chars)
  source: string      // Source document name
  section?: string    // Section heading
}

Example

import { indexDocuments, chunkDocument } from '@/services/pinecone'

// 1. Chunk document
const document = `
# Brand Voice

We don't sell. We unsell.
Anti-spin, Anti-transactional, Pro-IT leader.

# Color Palette

Primary: Yellow (#F7B500), Orange (#F17529), Red (#EF4136)
`

const chunks = chunkDocument(document, 'brand-manual', {
  maxChunkSize: 800,
  overlap: 100
})

// 2. Index chunks
const result = await indexDocuments(chunks)
console.log(`Indexed ${result.indexed} chunks`)

chunkDocument()

Chunk large documents for indexing.
function chunkDocument(
  text: string,
  source: string,
  options?: {
    maxChunkSize?: number
    overlap?: number
  }
): DocumentChunk[]

Parameters

  • options.maxChunkSize (number, default: 800) - Maximum characters per chunk
  • options.overlap (number, default: 100) - Overlap between chunks (prevents splitting mid-sentence)

Example

import { chunkDocument } from '@/services/pinecone'

const longDoc = `...10,000 characters...`

const chunks = chunkDocument(longDoc, 'design-guide', {
  maxChunkSize: 800,
  overlap: 100
})

console.log(`Created ${chunks.length} chunks`)

for (const chunk of chunks) {
  console.log(`Chunk ${chunk.id}: ${chunk.text.length} chars`)
  console.log(`Section: ${chunk.section}`)
}

Section Detection

Automatic section header detection:
// ALL CAPS lines
BRAND VOICE

// Markdown headers
# Brand Voice
## Color Palette

// Numbered/lettered items
A. CIRCULAR LAYOUTS
1. The Doom Loop
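A minimal sketch of how detection for these three patterns could work (hypothetical; the real heuristics inside chunkDocument may differ):

```typescript
// Hypothetical header-detection sketch covering the three patterns above.
// Not the actual chunkDocument logic.
function looksLikeSectionHeader(line: string): boolean {
  const trimmed = line.trim()
  if (trimmed.length === 0) return false
  // Markdown headers: "# Brand Voice", "## Color Palette"
  if (/^#{1,6}\s+\S/.test(trimmed)) return true
  // Numbered/lettered items: "A. CIRCULAR LAYOUTS", "1. The Doom Loop"
  if (/^[A-Z0-9]{1,3}\.\s+\S/.test(trimmed)) return true
  // ALL CAPS lines: "BRAND VOICE" (has letters, none lowercase, short)
  if (/[A-Z]/.test(trimmed) && !/[a-z]/.test(trimmed) && trimmed.length <= 60) return true
  return false
}
```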

clearKnowledgeBase()

Clear all documents from the knowledge base.
function clearKnowledgeBase(): Promise<{
  success: boolean
  message: string
}>

Example

import { clearKnowledgeBase } from '@/services/pinecone'

if (confirm('Clear entire knowledge base?')) {
  const result = await clearKnowledgeBase()
  console.log(result.message)
}
Permanent Deletion: This removes all indexed documents. You’ll need to re-index to restore content.

Embedding Configuration

Embeddings are generated via OpenRouter:
{
  model: 'openai/text-embedding-3-small',
  dimensions: 768
}
  • Model: OpenAI’s text-embedding-3-small
  • Dimensions: 768 (Pinecone index configured for this)
  • Cost: ~$0.02 per 1M tokens
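Using the ~$0.02 per 1M tokens figure, a back-of-envelope cost estimate (the chars-to-tokens ratio of 4:1 is a rough rule of thumb, not an exact tokenizer count; actual billing may vary):

```typescript
// Rough embedding-cost estimate from the ~$0.02 per 1M tokens figure above.
// Tokens are approximated as chars / 4, a common rule of thumb.
const COST_PER_MILLION_TOKENS_USD = 0.02

function estimateEmbeddingCostUSD(totalChars: number): number {
  const approxTokens = totalChars / 4
  return (approxTokens / 1_000_000) * COST_PER_MILLION_TOKENS_USD
}

// e.g. 100 chunks of 800 chars = 80,000 chars ≈ 20,000 tokens ≈ $0.0004
```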

Similarity Scores

Similarity scores range from 0 to 1:
| Score | Interpretation |
| --- | --- |
| 0.9+ | Highly relevant, exact match |
| 0.7-0.9 | Relevant, good match |
| 0.5-0.7 | Somewhat relevant |
| 0.3-0.5 | Weak relevance |
| <0.3 | Not relevant (usually filtered out) |

Filtering by Score

const results = await queryKnowledgeBase(query, { topK: 10 })

// Keep only high-quality matches
const relevant = results.filter(r => r.score > 0.5)

RAG in AI Generation

How RAG is used in content generation:

Example Flow

// 1. User requests PRF generation
const transcript = '...'

// 2. API generates embedding for RAG query
const query = transcript.slice(0, 500)
const embedding = await getEmbedding(query)

// 3. Query Pinecone for brand context
const results = await queryPinecone(embedding, {
  topK: 5,
  filter: { source: { $in: ['brand-manual', 'anti-ai-writing-guide'] } }
})

// 4. Format context
const ragContext = formatContext(results)

// 5. Append to system prompt
const fullPrompt = systemPrompt + ragContext

// 6. Send to Claude
const prf = await generateWithClaude(fullPrompt, transcript)

Best Practices

1. Semantic Queries

Write queries as natural questions or topics:
await queryKnowledgeBase('How should we write for IT leaders?')
await queryKnowledgeBase('color palette for infographics')
await queryKnowledgeBase('circular layout structures')

2. Filter by Source

Use filters to narrow results:
// For infographic generation
const designs = await queryKnowledgeBase('timeline layouts', {
  topK: 5,
  filter: { source: 'infographic-design-database' }
})

// For social posts
const guidelines = await queryKnowledgeBase('LinkedIn post style', {
  topK: 3,
  filter: { source: 'brand-manual' }
})

3. Adjust topK

More results = more context, but slower:
// Quick check (low latency)
const quick = await queryKnowledgeBase(query, { topK: 3 })

// Comprehensive (higher latency)
const detailed = await queryKnowledgeBase(query, { topK: 10 })

4. Deduplicate Results

When combining multiple sources:
const brandResults = await queryBrandGuidelines(query)
const writingResults = await queryWritingGuidelines(query)

const allResults = [...brandResults, ...writingResults]

// Remove duplicates by ID
const seen = new Set<string>()
const unique = allResults.filter(r => {
  if (seen.has(r.id)) return false
  seen.add(r.id)
  return true
})

Error Handling

All functions throw errors that should be caught:
try {
  const results = await queryKnowledgeBase(query)
} catch (error) {
  // `error` is `unknown` in strict TypeScript; narrow before reading .message
  const message = error instanceof Error ? error.message : String(error)
  if (message.includes('API error: 401')) {
    console.error('Not authenticated')
  } else if (message.includes('Query failed')) {
    console.error('Pinecone query failed:', message)
  } else {
    console.error('Unknown error:', message)
  }
}

Performance

| Operation | Latency | Notes |
| --- | --- | --- |
| Query (topK=5) | ~300-500ms | Includes embedding + Pinecone query |
| Query (topK=10) | ~400-600ms | Slightly slower |
| Index (100 chunks) | ~2-3s | Includes embedding generation |
| Clear database | ~500ms | Fast operation |

TypeScript Types

All types are exported:
import { useState } from 'react'
import type { 
  KnowledgeResult, 
  DocumentChunk 
} from '@/services/pinecone'

function MyComponent() {
  const [results, setResults] = useState<KnowledgeResult[]>([])
  // ...
}
