
Overview

Meridian’s data flow is designed for performance, real-time collaboration, and AI-powered insights. Data flows through multiple layers, each optimized for specific operations.

Complete Data Flow Diagram

1. File Upload Flow

Step-by-Step Process

Step 1: User Selects File

User drags and drops or selects a CSV/Excel file in the dashboard.

Component: src/components/dashboard/FileUpload.tsx

Supported Formats:
  • CSV (.csv)
  • Excel (.xlsx, .xls)
  • JSON (.json)
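Before requesting an upload URL, the dropzone has to reject unsupported files. A minimal sketch of that check (the helper name `isSupportedFile` is illustrative, not the actual FileUpload.tsx code):

```typescript
// Extensions accepted by the upload dropzone (from the list above)
const SUPPORTED_EXTENSIONS = ['.csv', '.xlsx', '.xls', '.json']

// Case-insensitive check on the file extension
function isSupportedFile(fileName: string): boolean {
  const dot = fileName.lastIndexOf('.')
  if (dot === -1) return false
  return SUPPORTED_EXTENSIONS.includes(fileName.slice(dot).toLowerCase())
}
```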
Step 2: Upload to Cloudflare R2

File is uploaded directly to R2 object storage.
// Generate signed upload URL
const uploadUrl = await generateUploadUrl()

// Upload file to R2
const response = await fetch(uploadUrl, {
  method: 'POST',
  body: file,
})

const { storageId } = await response.json()
File: convex/r2.ts:7
Step 3: Save Metadata to Convex

File metadata is stored in Convex database.
const fileId = await saveFile({
  storageId,
  fileName: file.name,
  fileType: file.type,
  fileSize: file.size,
})
Schema: convex/schema.ts:14-23
files: defineTable({
  storageId: v.string(),
  fileName: v.string(),
  fileType: v.string(),
  fileSize: v.number(),
  uploadedBy: v.string(),
  uploadedAt: v.number(),
  duckdbTableName: v.optional(v.string()),
  duckdbProcessed: v.optional(v.boolean()),
}).index('by_uploadedBy', ['uploadedBy'])
Step 4: Process File into DuckDB

File is loaded into DuckDB for analytical queries.
// Get file URL from R2
const url = await getFileUrl({ storageId })

// Create DuckDB table
const result = await createTableFromCSV({
  csvUrl: url,
  tableName: fileName.replace(/\.(csv|xlsx|xls)$/, ''),
})
File: src/utils/duckdb.ts:80-148

DuckDB Process:
  1. Fetch CSV from R2 URL
  2. Save to temporary file
  3. Use read_csv_auto() to infer schema
  4. Create table with inferred types
  5. Clean up temporary file
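The snippet above strips the extension, and step 5 below refers to a `sanitizedTableName`. A plausible sketch of that sanitization (illustrative, not the actual duckdb.ts implementation) also has to produce a valid SQL identifier:

```typescript
// Derive a DuckDB-safe table name from an uploaded file name.
function toTableName(fileName: string): string {
  // Drop the extension, as in createTableFromCSV above
  const base = fileName.replace(/\.(csv|xlsx|xls|json)$/i, '')
  // Replace anything outside [A-Za-z0-9_] so the name needs no quoting
  const sanitized = base.replace(/[^A-Za-z0-9_]/g, '_')
  // Unquoted identifiers may not start with a digit
  return /^[0-9]/.test(sanitized) ? `t_${sanitized}` : sanitized
}
```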
Step 5: Update File Status

Mark file as processed in Convex.
await updateDuckDBInfo({
  fileId,
  tableName: sanitizedTableName,
})
File: convex/csv.ts:111-134
Step 6: User Sees Data

Frontend automatically updates via Convex subscription.
// Real-time subscription
const { data: files } = useQuery(convexQuery(api.csv.getFiles, {}))
File appears in dashboard with “Processed” badge.

Upload Flow Code References

// Client-side upload flow
// File: src/components/dashboard/FileUpload.tsx

1. User drops file: Mantine Dropzone
2. Get upload URL: convex/r2.ts:7 (generateUploadUrl)
3. Upload to R2: fetch(uploadUrl, { body: file })
4. Save metadata: convex/csv.ts:10 (saveFile mutation)
5. Process file: src/utils/duckdb.ts:80 (createTableFromCSV)
6. Update status: convex/csv.ts:111 (updateDuckDBInfo mutation)

2. Query Execution Flow

SQL Query Path

// User types SQL in QueryEditor
const [query, setQuery] = useState('SELECT * FROM table')

// User clicks Execute
await handleExecuteQuery()

Query Execution Sequence

┌─────────────────────────────────────────────────────────────┐
│  1. User Input                                              │
│  QueryEditor component (src/components/QueryEditor.tsx)     │
└──────────────────┬──────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  2. Client Handler                                          │
│  handleExecuteQuery() in table.$table.tsx:449              │
│  - Validate query                                           │
│  - Start loading state                                      │
└──────────────────┬──────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  3. Server Function                                         │
│  queryDuckDB() via TanStack Start                           │
│  - Routes to /api/duckdb/query endpoint                    │
└──────────────────┬──────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  4. DuckDB Execution                                        │
│  src/utils/duckdb.ts:32                                     │
│  - Get DuckDB instance                                      │
│  - Execute SQL query                                        │
│  - Parse results                                            │
└──────────────────┬──────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  5. Parallel Operations                                     │
│  ├─ Log query → Convex (convex/queryLog.ts)               │
│  ├─ Broadcast notification → Convex                        │
│  └─ Generate statistics → analyzeTableWithDuckDB()         │
└──────────────────┬──────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  6. UI Update                                               │
│  - Invalidate TanStack Query cache                          │
│  - Re-render table with new data                            │
│  - Update statistics panel                                  │
│  - Show success notification                                │
└─────────────────────────────────────────────────────────────┘
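Step 5 runs logging, broadcasting, and statistics generation in parallel, and none of them should be able to fail the query itself. A hedged sketch of that fan-out (the function name is illustrative):

```typescript
// Fire the post-query side effects in parallel; a rejection in one
// (e.g. the query log) must not reject the whole batch.
async function runPostQueryTasks(
  tasks: Array<() => Promise<unknown>>,
): Promise<{ ok: number; failed: number }> {
  const results = await Promise.allSettled(tasks.map((t) => t()))
  const failed = results.filter((r) => r.status === 'rejected').length
  return { ok: results.length - failed, failed }
}
```

In the actual flow the tasks would be the Convex query-log mutation, the notification broadcast, and `analyzeTableWithDuckDB()`.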

3. AI Agent Flow

Natural Language to SQL

Step 1: User Asks Question

User types natural language in the Agent panel.
// Component: src/components/AgentEditor.tsx
const [agentInput, setAgentInput] = useState('')
// User types: "Show me top 10 customers by revenue"
Step 2: Context Preparation

Frontend gathers table context.
const response = await askGemini({
  prompt: agentInput,
  tableName: table,
  columns: data.columns,
  sampleRows: data.rows.slice(0, 3),
  mode: 'query', // or 'analysis'
})
File: convex/table_agent.ts:496
Step 3: AI Generates SQL

Gemini generates DuckDB SQL queries.
// Agent constructs contextual prompt
const contextualPrompt = `
TABLE CONTEXT:
- Table Name: ${tableName}
- Columns: ${describeColumns(columns)}
- Sample data: ${JSON.stringify(sampleRows)}

USER REQUEST:
${prompt}

Please write DuckDB SQL queries.
`

// Stream structured response
const response = await agent.streamObject(ctx, thread, {
  prompt: contextualPrompt,
  schema: z.object({
    commands: z.array(z.string()).min(1).max(10),
    description: z.string(),
  }),
})
File: convex/table_agent.ts:610-711
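The zod schema above constrains the model's output to between 1 and 10 SQL commands plus a description. The same contract, written as a plain type guard purely for illustration:

```typescript
type AgentResponse = { commands: string[]; description: string }

// Mirrors the zod schema: commands is 1-10 strings, description is a string.
function isAgentResponse(value: unknown): value is AgentResponse {
  if (typeof value !== 'object' || value === null) return false
  const r = value as Record<string, unknown>
  return (
    Array.isArray(r.commands) &&
    r.commands.length >= 1 &&
    r.commands.length <= 10 &&
    r.commands.every((c) => typeof c === 'string') &&
    typeof r.description === 'string'
  )
}
```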
Step 4: Execute Generated Queries

Queries are queued and executed sequentially.
// Set up command queue
setCommandQueue(response.commands)
setCurrentCommandIndex(0)
setQuery(response.commands[0])

// User clicks Execute to run each query
File: src/routes/_authed/table.$table.tsx:73-75
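Advancing through the queue after each execution can be sketched as a small pure helper (names are illustrative; the route component keeps this state in React hooks):

```typescript
type QueueStep =
  | { done: false; index: number; query: string }
  | { done: true; index: number; query: null }

// Move to the next generated command, or report the queue as finished.
function advanceQueue(queue: string[], currentIndex: number): QueueStep {
  const next = currentIndex + 1
  if (next < queue.length) {
    return { done: false, index: next, query: queue[next] }
  }
  return { done: true, index: currentIndex, query: null }
}
```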
Step 5: Store Conversation

Agent messages are persisted for history.
// Create or update thread
await ctx.runMutation(api.agent_utils.insertAgentMessageRecord, {
  threadId: thread._id,
  role: 'user',
  content: prompt,
})

await ctx.runMutation(api.agent_utils.insertAgentMessageRecord, {
  threadId: thread._id,
  role: 'assistant',
  content: description,
  commands: commands,
})
Schema: convex/schema.ts:63-102

Analysis Mode with Tools

When in analysis mode, the agent uses tools to explore data:
// Define available tools
const analysis_agent = new Agent(components.agent, {
  name: 'analysis_agent',
  languageModel: model,
  tools: {
    queryDuckDB,        // Execute SQL
    getTableSchema,     // Get columns
    getSampleRows,      // Get sample data
    createChart,        // Generate visualization
    generateInsights,   // Analyze patterns
    firecrawlSearch,    // Search web
    scrapeWebPage,      // Get web content
    analyzeDataQuality, // Quality checks
  },
})

// Agent decides which tools to use
const response = await agent.streamText(ctx, thread, {
  prompt: contextualPrompt,
})
Tool Execution Flow:
User asks: "What are the data quality issues?"

Agent calls: analyzeDataQuality({ tableName })

Tool executes: Multiple SQL queries to check:
  - Null percentages
  - Duplicate values
  - Empty strings
  - Data type consistency

Tool returns: {
  issues: [...],
  qualityScore: 85,
}

Agent synthesizes: "Found 3 issues:
  1. 15% null values in email column
  2. Duplicate IDs in user_id
  3. Empty strings in address field"
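The per-check SQL behind a tool like `analyzeDataQuality` can be sketched as plain query builders (illustrative sketches, not the actual agent tool code; DuckDB supports `FILTER` on aggregates):

```typescript
// Percentage of NULLs in a column.
function nullPercentQuery(table: string, column: string): string {
  return `SELECT 100.0 * COUNT(*) FILTER (WHERE "${column}" IS NULL) / COUNT(*) AS null_pct FROM "${table}"`
}

// Values that appear more than once in a supposedly unique column.
function duplicateQuery(table: string, column: string): string {
  return `SELECT "${column}", COUNT(*) AS n FROM "${table}" GROUP BY "${column}" HAVING COUNT(*) > 1`
}
```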

4. Insights Generation Flow

Statistical Analysis Pipeline

User clicks “Generate Insights” button.
const handleGenerateInsights = async () => {
  await generateInsightsForData(data, false)
}
File: src/routes/_authed/table.$table.tsx:620
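The pipeline itself lives in the route component; as an illustration of the kind of statistic it derives, here is a minimal null-rate insight computed over sampled rows (all names here are hypothetical):

```typescript
type Insight = { column: string; message: string }

// Flag columns whose null rate in the sampled rows exceeds a threshold.
function nullRateInsights(
  rows: Array<Record<string, unknown>>,
  threshold = 0.1,
): Insight[] {
  if (rows.length === 0) return []
  const insights: Insight[] = []
  for (const column of Object.keys(rows[0])) {
    const nulls = rows.filter(
      (r) => r[column] === null || r[column] === undefined,
    ).length
    const rate = nulls / rows.length
    if (rate > threshold) {
      insights.push({
        column,
        message: `${Math.round(rate * 100)}% null values in ${column}`,
      })
    }
  }
  return insights
}
```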

5. Real-time Collaboration Flow

Presence & Notifications

Meridian supports real-time collaboration through Convex subscriptions.
// Subscribe to notifications for a table
const notifications = useQuery(
  api.notifications.getNotifications,
  { tableName, limit: 20 }
)

// Component: src/components/TableNotifications.tsx:1

Notification Types

Query Execution

When a user executes a SQL query
type: 'query',
message: 'executed a query: SELECT...'

AI Agent Query

When AI generates SQL
type: 'agent_query',
message: 'asked AI to generate query'

AI Analysis

When AI analyzes data
type: 'agent_analysis',
message: 'asked AI for analysis'

Insights Generated

When insights are created
type: 'insights_generated',
message: 'generated 5 insights'

Chart Created

When visualization is generated
type: 'chart_created',
message: 'created 2 charts'
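The message strings above follow a small per-type template. A hedged sketch of building them (function names are assumptions, not the actual notifications code):

```typescript
type NotificationType =
  | 'query'
  | 'agent_query'
  | 'agent_analysis'
  | 'insights_generated'
  | 'chart_created'

// Collapse whitespace and truncate a SQL string for the notification feed.
function truncateSql(sql: string, max = 40): string {
  const oneLine = sql.replace(/\s+/g, ' ').trim()
  return oneLine.length > max ? `${oneLine.slice(0, max)}...` : oneLine
}

// Build the human-readable message for each notification type.
function notificationMessage(type: NotificationType, detail: string): string {
  switch (type) {
    case 'query':
      return `executed a query: ${truncateSql(detail)}`
    case 'agent_query':
      return 'asked AI to generate query'
    case 'agent_analysis':
      return 'asked AI for analysis'
    case 'insights_generated':
      return `generated ${detail} insights`
    case 'chart_created':
      return `created ${detail} charts`
  }
}
```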

Real-time Update Flow

User A executes query

Broadcast notification to Convex

Convex triggers subscription update

All clients subscribed to table receive notification

User B sees: "User A executed a query: SELECT..."

User B's data automatically refreshes
Implementation:
// File: src/components/TableNotifications.tsx
const handleRemoteQueryExecuted = useCallback(() => {
  // Invalidate cache to refetch data
  queryClient.invalidateQueries({ queryKey: ['tables', table] })
  setPageIndex(0)
}, [queryClient, table])

// Listen for notifications
if (notification.type === 'query' && notification.userId !== currentUserId) {
  handleRemoteQueryExecuted()
}

6. Chart Generation Flow

From AI Tool to Visualization

Step 1: AI Calls createChart Tool

During analysis, agent decides to create visualization.
// File: convex/agent_tools.ts
export const createChart = {
  description: 'Create a chart visualization',
  parameters: z.object({
    query: z.string(),
    chartType: z.enum(['line', 'bar', 'area', 'pie']),
    title: z.string(),
    xAxis: z.string(),
    yAxis: z.string(),
  }),
  execute: async (args) => {
    // Execute query
    // Analyze data structure
    // Return chart configuration
  }
}
Step 2: Tool Executes Query

Chart-specific SQL query runs.
const result = await ctx.runAction(
  api.table_agent.fetchDuckDBQuery,
  { query: args.query }
)
Step 3: Generate Chart Config

Tool returns Recharts configuration.
return {
  success: true,
  chart: {
    type: 'line',
    title: 'Revenue Over Time',
    data: chartData,
    xAxisKey: 'date',
    yAxisKey: 'revenue',
    series: [{ name: 'revenue', color: 'blue' }],
  }
}
Step 4: Extract from Tool Steps

Frontend extracts chart configs from agent messages.
// File: src/routes/_authed/table.$table.tsx:183-227
const extractedCharts = useMemo(() => {
  const charts = []
  agentMessages.forEach((message) => {
    if (message.toolSteps) {
      message.toolSteps.forEach((step, stepIndex) => {
        if (step.tool === 'createChart' && step.result?.chart) {
          charts.push({
            id: `chart-${message.createdAt}-${stepIndex}`,
            config: step.result.chart,
            position: { x: 0, y: 0 }, // initial position; user can drag later
          })
        }
      })
    }
  })
  return charts
}, [agentMessages])
Step 5: Render Chart

Charts displayed in canvas.
// Component: src/components/ChartCanvas.tsx
<ChartCanvas
  charts={charts}
  onRemoveChart={(id) => setCharts(prev => prev.filter(c => c.id !== id))}
  onChartMove={(id, position) => { /* update position */ }}
/>
Step 6: Auto-refresh on Data Change

Charts re-execute queries when table data updates.
useEffect(() => {
  // Detect data changes (previous version is kept in a ref, omitted here)
  const currentDataVersion = JSON.stringify({
    rowCount: data.rows.length,
    sampleHash: JSON.stringify(data.rows.slice(0, 3)),
  })

  // Re-execute all chart queries
  if (previousDataVersion !== currentDataVersion) {
    updateCharts()
  }
}, [data])
File: src/routes/_authed/table.$table.tsx:283-447

Performance Optimizations

Caching Strategy

Query results cached for 5 minutes.
queryClient.setQueryDefaults(['tables'], {
  staleTime: 5 * 60 * 1000,
  cacheTime: 10 * 60 * 1000,
})
AI-generated insights cached in Convex.
// Check cache before generating
const cached = await ctx.db
  .query('insightsCache')
  .withIndex('by_cacheKey', (q) => q.eq('cacheKey', cacheKey))
  .first()
Single DuckDB instance reused across requests.
let duckDBInstance: DuckDBInstance | null = null

export const getDuckDB = async () => {
  if (!duckDBInstance) {
    duckDBInstance = await DuckDBInstance.create(url)
  }
  return duckDBInstance
}

Streaming Optimizations

AI Responses Stream Incrementally:
// Stream text responses
for await (const chunk of stream.fullStream) {
  if (chunk.type === 'text-delta') {
    assistantText += chunk.text
    // Update UI immediately
    await updateMessage({ content: assistantText })
  }
}
Large Datasets Paginated:
const paginatedRows = useMemo(() => {
  const start = pageIndex * pageSize
  const end = start + pageSize
  return data.rows.slice(start, end)
}, [data.rows, pageIndex, pageSize])

Error Handling

Graceful Degradation

try {
  const result = await queryDuckDB({ data: query })
  await logQuery({ query, success: true })
} catch (err) {
  // Log error
  await logQuery({ query, success: false, error: err.message })
  // Show user-friendly message
  setError('Query failed. Please check your SQL syntax.')
  // Don't crash the app
}

Next Steps

Architecture Overview

Understand the overall system design

Tech Stack

Explore detailed technology information
