Screen capture is the core technology that powers Interview Copilot. It enables Tabby to “see” the coding problem on your screen and provide intelligent analysis.

How Screen Capture Works

When you trigger screen capture, Tabby:
  1. Captures the Active Window: Takes a screenshot of your current screen using Electron's native capture API
  2. Encodes the Image: Converts the screenshot to a base64-encoded image for transmission
  3. Sends to Vision Model: Transmits the image to an AI vision model (GPT-4 Vision, Claude 3.5 Sonnet, etc.)
  4. Analyzes Content: The AI extracts the problem statement, constraints, examples, and requirements
  5. Generates Structured Response: Creates comprehensive analysis across all tabs (Idea, Code, Walkthrough, etc.)
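The first two steps can be sketched in isolation. Here is a minimal, dependency-free illustration of turning a captured image buffer into a base64 data URL ready for JSON transport (in the real app the buffer would come from Electron's capture API; the names here are illustrative, not Tabby's internals):

```typescript
// Illustrative sketch: encode a captured PNG buffer as a base64 data URL.
// A data URL keeps the binary image safe inside a JSON request body.
function toDataUrl(pngBuffer: Buffer): string {
  return `data:image/png;base64,${pngBuffer.toString("base64")}`;
}

// Stand-in for a real screenshot buffer (just the PNG magic bytes).
const fakePng = Buffer.from([0x89, 0x50, 0x4e, 0x47]);
const dataUrl = toDataUrl(fakePng);
// dataUrl starts with "data:image/png;base64,iVBORw..."
```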

Triggering Screen Capture

Analyze New Problem

Press Alt+X to capture and analyze a coding problem:
const handleAnalyze = async () => {
  const screenshot = await window.electron.captureScreen();
  sendMessage(
    "Analyze this coding problem. Provide Idea, Code, Walkthrough, and Test Cases.",
    { screenshot }
  );
};
Make sure the coding problem is clearly visible on your screen before pressing Alt+X. The AI needs to see the problem statement, constraints, and examples.

Update with New Constraints

Press Alt+Shift+X to update the analysis when constraints change:
const handleUpdate = async () => {
  const screenshot = await window.electron.captureScreen();
  sendMessage(
    "The interviewer added new constraints. Update the analysis for all sections.",
    { screenshot }
  );
};

Get Code Suggestions

Press Alt+N to get improvement suggestions:
const handleCodeSuggestion = async () => {
  const screenshot = await window.electron.captureScreen();
  sendMessage(
    "Suggest improvements to the current code approach. Focus on optimization and clean code.",
    { screenshot }
  );
};
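Conceptually, the three hotkeys above form a dispatch table from accelerator strings to handlers. A minimal sketch of that mapping (in the real app Electron's globalShortcut module would do the registration; the `register`/`trigger` helpers here are illustrative stand-ins):

```typescript
// Illustrative dispatch table: accelerator string -> handler.
type Handler = () => void;
const shortcuts = new Map<string, Handler>();

function register(accelerator: string, handler: Handler): void {
  shortcuts.set(accelerator, handler);
}

// Returns false when no handler is bound to the accelerator.
function trigger(accelerator: string): boolean {
  const handler = shortcuts.get(accelerator);
  if (!handler) return false;
  handler();
  return true;
}

// Wire up the three actions (handlers stubbed for this sketch).
const fired: string[] = [];
register("Alt+X", () => fired.push("analyze"));
register("Alt+Shift+X", () => fired.push("update"));
register("Alt+N", () => fired.push("suggest"));

trigger("Alt+X");
trigger("Alt+N");
```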

What the AI Sees

The vision model analyzes your screenshot for:

  • Problem Statement: The main problem description and what you need to solve
  • Constraints: Time/space complexity requirements and input limits
  • Examples: Sample inputs and expected outputs
  • Follow-ups: Additional questions or edge cases mentioned

Custom Prompts with Screenshots

You can also provide custom prompts with optional screenshot capture:
// With screenshot
handleCustomPrompt(
  "Focus on the time complexity optimization for this problem",
  { includeScreenshot: true }
);

// Without screenshot
handleCustomPrompt(
  "Explain the trade-offs between HashMap and TreeMap here",
  { includeScreenshot: false }
);
Use the prompt input at the bottom of the Interview Copilot panel to enter custom instructions. Toggle the camera icon to include/exclude screenshots.
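A plausible shape for the toggle logic behind `handleCustomPrompt` is shown below. This is a sketch, not Tabby's actual implementation: the capture and send functions are synchronous stand-ins (the real Electron capture call is async), and the message shape is assumed.

```typescript
// Sketch of the custom-prompt flow; names and shapes are illustrative.
type SentMessage = { text: string; screenshot?: string };
const sent: SentMessage[] = [];

// Stand-in for window.electron.captureScreen (async in the real app).
const captureScreen = (): string => "base64-image-data";

// Stand-in for the chat layer's sendMessage.
const sendMessage = (text: string, opts: { screenshot?: string } = {}): void => {
  sent.push({ text, ...opts });
};

function handleCustomPrompt(
  prompt: string,
  opts: { includeScreenshot: boolean }
): void {
  if (opts.includeScreenshot) {
    // Capture only when the camera toggle is on.
    sendMessage(prompt, { screenshot: captureScreen() });
  } else {
    sendMessage(prompt);
  }
}

handleCustomPrompt("Focus on time complexity", { includeScreenshot: true });
handleCustomPrompt("Explain HashMap vs TreeMap", { includeScreenshot: false });
```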

Technical Implementation

Backend API Route

The screen capture is processed by the Interview Copilot API:
export async function POST(req: Request) {
  const { messages, conversationId, screenshot } = await req.json();
  // Convert chat history into the model's message format
  const modelMessages = [...messages];
  
  // Attach screenshot to the last message
  if (screenshot) {
    const lastMsg = modelMessages[modelMessages.length - 1];
    if (lastMsg && lastMsg.role === 'user') {
      lastMsg.content = [
        { type: 'text', text: lastMsg.content },
        { type: 'image', image: screenshot }
      ];
    }
  }
  
  // Stream response with structured analysis
  return streamText({
    model: myProvider.languageModel(defaultModel),
    messages: modelMessages,
    output: Output.object({ schema: analysisSchema }),
    // ...
  });
}

Analysis Schema

The AI generates structured output matching this schema:
const analysisSchema = z.object({
  idea: z.string().describe('Problem understanding, key observations, approaches'),
  code: z.string().describe('Clean, well-commented implementation code'),
  walkthrough: z.string().describe('Step-by-step explanation of the solution'),
  testCases: z.array(z.object({
    input: z.string(),
    output: z.string(),
    reason: z.string(),
  })).describe('Edge cases and test inputs'),
  mistakes: z.array(z.object({
    mistake: z.string(),
    correction: z.string(),
    pattern: z.string(),
  })).describe('Common mistakes for this problem type'),
  memories: z.array(z.object({
    memory: z.string(),
    createdAt: z.string(),
  })).describe('Relevant memories about user preferences'),
});
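For reference, the inferred TypeScript shape of that output, plus a small consumer. Plain interfaces are used here to keep the sketch dependency-free; in the app itself the type would come from `z.infer<typeof analysisSchema>`, and the formatter is a hypothetical example, not Tabby code:

```typescript
// Plain TypeScript mirror of analysisSchema's inferred type.
interface TestCase { input: string; output: string; reason: string; }
interface Analysis {
  idea: string;
  code: string;
  walkthrough: string;
  testCases: TestCase[];
  mistakes: { mistake: string; correction: string; pattern: string }[];
  memories: { memory: string; createdAt: string }[];
}

// Example consumer: render entries for the Test Cases tab.
function formatTestCases(analysis: Analysis): string[] {
  return analysis.testCases.map(
    (tc) => `${tc.input} -> ${tc.output} (${tc.reason})`
  );
}

const sample: Analysis = {
  idea: "Two-pointer scan",
  code: "// ...",
  walkthrough: "Move pointers inward...",
  testCases: [{ input: "[]", output: "0", reason: "empty input" }],
  mistakes: [],
  memories: [],
};
const lines = formatTestCases(sample);
// lines[0] === "[] -> 0 (empty input)"
```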

Best Practices

For Clear Capture

  1. Full Problem Visible: Ensure the entire problem statement fits on screen before capturing
  2. Remove Distractions: Close unnecessary windows or notifications that might confuse the AI
  3. Good Contrast: Use a readable theme with good contrast between text and background
  4. Zoom if Needed: If text is too small, zoom in before capturing

Common Issues

Tabby captures the currently active window. Make sure the coding problem window is focused before pressing Alt+X.
Try:
  • Re-capture with better visibility
  • Use the Chat tab to clarify specific points
  • Provide a custom prompt with more context
Screen capture speed depends on:
  • Your screen resolution (lower = faster)
  • Network speed (if using cloud AI)
  • AI model choice (faster models available in settings)

Privacy & Security

Important Privacy Information
  • Screenshots are sent to your configured AI provider (OpenAI, Anthropic, etc.)
  • Images are not stored locally after processing
  • Screenshots are not saved to your conversation history by default
  • Be mindful of sensitive information visible on screen
  • Consider using local AI models for maximum privacy

Data Flow

  1. Local Capture: Screenshot taken on your machine
  2. Encoding: Converted to base64 for transmission
  3. API Call: Sent to AI provider’s vision endpoint
  4. Processing: AI analyzes and generates response
  5. Deletion: Screenshot discarded after analysis
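Step 2's base64 encoding inflates the payload by roughly 4/3, which matters for upload time on slow connections. A quick way to estimate the transmitted size from the raw screenshot size (an illustrative helper, not part of Tabby):

```typescript
// Estimate base64-encoded size: every 3 input bytes become 4 output
// characters, with padding rounding up to a multiple of 4.
function base64Size(rawBytes: number): number {
  return Math.ceil(rawBytes / 3) * 4;
}

// A ~1 MB PNG screenshot grows to ~1.33 MB on the wire.
const encoded = base64Size(1_000_000);
```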
Only the text analysis results are saved to your conversation history, not the original screenshots.

Performance Tips

Optimize Capture Speed

  • Use faster models: Select models optimized for vision in Settings
  • Reduce resolution: Lower screen resolution = smaller images = faster upload
  • Local models: Use Ollama or LM Studio for zero network latency
  • Limit content: Focus on the problem area, not full screen

API Cost Optimization

GPT-4 Vision charges per image token. Costs vary by resolution:
  • 1024x1024: ~$0.01 per analysis
  • 512x512: ~$0.003 per analysis
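Using the rough per-analysis figures above, a back-of-envelope monthly estimate is straightforward. Note these prices are the approximations quoted above, not current provider pricing, so check your provider's pricing page before budgeting:

```typescript
// Back-of-envelope monthly cost from the approximate figures above
// (~$0.01 per analysis at 1024x1024, ~$0.003 at 512x512).
const costPerAnalysis: Record<string, number> = {
  "1024x1024": 0.01,
  "512x512": 0.003,
};

function monthlyCost(resolution: string, analysesPerDay: number): number {
  return costPerAnalysis[resolution] * analysesPerDay * 30;
}

// 20 analyses/day at 512x512 is roughly $1.80/month.
const estimate = monthlyCost("512x512", 20);
```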

Next Steps

  • Explore Tabs: Learn about all seven analysis tabs
  • Memory System: How memories enhance screen analysis
  • Settings: Configure vision models and providers
  • Troubleshooting: Common issues and solutions
