How Screen Capture Works
When you trigger screen capture, Tabby:
Captures the Active Window
Takes a screenshot of your current screen using Electron’s native capture API
Sends to Vision Model
Transmits the image to an AI vision model (GPT-4 Vision, Claude 3.5 Sonnet, etc.)
Triggering Screen Capture
Analyze New Problem
Press Alt+X to capture and analyze a coding problem.
Make sure the coding problem is clearly visible on your screen before pressing Alt+X. The AI needs to see the problem statement, constraints, and examples.
Update with New Constraints
Press Alt+Shift+X to update the analysis when constraints change.
Get Code Suggestions
Press Alt+N to get improvement suggestions.
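The three shortcuts above can be summarized as a small lookup table. This is a minimal sketch, not Tabby's actual source: the `CaptureAction` names and the `resolveShortcut` helper are hypothetical, and only the key combinations come from this page.

```typescript
// Hypothetical action names; only the key combinations are documented.
type CaptureAction = "analyze" | "update" | "suggest";

// Shortcut strings written in Electron's accelerator notation.
const SHORTCUTS: Record<string, CaptureAction> = {
  "Alt+X": "analyze",       // capture and analyze a new problem
  "Alt+Shift+X": "update",  // re-analyze after constraints change
  "Alt+N": "suggest",       // ask for improvement suggestions
};

// Returns the action bound to a shortcut, or undefined if none is bound.
function resolveShortcut(accelerator: string): CaptureAction | undefined {
  return Object.prototype.hasOwnProperty.call(SHORTCUTS, accelerator)
    ? SHORTCUTS[accelerator]
    : undefined;
}
```

In an Electron main process, each key in this table could be passed to `globalShortcut.register(accelerator, callback)`. How Tabby wires its shortcuts internally is not shown here.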
What the AI Sees
The vision model analyzes your screenshot for:
Problem Statement
The main problem description and what you need to solve
Constraints
Time/space complexity requirements and input limits
Examples
Sample inputs and expected outputs
Follow-ups
Additional questions or edge cases mentioned
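To make the four categories concrete, here is one way the extracted result could be typed and checked before use. This is a sketch under assumptions: the `ScreenAnalysis` and `ProblemExample` shapes and all field names are hypothetical, chosen to mirror the list above, and may not match Tabby's real schema.

```typescript
// Hypothetical shape mirroring the four categories listed above.
interface ProblemExample {
  input: string;
  expectedOutput: string;
}

interface ScreenAnalysis {
  problemStatement: string;
  constraints: string[];
  examples: ProblemExample[];
  followUps: string[];
}

function isStringArray(x: unknown): x is string[] {
  return Array.isArray(x) && x.every((s) => typeof s === "string");
}

function isExample(x: unknown): x is ProblemExample {
  if (typeof x !== "object" || x === null) return false;
  const e = x as Record<string, unknown>;
  return typeof e.input === "string" && typeof e.expectedOutput === "string";
}

// Type guard: true only if every field is present with the expected type.
function isScreenAnalysis(value: unknown): value is ScreenAnalysis {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.problemStatement === "string" &&
    isStringArray(v.constraints) &&
    Array.isArray(v.examples) &&
    v.examples.every(isExample) &&
    isStringArray(v.followUps)
  );
}
```

A guard like this lets the caller reject a malformed model response early, for example when the model omits the examples because they were cropped out of the screenshot.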
Custom Prompts with Screenshots
You can also provide a custom prompt, optionally with a screenshot attached.
Technical Implementation
Backend API Route
The screen capture is processed by the Interview Copilot API.
Analysis Schema
The AI generates structured output that matches a fixed schema.
Best Practices
For Clear Capture
Common Issues
Screenshot shows wrong window
Tabby captures the currently active window. Make sure the coding problem window is focused before pressing Alt+X.
AI misunderstands the problem
Try:
- Re-capture with better visibility
- Use the Chat tab to clarify specific points
- Provide a custom prompt with more context
Capture is too slow
Screen capture speed depends on:
- Your screen resolution (lower = faster)
- Network speed (if using cloud AI)
- AI model choice (faster models available in settings)
Privacy & Security
Data Flow
- Local Capture: Screenshot taken on your machine
- Encoding: Converted to base64 for transmission
- API Call: Sent to AI provider’s vision endpoint
- Processing: AI analyzes and generates response
- Deletion: Screenshot discarded after analysis
Only the text analysis results are saved to your conversation history, not the original screenshots.
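The encoding and API-call steps can be sketched as follows. This is an illustration, not Tabby's implementation: `toBase64`, `buildVisionMessage`, and the `VisionMessage` type are hypothetical names, and the message layout follows the OpenAI-style `image_url` content part, which other providers may not share.

```typescript
// Content parts in the OpenAI-style chat format (an assumption for this sketch).
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionMessage {
  role: "user";
  content: ContentPart[];
}

// Encoding step: raw PNG bytes to a base64 string.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// API-call step: build the user message; the screenshot is optional,
// which also covers custom prompts sent without a capture.
function buildVisionMessage(prompt: string, png?: Uint8Array): VisionMessage {
  const content: ContentPart[] = [{ type: "text", text: prompt }];
  if (png !== undefined) {
    const url = "data:image/png;base64," + toBase64(png);
    content.push({ type: "image_url", image_url: { url } });
  }
  return { role: "user", content };
}
```

Because the image lives only inside the request message, dropping the message after the response arrives is enough to discard the screenshot. Only the returned text would need to be stored.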
Performance Tips
Optimize Capture Speed
- Use faster models: Select models optimized for vision in Settings
- Reduce resolution: Lower screen resolution = smaller images = faster upload
- Local models: Use Ollama or LM Studio for zero network latency
- Limit content: Focus on the problem area, not full screen
API Cost Optimization
- OpenAI
GPT-4 Vision charges per image token. Costs vary by resolution:
- 1024x1024: ~$0.01 per analysis
- 512x512: ~$0.003 per analysis
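The relationship between resolution and cost can be approximated in code. This sketch assumes the tile-based accounting OpenAI has published for its GPT-4 vision models in high-detail mode (85 base tokens plus 170 tokens per 512-pixel tile, after downscaling) and an assumed price of $0.01 per 1,000 input tokens. Both the constants and the function names are illustrative, so check your provider's current pricing before relying on the numbers.

```typescript
// Assumed constants: verify against the provider's current pricing page.
const BASE_TOKENS = 85;
const TOKENS_PER_TILE = 170;
const ASSUMED_USD_PER_1K_INPUT_TOKENS = 0.01;

// Estimates image tokens for a screenshot of the given pixel size.
function estimateImageTokens(width: number, height: number): number {
  // Step 1: shrink to fit within 2048 x 2048 (never upscale).
  const fit = Math.min(1, 2048 / Math.max(width, height));
  let w = width * fit;
  let h = height * fit;
  // Step 2: shrink so the shortest side is at most 768 pixels.
  const shrink = Math.min(1, 768 / Math.min(w, h));
  w *= shrink;
  h *= shrink;
  // Step 3: count the 512 x 512 tiles needed to cover the image.
  const tiles = Math.ceil(w / 512) * Math.ceil(h / 512);
  return BASE_TOKENS + TOKENS_PER_TILE * tiles;
}

// Converts the token estimate to US dollars at the assumed rate.
function estimateImageCostUsd(width: number, height: number): number {
  const tokens = estimateImageTokens(width, height);
  return (tokens / 1000) * ASSUMED_USD_PER_1K_INPUT_TOKENS;
}
```

Under these assumptions a 1024x1024 capture comes to 765 tokens (about $0.008) and a 512x512 capture to 255 tokens (about $0.003), in line with the figures above.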
Next Steps
Explore Tabs
Learn about all seven analysis tabs
Memory System
How memories enhance screen analysis
Settings
Configure vision models and providers
Troubleshooting
Common issues and solutions