Features
- Advanced Reasoning: Optimized for complex problem-solving
- Cost-Effective Vision: Uses free local OCR (Tesseract.js) instead of expensive vision APIs
- High-Level Analysis: Excellent for analytical and technical tasks
- ChatGPT-Compatible API: Standard OpenAI-compatible endpoint
Setup
Get API Key
Obtain your K2 Think API key from the K2 Think service provider.
K2 Think uses a specialized API endpoint separate from OpenAI or other providers.
Configuration
Environment Variables
| Variable | Required | Description | Default |
|---|---|---|---|
| K2_THINK_API_KEY | Yes | Your K2 Think API key | - |
| USE_K2_THINK | Yes | Enable K2 Think mode | false |
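As a minimal sketch of how these two variables could be read and validated, the helper below takes an environment-like object and returns a parsed configuration. The names `loadK2Config` and `K2Config` are hypothetical, and the rule that only the string `true` enables K2 Think is an assumption; neither is confirmed by Cluely's source.

```typescript
// Hypothetical helper: the name and return shape are illustrative, not Cluely's API.
type Env = Record<string, string | undefined>;

interface K2Config {
  enabled: boolean;
  apiKey: string;
}

function loadK2Config(env: Env): K2Config {
  // Assumption: only the literal string "true" (any case) enables K2 Think.
  const enabled = (env.USE_K2_THINK ?? "false").trim().toLowerCase() === "true";
  // Trim the key so stray spaces in .env do not end up in the Authorization header.
  const apiKey = (env.K2_THINK_API_KEY ?? "").trim();
  if (enabled && apiKey === "") {
    throw new Error("USE_K2_THINK is true but K2_THINK_API_KEY is empty");
  }
  return { enabled, apiKey };
}
```

In a Node or Electron process this would typically be called as `loadK2Config(process.env)` once at startup.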
Default Model
Cluely uses the K2-Think-v2 model. The model name is fixed in the current implementation. Custom model selection will be added in future versions.
API Implementation
Endpoint
K2 Think uses this API endpoint:
Request Configuration
Headers
| Header | Value | Purpose |
|---|---|---|
| Authorization | Bearer {API_KEY} | Authentication |
| accept | application/json | Response format |
| Content-Type | application/json | Request format |
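The headers in the table can be assembled as in the sketch below. `buildK2Request` and its types are hypothetical names. The body assumes the OpenAI chat-completions shape (`model`, `messages`, `stream`), which follows from the "OpenAI-compatible" description but is not confirmed against the K2 Think service. The endpoint URL is deliberately left to the caller.

```typescript
// Hypothetical request builder; the function name and return shape are illustrative.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface K2Request {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildK2Request(apiKey: string, messages: ChatMessage[]): K2Request {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,   // authentication
      accept: "application/json",          // response format
      "Content-Type": "application/json",  // request format
    },
    // Assumption: the body follows the OpenAI chat-completions shape.
    // stream: false mirrors the non-streaming behaviour described on this page.
    body: JSON.stringify({ model: "K2-Think-v2", messages, stream: false }),
  };
}
```

The returned object has the shape `fetch` expects for its second argument, so it could be passed along with the endpoint URL as `fetch(endpointUrl, request)`.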
Local OCR for Images
K2 Think uses free local OCR (Tesseract.js) to extract text from images before analysis. This eliminates vision API costs.
How It Works
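The OCR-then-analyze flow can be sketched roughly as follows. All names here (`OcrFn`, `ChatFn`, `buildOcrPrompt`, `analyzeImage`) are hypothetical, and the prompt wording is an assumption; the sketch only illustrates that the image stays local and only extracted text is sent to the API.

```typescript
// Hypothetical types and names; none of these come from Cluely's source.
type OcrFn = (imagePath: string) => Promise<string>;
type ChatFn = (prompt: string) => Promise<string>;

// Combine the user's question with the OCR output into one text-only prompt.
function buildOcrPrompt(question: string, ocrText: string): string {
  const extracted = ocrText.trim();
  if (extracted === "") {
    throw new Error("OCR produced no text; the image may not contain readable text");
  }
  return `${question}\n\nText extracted from the image by local OCR:\n${extracted}`;
}

// Step 1 runs locally; only the extracted text reaches the remote API in step 2.
async function analyzeImage(
  imagePath: string,
  question: string,
  ocr: OcrFn,
  chat: ChatFn,
): Promise<string> {
  const ocrText = await ocr(imagePath);            // local OCR, no network needed
  return chat(buildOcrPrompt(question, ocrText));  // remote call, text only
}
```

With Tesseract.js, `ocr` could plausibly be `async (p) => (await Tesseract.recognize(p, "eng")).data.text`, and `chat` would wrap the K2 Think API call. Injecting both as parameters keeps the flow testable without a network connection.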
Benefits of Local OCR
Zero Vision Costs
No expensive vision API calls - OCR runs entirely on your machine
Privacy
Images processed locally before text is sent to API
Fast Processing
Tesseract.js is optimized for speed
Offline Capable
OCR works without internet (API call still requires connection)
OCR Accuracy
Use Cases
K2 Think V2 excels at:
Complex Problem Solving
- Multi-step reasoning tasks
- Technical analysis
- Algorithm design
- System architecture planning
Code Analysis
- Debug error messages (via OCR)
- Code review and optimization
- Architecture discussions
- Technical documentation
Research & Analysis
- Academic problem solving
- Mathematical reasoning
- Logical deduction
- Comparative analysis
Limitations
Image Analysis
- Good for: Text-heavy screenshots, code, error messages, documents
- Not ideal for: Charts, diagrams, photos, visual designs
Audio Processing
Audio features are not supported. Cluely uses Gemini for all voice functionality, even when K2 Think is enabled.
Streaming
The current implementation uses non-streaming responses:
Switching to K2 Think
At Startup
Set environment variables:
At Runtime
Switch from another provider:
Priority Order
K2 Think has the highest priority when enabled:
Testing Connection
Verify K2 Think is configured correctly:
Troubleshooting
K2 Think Not Activating
Issue: Still using Gemini or another provider despite configuration
Solutions:
- Verify USE_K2_THINK=true is set in .env
- Check K2_THINK_API_KEY is not empty
- Restart Cluely after updating .env
- Check console for: [LLMHelper] Using K2 Think V2...
API Key Errors
Error: "K2 Think API key is not configured" or "K2 Think API key is required"
Solutions:
- Verify API key is set in .env
- Check for extra spaces or quotes
- Ensure key is valid and active
- Test with: llmHelper.testConnection()
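The "extra spaces or quotes" check can be automated with a small diagnostic like the one below. `findApiKeyProblems` is a hypothetical helper, not part of Cluely's LLMHelper, and it only inspects the raw string; it cannot tell whether the key is valid or active.

```typescript
// Hypothetical diagnostic; not part of Cluely's LLMHelper.
function findApiKeyProblems(raw: string | undefined): string[] {
  const problems: string[] = [];
  if (raw === undefined || raw.trim() === "") {
    problems.push("key is missing or empty");
    return problems;
  }
  const trimmed = raw.trim();
  if (raw !== trimmed) {
    problems.push("key has leading or trailing whitespace");
  }
  // Catches values such as "abc" or 'abc' where the quotes became part of the key.
  if (/^["'].*["']$/.test(trimmed)) {
    problems.push("key is wrapped in quotes");
  }
  return problems;
}
```

An empty array means no formatting problem was found; the key may still be rejected by the service.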
OCR Not Working
Issue: Images not being analyzed properly
Solutions:
- Ensure screenshots are high-resolution
- Check console for: [LLMHelper] Starting Local OCR...
- Verify Tesseract.js is installed: check package.json dependencies
- Try with clear, text-heavy screenshots first
API Errors
Error: K2 Think API error: 401/403/500
Solutions:
- 401 Unauthorized: Check API key is correct
- 403 Forbidden: Verify account has access to K2-Think-v2 model
- 500 Server Error: K2 Think service may be down; try again later
- Rate Limiting: Wait before retrying
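The status codes above can be mapped to the same advice in code. This is a sketch: both function names are hypothetical, and treating 429 as the rate-limit status is an assumption based on HTTP convention, since this page does not say which code K2 Think returns.

```typescript
// Hypothetical helper mapping an HTTP status to the troubleshooting advice.
function explainK2Status(status: number): string {
  switch (status) {
    case 401:
      return "Unauthorized: check that the API key is correct";
    case 403:
      return "Forbidden: verify the account has access to K2-Think-v2";
    case 429:
      // Assumption: rate limiting is reported with the conventional 429 status.
      return "Rate limited: wait before retrying";
    case 500:
      return "Server error: the service may be down, try again later";
    default:
      return `Unexpected status ${status}`;
  }
}

// Rate limits and server errors may succeed on retry; auth errors need a config fix.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}
```

Separating "what went wrong" from "should I retry" keeps retry loops from hammering the API with a bad key.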
OCR Best Practices
Optimize Screenshots
- Use high resolution (1920x1080 or higher)
- Ensure good contrast between text and background
- Avoid heavy compression or artifacts
Content Type
Best for:
- Code snippets
- Error messages
- Text documents
- Console output
Not ideal for:
- Heavily styled text
- Cursive fonts
- Overlapping elements
Advanced Configuration
Custom System Prompts
K2 Think uses Cluely’s standard system prompt (source/electron/LLMHelper.ts:12-22):
Response Processing
K2 Think responses are cleaned of markdown formatting:
Comparison with Other Providers
| Feature | K2 Think V2 | Gemini | Ollama | OpenRouter |
|---|---|---|---|---|
| Reasoning | Advanced | Excellent | Good | Varies |
| Vision | OCR-based | Native | None | Model-dependent |
| Privacy | Cloud + Local OCR | Cloud | 100% Local | Cloud |
| Cost | API costs | API costs | Free | API costs |
| Speed | Moderate | Very Fast | Fast | Fast |
Next Steps
Provider Overview
Compare all AI providers
Gemini Setup
Add Gemini for native vision capabilities