Overview
AI completions in Glass:

- Real-time suggestions - Appear as you type
- Context-aware - Uses surrounding code and project structure
- Multi-line support - Suggests entire functions or blocks
- Multiple providers - Choose between Zed Cloud, Ollama, or custom endpoints
Code completions are also called “edit predictions” or “autocomplete” in the codebase.
Completion Providers
- Zed Predict
- Ollama
- Custom
Zed Predict (Cloud)
Glass’s built-in completion service:

- Zeta models - Purpose-built for code completion
- Fast responses - Optimized for low latency
- Context-aware - Uses project structure and recent changes
- Free tier - Limited completions per month
Configuration
Enable Completions
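A minimal settings sketch for turning completions on. The `edit_predictions.enabled` key is the one referenced in the Troubleshooting section below; treat the exact structure as an assumption:

```json
{
  "edit_predictions": {
    "enabled": true
  }
}
```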
Advanced Settings
Fine-tune completion behavior:

- Milliseconds to wait before requesting a completion
- Maximum number of completions to show
- Minimum characters before triggering a completion
- Show completions inline (ghost text)
- Show completions in a popup menu
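Put together, these knobs might look like the following settings fragment. Only `edit_predictions` and `debounce_ms` appear elsewhere on this page; the remaining key names are illustrative assumptions:

```json
{
  "edit_predictions": {
    "enabled": true,
    "debounce_ms": 75,
    "max_completions": 3,
    "min_characters": 2,
    "inline": true,
    "popup_menu": false
  }
}
```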
Usage
Accepting Completions
Inline Completions (Ghost Text)
Gray text appears as you type:

- Tab - Accept entire suggestion
- Cmd/Ctrl-Right - Accept word by word
- Escape - Dismiss suggestion
Keyboard Shortcuts
| Action | macOS | Linux/Windows |
|---|---|---|
| Accept completion | tab | tab |
| Accept word | cmd-right | ctrl-right |
| Next suggestion | cmd-] | ctrl-] |
| Previous suggestion | cmd-[ | ctrl-[ |
| Dismiss | escape | escape |
| Trigger manually | cmd-space | ctrl-space |
How It Works
Context Gathering
Completions draw on multiple context sources: the current file, cursor position, recent edit history, and the file’s language.

Trigger Conditions

Completions are triggered when:

- Typing continues for several characters
- After specific syntax (e.g., `.`, `->`, `::`)
- When pausing after incomplete code
- Manually via `cmd-space` / `ctrl-space`
Ranking
Suggestions are ranked by:

- Relevance - How well they match context
- Confidence - Model certainty
- User patterns - Your coding style
- Recency - Recent similar code
Zed Predict (Cloud)
Glass’s hosted completion service:

Features

- Zeta models - Specialized for code completion
- Fast inference - Sub-100ms latency
- Smart caching - Reuses context across requests
- Privacy-focused - Code is not stored long-term
Setup
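A hedged sketch of selecting the cloud provider in settings; the `provider` key and its value are assumptions for illustration:

```json
{
  "edit_predictions": {
    "enabled": true,
    "provider": "zed"
  }
}
```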
Data Privacy
What data is sent?
- Current file content (truncated to relevant sections)
- Cursor position
- Recent edit history
- File type and language
- Project context (optional)
How is data used?
- Generate completions for your request
- Improve model quality (aggregated, anonymized)
- Never shared with third parties
- Automatically deleted after processing
Can I opt out?
Yes, switch to Ollama or another local provider.
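As a sketch, that switch might look like this in settings. The URL points at Ollama’s default local port (11434); the `provider`, `api_url`, and `model` key names are assumptions:

```json
{
  "edit_predictions": {
    "provider": "ollama",
    "api_url": "http://localhost:11434",
    "model": "codellama:7b"
  }
}
```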
Ollama Setup
Run completions locally with Ollama:

Install Ollama
Download from ollama.ai
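Once Ollama is installed, pulling and verifying a completion model uses the standard Ollama CLI:

```shell
# Download a code-completion model (see the table below for alternatives)
ollama pull codellama:7b

# Confirm the model is available locally
ollama list
```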
Recommended Models
| Model | Size | Quality | Speed | Best For |
|---|---|---|---|---|
| codellama:7b | 4GB | Good | Fast | General code |
| deepseek-coder:6.7b | 4GB | Better | Fast | Multi-language |
| starcoder:7b | 4GB | Good | Fast | Python, JS |
| codellama:13b | 8GB | Better | Medium | Quality over speed |
| deepseek-coder:33b | 19GB | Best | Slow | Best quality |
OpenAI-Compatible APIs
Use any OpenAI-compatible endpoint:

Configuration

Format for completion requests:

- `fim` - Fill-in-the-middle (default)
- `chat` - Chat completion format
- `raw` - Raw completion format
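A sketch of pointing completions at an OpenAI-compatible endpoint. The base URL is OpenAI’s real API endpoint, but the key names (`provider`, `api_url`, `api_key`, `format`) and the model name are assumptions for illustration:

```json
{
  "edit_predictions": {
    "provider": "custom",
    "api_url": "https://api.openai.com/v1",
    "api_key": "sk-...",
    "model": "gpt-4o-mini",
    "format": "chat"
  }
}
```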
Compatible Services
- OpenAI API - Official OpenAI endpoint
- Azure OpenAI - Microsoft Azure deployment
- Together AI - Hosted model inference
- Replicate - Model hosting platform
- Hugging Face - Inference API
- LM Studio - Local server
- Text Generation WebUI - Self-hosted
Language Support
Completions work best with:

| Language | Support |
|---|---|
| TypeScript | Excellent |
| Python | Excellent |
| Rust | Excellent |
| Go | Excellent |
| JavaScript | Excellent |
| Java | Good |
| C++ | Good |
| C# | Good |
Performance Tuning
Reduce Latency
Improve Quality
Balance Speed and Quality
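One way these trade-offs might map onto settings, sketched below. Apart from `debounce_ms`, every key name is an assumption; the comments mark which direction each knob pushes:

```jsonc
{
  "edit_predictions": {
    "debounce_ms": 50,              // lower = snappier, but more requests
    "max_context_lines": 200,       // less context = faster, lower quality
    "model": "deepseek-coder:6.7b"  // mid-size model balances speed and quality
  }
}
```

For best quality at the cost of latency, swap in a larger model such as `deepseek-coder:33b` from the table above and raise the context limit.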
Troubleshooting
No completions appearing
- Check `edit_predictions.enabled` is `true`
- Verify the provider is configured correctly
- Ensure the API key is valid (for cloud providers)
- Check Ollama is running (for local providers)
- Review logs: `cmd-shift-p` → “Open Logs”
Slow completions
- Try a smaller model (Ollama)
- Increase `debounce_ms`
- Check network latency (cloud providers)
- Reduce context size
- Use a local provider instead of cloud
Poor quality suggestions
- Use a larger/better model
- Provide more context (open related files)
- Adjust temperature/top_p parameters
- Try a different provider
Ollama connection failed
- Verify Ollama is running: `ollama list`
- Check the API URL in settings
- Ensure the model is pulled: `ollama pull [model]`
- Check firewall settings
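If `ollama list` works but the editor still cannot connect, querying the HTTP API directly can isolate the problem; Ollama serves it on port 11434 by default:

```shell
# List installed models over the HTTP API (default Ollama port)
curl http://localhost:11434/api/tags
```

An empty `models` array in the response means the server is reachable but no model has been pulled yet.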