Glass provides AI-powered code completions that suggest code as you type, helping you write code faster and with fewer errors.

Overview

AI completions in Glass:
  • Real-time suggestions - Appear as you type
  • Context-aware - Uses surrounding code and project structure
  • Multi-line support - Suggests entire functions or blocks
  • Multiple providers - Choose between Zed Cloud, Ollama, or custom endpoints
Code completions are also called “edit predictions” or “autocomplete” in the codebase.

Completion Providers

Zed Predict (Cloud)

Glass’s built-in completion service:
  • Zeta models - Purpose-built for code completion
  • Fast responses - Optimized for low latency
  • Context-aware - Uses project structure and recent changes
  • Free tier - Limited completions per month
Setup: Requires Zed account (automatic authentication)

Configuration

Enable Completions

{
  "edit_predictions": {
    "provider": "zed_cloud",
    "enabled": true
  }
}

Advanced Settings

Fine-tune completion behavior:
{
  "edit_predictions": {
    "enabled": true,
    "provider": "zed_cloud",
    "debounce_ms": 300,
    "max_completions": 3,
    "min_trigger_length": 3,
    "show_inline": true,
    "show_in_menu": true
  }
}
debounce_ms (number, default: 300)
Milliseconds to wait before requesting a completion.

max_completions (number, default: 3)
Maximum number of completions to show.

min_trigger_length (number, default: 3)
Minimum number of characters typed before triggering a completion.

show_inline (boolean, default: true)
Show completions inline as ghost text.

show_in_menu (boolean, default: true)
Show completions in the popup menu.
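Conceptually, debounce_ms behaves like a standard keystroke debounce: each keystroke resets a timer, and the completion request fires only once typing pauses. A minimal Python sketch of that behavior (illustrative only; the real scheduling lives in the Glass codebase):

```python
import threading

class Debouncer:
    """Delay an action until input has paused for wait_ms milliseconds."""

    def __init__(self, wait_ms, action):
        self.wait_s = wait_ms / 1000.0
        self.action = action
        self._timer = None

    def trigger(self, *args):
        # Every keystroke cancels the pending request and restarts the
        # clock, so the action fires at most once per typing pause.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.wait_s, self.action, args)
        self._timer.start()
```

With debounce_ms set to 300, five quick keystrokes produce a single completion request rather than five.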

Usage

Accepting Completions

Inline Completions (Ghost Text)

Gray text appears as you type:
  • Tab - Accept entire suggestion
  • Cmd/Ctrl-Right - Accept word by word
  • Escape - Dismiss suggestion
Inline suggestions preview the completion without committing.
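Word-by-word acceptance (Cmd/Ctrl-Right) splits the pending ghost text at the first word boundary: the leading word is inserted, and the rest remains as a suggestion. A hypothetical sketch of that split:

```python
import re

def accept_next_word(ghost_text):
    """Split a pending suggestion into the chunk to insert now and the
    remainder to keep as ghost text (first whitespace-delimited word)."""
    m = re.match(r"\s*\S+", ghost_text)
    if m is None:
        return ghost_text, ""
    return ghost_text[:m.end()], ghost_text[m.end():]
```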

Keyboard Shortcuts

Action               macOS       Linux/Windows
Accept completion    tab         tab
Accept word          cmd-right   ctrl-right
Next suggestion      cmd-]       ctrl-]
Previous suggestion  cmd-[       ctrl-[
Dismiss              escape      escape
Trigger manually     cmd-space   ctrl-space

How It Works

Context Gathering

Completions use multiple context sources:
1. Current File - Content before and after the cursor position
2. Recent Edits - Your recent changes in the file
3. Related Files - Semantically related files in the project
4. Project Structure - Directory layout and file organization
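As a rough illustration, the current-file context can be thought of as prefix/suffix windows around the cursor. The window size below is an assumed value for the sketch, not Glass's actual limit:

```python
def cursor_context(buffer, cursor, max_chars=2000):
    """Return the text windows just before and just after the cursor,
    truncated to max_chars on each side (illustrative window size)."""
    prefix = buffer[:cursor][-max_chars:]        # text before the cursor
    suffix = buffer[cursor:cursor + max_chars]   # text after the cursor
    return prefix, suffix
```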

Trigger Conditions

Completions are triggered when:
  • Typing continues for several characters
  • After specific syntax (e.g., ., ->, ::)
  • When pausing after incomplete code
  • Manually via cmd-space / ctrl-space
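The automatic conditions above can be sketched as a simple predicate. The token list and threshold mirror the min_trigger_length setting and the syntax examples given, but are an illustration rather than Glass's actual rules:

```python
TRIGGER_TOKENS = (".", "->", "::")

def should_trigger(typed, min_trigger_length=3):
    """Request a completion when the text ends in member-access syntax
    or enough characters have been typed (simplified sketch)."""
    if typed.endswith(TRIGGER_TOKENS):
        return True
    return len(typed) >= min_trigger_length
```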

Ranking

Suggestions are ranked by:
  1. Relevance - How well they match context
  2. Confidence - Model certainty
  3. User patterns - Your coding style
  4. Recency - Recent similar code
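One way to picture the ranking is a weighted blend of the four factors, each pre-scored in [0, 1]. The weights below are illustrative assumptions, not the model's real scoring:

```python
def rank_suggestions(candidates, weights=None):
    """Order completion candidates by a weighted sum of relevance,
    confidence, user-pattern match, and recency (hypothetical weights)."""
    weights = weights or {"relevance": 0.4, "confidence": 0.3,
                          "patterns": 0.2, "recency": 0.1}

    def score(c):
        return sum(w * c.get(k, 0.0) for k, w in weights.items())

    return sorted(candidates, key=score, reverse=True)
```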

Zed Predict (Cloud)

Glass’s hosted completion service:

Features

  • Zeta models - Specialized for code completion
  • Fast inference - Sub-100ms latency
  • Smart caching - Reuses context across requests
  • Privacy-focused - Code is not stored long-term

Setup

1. Sign In - Sign in to your Zed account (automatic if using Glass)
2. Enable - Completions are enabled by default
3. Usage Limits - The free tier includes a limited number of completions per month
4. Upgrade - Upgrade to a paid plan for unlimited completions

Data Privacy

When using Zed Predict, each request sends:
  • Current file content (truncated to relevant sections)
  • Cursor position
  • Recent edit history
  • File type and language
  • Project context (optional)

That data is used only to:
  • Generate completions for your request
  • Improve model quality (aggregated, anonymized)
It is never shared with third parties and is automatically deleted after processing.

Prefer to keep your code local? Switch to Ollama or another local provider:
{
  "edit_predictions": {
    "provider": "ollama"
  }
}

Ollama Setup

Run completions locally with Ollama:
1. Install Ollama - Download from ollama.ai, or use the install script:

   # macOS/Linux
   curl -fsSL https://ollama.ai/install.sh | sh

2. Pull a Model - Download a code model:

   ollama pull codellama:7b
   # or
   ollama pull deepseek-coder:6.7b
   # or
   ollama pull starcoder:7b

3. Configure Glass - Update settings:

   {
     "edit_predictions": {
       "provider": "ollama",
       "ollama": {
         "api_url": "http://localhost:11434",
         "model": "codellama:7b"
       }
     }
   }

4. Verify - Start typing code; completions should appear.
Model                Size  Quality  Speed   Best For
codellama:7b         4GB   Good     Fast    General code
deepseek-coder:6.7b  4GB   Better   Fast    Multi-language
starcoder:7b         4GB   Good     Fast    Python, JS
codellama:13b        8GB   Better   Medium  Quality over speed
deepseek-coder:33b   19GB  Best     Slow    Best quality

OpenAI-Compatible APIs

Use any OpenAI-compatible endpoint:

Configuration

{
  "edit_predictions": {
    "provider": "openai_compatible_api",
    "openai_compatible_api": {
      "api_url": "https://api.example.com/v1",
      "api_key": "your-api-key",
      "model": "your-model",
      "prompt_format": "fim"
    }
  }
}
prompt_format (string)
Format for completion requests:
  • fim - Fill-in-the-middle (default)
  • chat - Chat completion format
  • raw - Raw completion format
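To make the three formats concrete, here is a hypothetical sketch of how a request body might be shaped for each. The FIM sentinels follow the CodeLlama convention (`<PRE>`, `<SUF>`, `<MID>`); other models use different tokens, so this illustrates the idea rather than what Glass literally sends:

```python
def build_prompt(prefix, suffix, prompt_format="fim"):
    """Shape a completion request per prompt_format (illustrative)."""
    if prompt_format == "fim":
        # Fill-in-the-middle: the model completes between prefix and suffix.
        return f"<PRE> {prefix} <SUF>{suffix} <MID>"
    if prompt_format == "chat":
        # Chat format: wrap the task as a user message.
        return [{"role": "user",
                 "content": f"Complete the code at <CURSOR>:\n"
                            f"{prefix}<CURSOR>{suffix}"}]
    # "raw": plain left-to-right continuation from the prefix only.
    return prefix
```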

Compatible Services

  • OpenAI API - Official OpenAI endpoint
  • Azure OpenAI - Microsoft Azure deployment
  • Together AI - Hosted model inference
  • Replicate - Model hosting platform
  • Hugging Face - Inference API
  • LM Studio - Local server
  • Text Generation WebUI - Self-hosted

Language Support

Completions work best with:

  • Excellent support - TypeScript, JavaScript, Python, Rust, Go
  • Good support - Java, C++, C#
Other languages are supported but may have variable quality.

Performance Tuning

Reduce Latency

  • Use local models - Ollama eliminates network latency
  • Lower the debounce - Reduce debounce_ms for faster suggestions:
{ "edit_predictions": { "debounce_ms": 200 } }

Improve Quality

  • Use larger models - More capable models produce better results
  • Provide more context - Keep related files open

Balance Speed and Quality

{
  "edit_predictions": {
    "provider": "ollama",
    "ollama": {
      "model": "codellama:7b",
      "options": {
        "temperature": 0.2,
        "top_p": 0.9,
        "num_predict": 50
      }
    }
  }
}

Troubleshooting

Completions not appearing
  • Check that edit_predictions.enabled is true
  • Verify the provider is configured correctly
  • Ensure your API key is valid (for cloud providers)
  • Check that Ollama is running (for local providers)
  • Review logs: cmd-shift-p → “Open Logs”

Completions are slow
  • Try a smaller model (Ollama)
  • Increase debounce_ms
  • Check network latency (cloud providers)
  • Reduce context size
  • Use a local provider instead of the cloud

Poor-quality suggestions
  • Use a larger/better model
  • Provide more context (open related files)
  • Adjust the temperature/top_p parameters
  • Try a different provider

Ollama not working
  • Verify Ollama is running: ollama list
  • Check the API URL in settings
  • Ensure the model is pulled: ollama pull [model]
  • Check firewall settings
