/compress Command

Overview

The /compress command uses AI to generate a concise summary of your conversation history and replaces the full context with this summary. This reduces token usage while preserving important information, allowing you to continue longer conversations.

Usage

In Interactive Mode

qwen
> /compress

Alternative Names

The command can be invoked under either of these names:

/compress
/summarize

In Non-Interactive Mode

qwen --prompt "/compress"

What It Does

When you run /compress, the command:

Analyzes History: Reviews your full conversation context
Generates Summary: Creates a comprehensive but concise summary
Replaces Context: Swaps the detailed history with the summary
Preserves Continuity: Maintains enough context to continue the conversation
Reduces Tokens: Significantly decreases token count

How It Works

Before Compression

Tokens: 8,500 / 10,000

[Message 1] User: Help me build a React app
[Response 1] AI: Let me help you set up React...
[Message 2] User: Add routing
[Response 2] AI: Here's how to add React Router...
[Message 3] User: Add authentication  
[Response 3] AI: Let's implement authentication...
... (50 more messages)

After Compression

Tokens: 1,200 / 10,000

[Summary] Built a React application with:
- Basic project setup with TypeScript
- React Router for navigation (Home, About, Login pages)
- JWT-based authentication system
- Protected routes for authenticated users
- User profile page with edit capabilities

Next: Planning to add state management with Redux

Continuing After Compression

You: Add Redux for state management
AI: Based on our authentication system, let me add Redux...

The AI maintains context from the summary.

When to Use

Approaching Token Limits

> /stats
Tokens: 8,500 / 10,000

> /compress
Context compressed (8,500 → 1,200 tokens)

Long Conversations

After extended discussions:

qwen
> Help me build a full-stack application
...
# After 30+ messages
> /compress
> Now let's add the admin panel

Continuing Related Work

Preserve context while starting a new phase:

> /compress
> Now let's refactor what we built

Before Major Changes

Compress before requesting large code changes:

> /compress
> Rewrite the entire authentication system

Output Example

Interactive Mode

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚙️  Compressing conversation context...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✓ Context compressed successfully

  Original tokens:  8,524
  New tokens:       1,234
  Reduction:        85.5%
  
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
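The Reduction figure follows directly from the two token counts. A minimal sketch of the arithmetic, assuming the percentage is computed as (original - new) / original and rounded to one decimal place; `reduction_percent` is a hypothetical helper, not part of Qwen Code:

```shell
#!/bin/bash
# Hypothetical helper: percentage of tokens removed by a compression.
# Assumes reduction = (original - new) / original * 100, one decimal place.
reduction_percent() {
  local original=$1 new=$2
  awk -v o="$original" -v n="$new" 'BEGIN { printf "%.1f\n", (o - n) / o * 100 }'
}

reduction_percent 8524 1234   # prints 85.5, matching the example output above
```

With the counts from the example, (8524 - 1234) / 8524 is about 0.855, which is the 85.5% shown.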

JSON Output

qwen --prompt "/compress" --output-format json

{
  "type": "result",
  "isError": false,
  "summary": "Context compressed (8524 -> 1234)",
  "compression": {
    "originalTokenCount": 8524,
    "newTokenCount": 1234,
    "reductionPercent": 85.5,
    "compressionStatus": "success"
  }
}
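In a script, the compression block can gate the next step. The sketch below is a hypothetical helper that reads one field from output laid out exactly like the example above (one field per line). It is not a general JSON parser, and a dedicated JSON tool is a better fit if one is installed. The sample text is copied from the example rather than captured from a live run:

```shell
#!/bin/bash
# Hypothetical helper: read one scalar field from pretty-printed JSON
# that has one field per line (as in the example above).
json_field() {
  # $1 = field name, $2 = JSON text
  printf '%s\n' "$2" |
    sed -n "s/^[[:space:]]*\"$1\":[[:space:]]*//p" |
    sed 's/,$//; s/^"//; s/"$//' |
    head -n 1
}

# Sample output copied from the example; a real script would capture it from
#   qwen --prompt "/compress" --output-format json
result='{
  "type": "result",
  "isError": false,
  "compression": {
    "originalTokenCount": 8524,
    "newTokenCount": 1234,
    "compressionStatus": "success"
  }
}'

if [ "$(json_field compressionStatus "$result")" = "success" ]; then
  echo "compressed to $(json_field newTokenCount "$result") tokens"
else
  echo "compression did not succeed" >&2
fi
```

Checking `compressionStatus` rather than only the exit code lets the script stop before sending further prompts when nothing was compressed.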

Compression Quality

The compression algorithm preserves:

✅ Key Decisions: Important choices made during the conversation
✅ Code Structure: Architecture and implementation details
✅ Current State: What has been completed
✅ Next Steps: Planned or discussed next actions
✅ Context: Why certain approaches were chosen

What may be lost:

⚠️ Exact Wording: Specific phrasing of questions
⚠️ Failed Attempts: Solutions that didn’t work
⚠️ Full Code: Only key snippets are preserved
⚠️ Minor Details: Small clarifications or tangents

Compress vs Clear

# When continuing the same topic
qwen
> Build a REST API with Express
...
> /compress  # Preserve API context
> Now add authentication

Decision Matrix

Scenario	Use Compress	Use Clear
Continue same project	✅	❌
Switch to new project	❌	✅
Near token limit	✅	✅
Need fresh context	❌	✅
Preserve decisions	✅	❌
Remove sensitive data	❌	✅

Multiple Compressions

You can compress multiple times:

> /stats
Tokens: 9,000

> /compress
Tokens: 1,500

# Continue conversation...
# Much later:

> /stats  
Tokens: 8,000

> /compress
Tokens: 1,200

Each compression summarizes all context, including previous summaries.

Automatic Compression

Qwen Code can automatically compress when needed:

// settings.json
{
  "advanced": {
    "autoCompress": true,
    "autoCompressThreshold": 0.85  // At 85% of token limit
  }
}
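The threshold is a fraction of the token limit, so 0.85 with a 10,000-token limit corresponds to 8,500 tokens. A rough sketch of that comparison, assuming compression triggers once usage reaches the threshold (the exact rule Qwen Code applies is not stated here); `should_compress` is a hypothetical name:

```shell
#!/bin/bash
# Hypothetical check: has token usage reached the auto-compress threshold?
# Assumes "reached" means used / limit >= threshold.
should_compress() {
  # $1 = tokens used, $2 = token limit, $3 = threshold as a fraction (e.g. 0.85)
  awk -v used="$1" -v limit="$2" -v t="$3" 'BEGIN { exit !(used / limit >= t) }'
}

if should_compress 8500 10000 0.85; then
  echo "compress now"
else
  echo "below threshold"
fi
```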

With auto-compress enabled:

qwen
> Continue building features...
...
# Automatic notification:
⚙️ Context automatically compressed (8,500 → 1,200 tokens)

Failed Compression

If compression fails:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ Failed to compress chat history
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Try:
- Using /clear to start fresh
- Resuming a different session
- Checking your API connection

Common causes:

Network issues
API rate limits
Very short conversations (nothing to compress)

Integration Examples

CI/CD Pipeline

#!/bin/bash
# Long-running AI assistant script

qwen --prompt "Start building the project"

# After many steps
qwen --prompt "/compress" 

qwen --prompt "Continue with deployment"

Scheduled Compression

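#!/bin/bash
# Send /compress after every 20 messages

MESSAGE_COUNT=0

# Forward each line of user input to qwen; the loop ends at end of input
while IFS= read -r input; do
  printf '%s\n' "$input"
  MESSAGE_COUNT=$((MESSAGE_COUNT + 1))

  # After every 20th message, ask qwen to compress the context
  if [ $((MESSAGE_COUNT % 20)) -eq 0 ]; then
    echo "/compress"
  fi
done | qwen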

Performance Impact

Compression typically takes:

Small conversations (< 5,000 tokens): 2-5 seconds
Medium conversations (5,000-10,000 tokens): 5-10 seconds
Large conversations (> 10,000 tokens): 10-20 seconds

During compression:

The CLI shows a progress indicator
You cannot send new messages
Use Esc to cancel if needed

Best Practices

Compress Proactively

Don’t wait until you hit the token limit:

# Good: Compress at 80%
> /stats
Tokens: 8,000 / 10,000
> /compress

# Avoid: Waiting until 95%
> /stats  
Tokens: 9,500 / 10,000
> /compress  # May fail if too close to limit

Check Token Usage

Monitor tokens regularly:

> /stats

Compress when you’re over 70-80% of the limit.
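The 70-80% guideline can be checked in a script by deriving a usage percentage from the /stats line. A small sketch, assuming the line has the `Tokens: used / limit` shape shown in the examples on this page (the real /stats output may contain more than this); `usage_percent` is a hypothetical helper:

```shell
#!/bin/bash
# Hypothetical helper: turn "Tokens: 8,000 / 10,000" into a whole-number percentage.
# Assumes the "Tokens: used / limit" format used in this page's examples.
usage_percent() {
  printf '%s\n' "$1" |
    tr -d ',' |
    awk '{ printf "%d\n", $2 * 100 / $4 }'
}

percent=$(usage_percent "Tokens: 8,000 / 10,000")
if [ "$percent" -ge 70 ]; then
  echo "${percent}% used: consider running /compress"
fi
```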

Strategic Compression Points

Compress at natural breakpoints:

After completing a feature
Before starting a new phase
After resolving a complex issue
Before requesting major changes

Combine with Project Summary

For long-term projects, use both:

> /summary  # Create project summary file
> /compress # Compress conversation

This maintains project knowledge across sessions.

Troubleshooting

Nothing to Compress

No conversation found to summarize.

This means:

Conversation is too short (< 3 messages)
History was recently cleared
No meaningful content to summarize

Compression Too Aggressive

If important details are lost:

Make key information explicit:

> Important: We're using JWT tokens with refresh rotation
> /compress

Use project summary for permanent details:
```
> /summary
```

Repeated Compressions Degrading

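Each compression summarizes earlier summaries, so after many passes the quality of the retained context may degrade. Consider saving the project state and starting fresh: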

> /summary  # Save permanent project state
> /clear    # Start fresh with summary file

See Also

/clear Command: Clear conversation for a fresh start
/stats Command: Monitor token usage and statistics
Context Management: Understanding context and tokens
Session Management: Managing sessions and history
