Overview

The /compress command uses AI to generate a concise summary of your conversation history and replaces the full context with this summary. This reduces token usage while preserving important information, allowing you to continue longer conversations.

Usage

In Interactive Mode

qwen
> /compress

Alternative Names

The following aliases are available:
  • /compress
  • /summarize

In Non-Interactive Mode

qwen --prompt "/compress"

What It Does

When you run /compress, the command:
  1. Analyzes History: Reviews your full conversation context
  2. Generates Summary: Creates a comprehensive but concise summary
  3. Replaces Context: Swaps the detailed history with the summary
  4. Preserves Continuity: Maintains enough context to continue the conversation
  5. Reduces Tokens: Significantly decreases token count

How It Works

Before Compression

Tokens: 8,500 / 10,000

[Message 1] User: Help me build a React app
[Response 1] AI: Let me help you set up React...
[Message 2] User: Add routing
[Response 2] AI: Here's how to add React Router...
[Message 3] User: Add authentication  
[Response 3] AI: Let's implement authentication...
... (50 more messages)

After Compression

Tokens: 1,200 / 10,000

[Summary] Built a React application with:
- Basic project setup with TypeScript
- React Router for navigation (Home, About, Login pages)
- JWT-based authentication system
- Protected routes for authenticated users
- User profile page with edit capabilities

Next: Planning to add state management with Redux

Continuing After Compression

You: Add Redux for state management
AI: Based on our authentication system, let me add Redux...
The AI maintains context from the summary.

When to Use

Approaching Token Limits

> /stats
Tokens: 8,500 / 10,000

> /compress
Context compressed (8,500 → 1,200 tokens)
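The usage ratio in this example is 8,500 / 10,000 = 85%. As a rough sketch, the same figure can be derived from a /stats line in a script. Here usage_percent is a hypothetical helper, not a Qwen Code feature, and it assumes the "Tokens: used / limit" format shown in the example.

```shell
#!/bin/bash
# Hypothetical helper: print token usage as a whole percentage.
# Assumes input shaped like "Tokens: 8,500 / 10,000" (format taken from the example).
usage_percent() {
  printf '%s\n' "$1" |
    tr -d ',' |
    awk '$1 == "Tokens:" && $3 == "/" && $4 > 0 { printf "%d\n", int($2 * 100 / $4); found = 1 }
         END { exit !found }'
}

usage_percent "Tokens: 8,500 / 10,000"   # prints 85
```

The helper exits with a non-zero status when the line does not match, so a caller can tell "no data" apart from "0%".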

Long Conversations

After extended discussions:
qwen
> Help me build a full-stack application
...
# After 30+ messages
> /compress
> Now let's add the admin panel
Preserve context while starting a new phase:
> /compress
> Now let's refactor what we built

Before Major Changes

Compress before requesting large code changes:
> /compress
> Rewrite the entire authentication system

Output Example

Interactive Mode

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚙️  Compressing conversation context...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✓ Context compressed successfully

  Original tokens:  8,524
  New tokens:       1,234
  Reduction:        85.5%
  
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
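The reduction figure follows from the two token counts: (8,524 - 1,234) / 8,524 ≈ 85.5%. A minimal sketch of that arithmetic in shell, where reduction_percent is a hypothetical helper name used only for illustration:

```shell
#!/bin/bash
# Hypothetical helper: percentage reduction between two token counts.
reduction_percent() {
  local original=$1 new=$2
  # bash arithmetic is integer-only, so awk does the division.
  awk -v o="$original" -v n="$new" 'BEGIN {
    if (o <= 0) exit 1                      # avoid dividing by zero
    printf "%.1f\n", (o - n) / o * 100
  }'
}

reduction_percent 8524 1234   # prints 85.5
```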

JSON Output

qwen --prompt "/compress" --output-format json
{
  "type": "result",
  "isError": false,
  "summary": "Context compressed (8524 -> 1234)",
  "compression": {
    "originalTokenCount": 8524,
    "newTokenCount": 1234,
    "reductionPercent": 85.5,
    "compressionStatus": "success"
  }
}
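
In a script, the JSON form is easier to consume than the interactive banner. The sketch below assumes jq is installed and that the output has exactly the shape shown above; compression_counts is a hypothetical helper, not part of Qwen Code.

```shell
#!/bin/bash
# Hypothetical helper: read /compress JSON on stdin, print "original new".
# Assumes jq is installed and the JSON matches the example above.
# Usage: qwen --prompt "/compress" --output-format json | compression_counts
compression_counts() {
  # -e makes jq exit non-zero when select() filters everything out,
  # i.e. when isError is not false.
  jq -e -r 'select(.isError == false)
            | .compression
            | "\(.originalTokenCount) \(.newTokenCount)"'
}
```

Checking isError before reading the counts keeps a failed run from being mistaken for a successful one with missing fields.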

Compression Quality

The compression algorithm preserves:
  • Key Decisions: Important choices made during the conversation
  • Code Structure: Architecture and implementation details
  • Current State: What has been completed
  • Next Steps: Planned or discussed next actions
  • Context: Why certain approaches were chosen
What may be lost:
  • ⚠️ Exact Wording: Specific phrasing of questions
  • ⚠️ Failed Attempts: Solutions that didn’t work
  • ⚠️ Full Code: Only key snippets are preserved
  • ⚠️ Minor Details: Small clarifications or tangents

Compress vs Clear

# When continuing the same topic
qwen
> Build a REST API with Express
...
> /compress  # Preserve API context
> Now add authentication

Decision Matrix

Scenario    Use Compress    Use Clear
Continue same project
Switch to new project
Near token limit
Need fresh context
Preserve decisions
Remove sensitive data

Multiple Compressions

You can compress multiple times:
> /stats
Tokens: 9,000

> /compress
Tokens: 1,500

# Continue conversation...
# Much later:

> /stats  
Tokens: 8,000

> /compress
Tokens: 1,200
Each compression summarizes all context, including previous summaries.

Automatic Compression

Qwen Code can automatically compress when needed:
// settings.json
{
  "advanced": {
    "autoCompress": true,
    "autoCompressThreshold": 0.85  // At 85% of token limit
  }
}
With auto-compress enabled:
qwen
> Continue building features...
...
# Automatic notification:
⚙️ Context automatically compressed (8,500 → 1,200 tokens)
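With a threshold of 0.85 and a 10,000-token limit, compression would trigger once usage reaches roughly 8,500 tokens. The check below sketches only that comparison. The name should_compress is hypothetical, and how Qwen Code evaluates the threshold internally is not described here.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) when used tokens reach threshold * limit.
should_compress() {
  local used=$1 limit=$2 threshold=${3:-0.85}   # default mirrors the settings example
  awk -v u="$used" -v l="$limit" -v t="$threshold" 'BEGIN { exit !(u >= l * t) }'
}

if should_compress 9000 10000; then
  echo "compress now"
else
  echo "keep going"
fi
```

Passing the threshold as an optional third argument makes it easy to try a more conservative value such as 0.7.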

Failed Compression

If compression fails:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ Failed to compress chat history
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Try:
- Using /clear to start fresh
- Resuming a different session
- Checking your API connection
Common causes:
  • Network issues
  • API rate limits
  • Very short conversations (nothing to compress)
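Network issues and rate limits are often transient, so retrying after a pause may be enough. The following is a sketch, not documented behavior: compress_with_retry is a hypothetical wrapper, and it assumes qwen exits with a non-zero status when compression fails, which this page does not confirm.

```shell
#!/bin/bash
# Hypothetical wrapper: retry /compress with a doubling delay between attempts.
# Assumption: qwen returns a non-zero exit status when compression fails.
compress_with_retry() {
  local max_attempts=${1:-3} delay=${2:-2} attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if qwen --prompt "/compress"; then
      return 0
    fi
    if [ "$attempt" -lt "$max_attempts" ]; then
      echo "Compression failed (attempt $attempt of $max_attempts); retrying in ${delay}s" >&2
      sleep "$delay"
      delay=$((delay * 2))   # back off: 2s, 4s, 8s, ...
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

Calling compress_with_retry 3 2 would try up to three times, waiting 2 and then 4 seconds between attempts. Retrying will not help when the conversation is simply too short to compress.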

Integration Examples

CI/CD Pipeline

#!/bin/bash
# Long-running AI assistant script

qwen --prompt "Start building the project"

# After many steps
qwen --prompt "/compress" 

qwen --prompt "Continue with deployment"

Scheduled Compression

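#!/bin/bash
# Compress after every 20 messages

MESSAGE_COUNT=0

# Stop cleanly when input ends (read fails on EOF)
while IFS= read -r input; do
  # Forward the user's message to qwen
  printf '%s\n' "$input"
  MESSAGE_COUNT=$((MESSAGE_COUNT + 1))

  # Request compression once every 20 messages
  if [ $((MESSAGE_COUNT % 20)) -eq 0 ]; then
    echo "/compress"
  fi
done | qwen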

Performance Impact

Compression typically takes:
  • Small conversations (< 5,000 tokens): 2-5 seconds
  • Medium conversations (5,000-10,000 tokens): 5-10 seconds
  • Large conversations (> 10,000 tokens): 10-20 seconds
During compression:
  • The CLI shows a progress indicator
  • You cannot send new messages
  • Use Esc to cancel if needed

Best Practices

Don’t wait until you hit the token limit:
# Good: Compress at 80%
> /stats
Tokens: 8,000 / 10,000
> /compress

# Avoid: Waiting until 95%
> /stats  
Tokens: 9,500 / 10,000
> /compress  # May fail if too close to limit
Monitor tokens regularly:
> /stats
Compress when you’re over 70-80% of the limit.
Compress at natural breakpoints:
  • After completing a feature
  • Before starting a new phase
  • After resolving a complex issue
  • Before requesting major changes
For long-term projects, use both /summary and /compress:
> /summary  # Create project summary file
> /compress # Compress conversation
This maintains project knowledge across sessions.

Troubleshooting

Nothing to Compress

No conversation found to summarize.
This means:
  • Conversation is too short (< 3 messages)
  • History was recently cleared
  • No meaningful content to summarize

Compression Too Aggressive

If important details are lost:
  1. Make key information explicit:
    > Important: We're using JWT tokens with refresh rotation
    > /compress
    
  2. Use project summary for permanent details:
    > /summary
    

Repeated Compressions Degrading

After many compressions, quality may degrade. Consider:
> /summary  # Save permanent project state
> /clear    # Start fresh with summary file

See Also

  • /clear Command: Clear conversation for a fresh start
  • /stats Command: Monitor token usage and statistics
  • Context Management: Understanding context and tokens
  • Session Management: Managing sessions and history