Testing Guide

Qwen Code uses a comprehensive testing strategy with unit tests, integration tests, and benchmarks to ensure code quality and reliability.

Overview

The project includes:
  • Unit Tests: Fast, isolated tests for individual components
  • Integration Tests: End-to-end tests validating complete workflows
  • Benchmarks: Performance tests for terminal operations
  • Sandbox Matrix: Tests across different sandboxing configurations

Test Framework

Vitest

All tests use Vitest as the test runner:
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';

describe('MyComponent', () => {
  beforeEach(() => {
    // Setup
  });

  afterEach(() => {
    // Cleanup
  });

  it('should do something', () => {
    expect(result).toBe(expected);
  });
});

Configuration

Test configuration in vitest.config.ts:
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'node',
    coverage: {
      provider: 'v8',
      reporter: ['text', 'json', 'html'],
    },
  },
});

Unit Tests

Running Unit Tests

# Run all unit tests
npm run test

# Run tests for specific package
npm run test --workspace=packages/core
npm run test --workspace=packages/cli

# Run tests in watch mode
npm run test -- --watch

# Run tests with coverage
npm run test -- --coverage

Test File Location

Unit tests are co-located with source files:
packages/core/src/
├── tools/
│   ├── read-file.ts
│   ├── read-file.test.ts       # Unit test
│   ├── write-file.ts
│   └── write-file.test.ts
├── config/
│   ├── config.ts
│   └── config.test.ts
└── ...

Writing Unit Tests

Basic Test Structure:
// packages/core/src/tools/my-tool.test.ts
import { describe, it, expect, beforeEach } from 'vitest';
import { MyTool } from './my-tool.js';
import type { Config } from '../config/config.js';

describe('MyTool', () => {
  let tool: MyTool;
  let mockConfig: Config;

  beforeEach(() => {
    mockConfig = createMockConfig();
    tool = new MyTool(mockConfig);
  });

  describe('parameter validation', () => {
    it('should reject empty parameters', () => {
      const params = { required: '' };
      const error = tool.validateParams(params);
      expect(error).toBeTruthy();
      expect(error).toContain('must not be empty');
    });

    it('should accept valid parameters', () => {
      const params = { required: 'value' };
      const error = tool.validateParams(params);
      expect(error).toBeNull();
    });
  });

  describe('execution', () => {
    it('should execute successfully with valid params', async () => {
      const params = { required: 'value' };
      const invocation = tool.invoke(params);
      const result = await invocation.execute(
        new AbortController().signal,
      );

      expect(result.error).toBeUndefined();
      expect(result.llmContent).toContain('Success');
    });

    it('should handle errors gracefully', async () => {
      const params = { required: 'invalid' };
      const invocation = tool.invoke(params);
      const result = await invocation.execute(
        new AbortController().signal,
      );

      expect(result.error).toBeDefined();
      expect(result.error.type).toBe('EXECUTION_ERROR');
    });
  });
});

Mocking

File System Mocking

Use memfs or mock-fs for file system operations:
import { vi } from 'vitest';
import { vol } from 'memfs';

// Provide both named exports and a default export, since code may use either import style
vi.mock('fs/promises', () => ({ default: vol.promises, ...vol.promises }));

beforeEach(() => {
  vol.reset();
  vol.fromJSON({
    '/test/file.txt': 'content',
    '/test/dir/nested.js': 'code',
  });
});

HTTP Mocking

Use msw for HTTP request mocking:
import { http, HttpResponse } from 'msw';
import { setupServer } from 'msw/node';

const server = setupServer(
  http.get('https://api.example.com/data', () => {
    return HttpResponse.json({ data: 'mocked' });
  }),
);

beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());

Function Mocking

Use Vitest’s built-in mocking:
import { vi } from 'vitest';

// Mock a module
vi.mock('./module', () => ({
  functionName: vi.fn(() => 'mocked'),
}));

// Mock a function
const mockFn = vi.fn();
mockFn.mockReturnValue('result');
mockFn.mockResolvedValue('async result');

// Assertions
expect(mockFn).toHaveBeenCalled();
expect(mockFn).toHaveBeenCalledWith('arg');
expect(mockFn).toHaveBeenCalledTimes(2);

Test Utilities

Shared utilities in packages/test-utils:
import { createMockConfig } from '@qwen-code/test-utils';

const config = createMockConfig({
  targetDir: '/test',
  model: 'qwen-coder-plus',
});

Integration Tests

Overview

Integration tests validate end-to-end functionality by:
  • Building the project
  • Running the CLI binary
  • Executing real workflows
  • Verifying file system changes

Running Integration Tests

# Run all integration tests (all sandbox modes)
npm run test:integration:all

# Run E2E tests (no sandbox)
npm run test:e2e

# Run with specific sandbox mode
npm run test:integration:sandbox:none
npm run test:integration:sandbox:docker
npm run test:integration:sandbox:podman

Running Specific Tests

# Run specific test files
npm run test:e2e list_directory write_file

# Run single test by name
npm run test:e2e -- --test-name-pattern "reads a file"

# Run with verbose output
VERBOSE=true npm run test:e2e

# Keep test output for inspection
KEEP_OUTPUT=true npm run test:e2e

Test Structure

Integration tests are in the integration-tests/ directory:
integration-tests/
├── tools/
│   ├── read_file.test.js
│   ├── write_file.test.js
│   └── edit.test.js
├── commands/
│   ├── help.test.js
│   └── settings.test.js
├── sdk-typescript/
│   └── basic.test.js
└── test-utils.js

Writing Integration Tests

// integration-tests/tools/my_tool.test.js
import { describe, it, expect } from 'vitest';
import { runQwenCode, createTestProject } from '../test-utils.js';

describe('MyTool Integration', () => {
  it('should execute tool successfully', async () => {
    // Create test project
    const project = await createTestProject({
      'file.txt': 'content',
    });

    // Run Qwen Code with tool
    const result = await runQwenCode({
      cwd: project.dir,
      input: 'Use my_tool on file.txt',
      timeout: 30000,
    });

    // Verify output
    expect(result.stdout).toContain('Success');
    expect(result.exitCode).toBe(0);

    // Verify file changes
    const content = await project.readFile('file.txt');
    expect(content).toContain('expected');

    // Cleanup
    await project.cleanup();
  });
});
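The `createTestProject` helper above comes from `integration-tests/test-utils.js`. As a rough illustration of what such a helper does, here is a minimal, hypothetical sketch (the real implementation may differ):

```typescript
// Hypothetical sketch of a createTestProject-style helper.
// It writes the given files into a fresh temp directory and returns
// accessors for the test to read results and clean up afterwards.
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';

interface TestProject {
  dir: string;
  readFile(name: string): Promise<string>;
  cleanup(): Promise<void>;
}

async function createTestProject(
  files: Record<string, string>,
): Promise<TestProject> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'qwen-test-'));
  for (const [name, content] of Object.entries(files)) {
    const target = path.join(dir, name);
    // Create parent directories for nested paths like 'src/a.ts'
    await fs.mkdir(path.dirname(target), { recursive: true });
    await fs.writeFile(target, content);
  }
  return {
    dir,
    readFile: (name) => fs.readFile(path.join(dir, name), 'utf8'),
    cleanup: () => fs.rm(dir, { recursive: true, force: true }),
  };
}
```

Because each project lives in its own `mkdtemp` directory, integration tests can run in parallel without stepping on each other's files.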

Sandbox Matrix Testing

Tests run across different sandbox configurations:
  • None: No sandboxing (fastest)
  • Docker: Docker-based sandbox
  • Podman: Podman-based sandbox

This ensures compatibility across all environments.
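The per-mode npm scripts shown above can also be driven in a loop. A small sketch, written as a dry run that only prints each command (drop the `echo` to actually execute the matrix):

```shell
# Iterate the sandbox matrix using the per-mode npm scripts.
# Printed as a dry run; remove `echo` to run each mode for real.
for mode in none docker podman; do
  echo npm run "test:integration:sandbox:$mode"
done
```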

Benchmarks

Terminal Benchmarks

Performance tests for terminal operations:
# Run all benchmarks
npm run test:terminal-bench

# Run Oracle benchmarks (reference implementation)
npm run test:terminal-bench:oracle

# Run Qwen benchmarks
npm run test:terminal-bench:qwen

Writing Benchmarks

import { describe, bench } from 'vitest';

describe('Performance', () => {
  bench('operation', () => {
    // Code to benchmark
    performOperation();
  });

  bench('async operation', async () => {
    await performAsyncOperation();
  });
});

CI/CD Testing

Preflight Checks

Before submitting a PR, run the full check:
npm run preflight

This runs:
  1. Clean build artifacts
  2. Fresh npm ci install
  3. Code formatting check
  4. Linting (zero warnings)
  5. Full build
  6. Type checking
  7. All unit tests
  8. Script tests

CI Pipeline

GitHub Actions runs:
# .github/workflows/test.yml
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        node: ['20.19.0']
    steps:
      - run: npm ci
      - run: npm run lint:ci
      - run: npm run build
      - run: npm run typecheck
      - run: npm run test:ci

Test Coverage

Generating Coverage Reports

# Generate coverage
npm run test -- --coverage

# View HTML report
open coverage/index.html

Coverage Goals

  • Overall: >80% coverage
  • Critical paths: >90% coverage
  • New features: 100% coverage required
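These goals can also be enforced mechanically. Assuming Vitest 1.0+ with the `v8` provider configured earlier, `coverage.thresholds` in `vitest.config.ts` makes the coverage run fail when a metric drops below its target (the 80% values below match the overall goal; adjust per package as needed):

```typescript
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      // Fail `npm run test -- --coverage` if any metric drops below 80%.
      thresholds: {
        lines: 80,
        functions: 80,
        branches: 80,
        statements: 80,
      },
    },
  },
});
```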

Testing Best Practices

General Guidelines

  1. Fast Tests: Unit tests should run in milliseconds
  2. Isolated: Tests should not depend on each other
  3. Deterministic: Same input = same output
  4. Clear Names: Describe what is being tested
  5. Single Assertion: One concept per test
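Guidelines 2 and 3 usually reduce to never sharing mutable state between tests. A runner-free sketch of the pattern, using a hypothetical `createCounter` factory (in a Vitest suite you would build the fresh instance in `beforeEach`, as in the unit-test examples above):

```typescript
// Hypothetical example: a factory gives every test its own instance,
// so tests stay isolated and deterministic regardless of run order.
interface Counter {
  increment(): number;
  value(): number;
}

function createCounter(): Counter {
  let n = 0;
  return {
    increment: () => ++n,
    value: () => n,
  };
}

// Two independent "tests": neither can observe the other's state.
const first = createCounter();
first.increment();
first.increment();

const second = createCounter(); // fresh state, unaffected by `first`
console.log(first.value(), second.value()); // 2 0
```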

Naming Conventions

// Good: Describes the scenario and expected outcome
it('should reject empty file paths', () => {});
it('should return error when file not found', () => {});
it('should execute tool successfully with valid params', () => {});

// Bad: Vague or implementation-focused
it('works', () => {});
it('test1', () => {});
it('calls validateParams()', () => {});

Test Organization

describe('ToolName', () => {
  describe('constructor', () => {
    // Constructor tests
  });

  describe('parameter validation', () => {
    it('should accept valid parameters', () => {});
    it('should reject invalid parameters', () => {});
  });

  describe('execution', () => {
    it('should execute successfully', () => {});
    it('should handle errors', () => {});
  });

  describe('edge cases', () => {
    it('should handle empty input', () => {});
    it('should handle large files', () => {});
  });
});

Async Testing

// Use async/await
it('should handle async operations', async () => {
  const result = await asyncFunction();
  expect(result).toBe('expected');
});

// Test timeouts
it('should timeout after 5 seconds', async () => {
  await expect(slowOperation()).rejects.toThrow('timeout');
}, 6000);

// Test abort signals
it('should cancel on abort', async () => {
  const controller = new AbortController();
  const promise = longRunningOperation(controller.signal);
  controller.abort();
  await expect(promise).rejects.toThrow('aborted');
});

Error Testing

it('should throw specific error', () => {
  expect(() => dangerousFunction()).toThrow('Expected error');
});

it('should handle async errors', async () => {
  await expect(asyncFunction()).rejects.toThrow('Error message');
});

it('should return error result', async () => {
  const result = await tool.execute(params);
  expect(result.error).toBeDefined();
  expect(result.error.type).toBe('SPECIFIC_ERROR');
});

Debugging Tests

Running Single Tests

# Run specific file
npm run test packages/core/src/tools/read-file.test.ts

# Run tests matching pattern
npm run test -- -t "read file"

# Run in watch mode
npm run test -- --watch packages/core/src/tools/

Debug Output

import { createDebugLogger } from '../utils/debugLogger.js';

const debugLogger = createDebugLogger('TEST');

it('should debug test', () => {
  debugLogger.debug('Test value:', value);
  // Test continues...
});

VS Code Debugging

Add to .vscode/launch.json:
{
  "type": "node",
  "request": "launch",
  "name": "Debug Tests",
  "runtimeExecutable": "npm",
  "runtimeArgs": ["run", "test", "--", "${file}"],
  "console": "integratedTerminal",
  "internalConsoleOptions": "neverOpen"
}

Troubleshooting

Common Issues

Tests Timeout:
// Increase timeout
it('slow test', async () => {
  // test code
}, 10000); // 10 seconds

Flaky Tests:
// Use proper cleanup
afterEach(async () => {
  await cleanup();
  vi.restoreAllMocks();
});

File System Issues:
// Ensure temp directory cleanup
import os from 'os';
import path from 'path';
import fs from 'fs/promises';

let tmpDir: string;

beforeEach(async () => {
  tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), 'test-'));
});

afterEach(async () => {
  await fs.rm(tmpDir, { recursive: true, force: true });
});

Next Steps