File Identification is HAI Build’s intelligent file discovery system. It combines fast indexing, fuzzy search, and context tracking to help the AI find exactly the files it needs for code generation.

How It Works

File Identification uses a multi-layer approach to file discovery:
  1. Workspace Indexing: Ripgrep scans your workspace to build a comprehensive file index.
  2. Active File Tracking: Monitors currently open tabs and recently edited files.
  3. Fuzzy Search: Uses the fzf library (the Fzf class in file-search.ts) for fuzzy matching across file paths and names.
  4. Context Prioritization: Ranks files based on relevance to the current task.

Core Features

Lightning-Fast Indexing

File Identification leverages ripgrep for blazing-fast file discovery:
// From src/services/search/file-search.ts:17
export async function executeRipgrepForFiles(
  workspacePath: string,
  limit: number = 5000,
): Promise<{ path: string; type: "file" | "folder"; label?: string }[]>
Performance characteristics:
  • Returns up to 5,000 entries by default (the limit parameter), typically within milliseconds
  • Follows symlinks automatically
  • Respects .gitignore patterns
  • Excludes common build directories (node_modules, dist, .git)
Ripgrep is typically much faster than traditional file search tools, which makes it well suited to large codebases.

Intelligent File Filtering

Automatic exclusion of irrelevant directories:
// Excluded patterns (src/services/search/file-search.ts:30)
const excludePatterns = [
  "node_modules",
  ".git",
  ".github",
  "out",
  "dist",
  "__pycache__",
  ".venv",
  ".env",
  "venv",
  "env",
  ".cache",
  "tmp",
  "temp"
]
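To illustrate how an exclusion list like the one above could reach ripgrep, here is a minimal sketch. The helper name buildRipgrepArgs is hypothetical, and the exact flags HAI Build passes are an assumption; --files, --follow, --hidden, and -g are standard ripgrep options.

```typescript
// Hypothetical helper: turns excluded directory names into ripgrep arguments.
// Which flags HAI Build really uses is an assumption made for this sketch.
function buildRipgrepArgs(excludeDirs: string[]): string[] {
  // --files lists files instead of searching contents,
  // --follow follows symlinks, --hidden includes dotfiles.
  const args = ["--files", "--follow", "--hidden"]
  for (const dir of excludeDirs) {
    // A leading "!" negates the glob, so anything under the directory is skipped.
    args.push("-g", `!**/${dir}/**`)
  }
  return args
}

console.log(buildRipgrepArgs(["node_modules", ".git", "dist"]).join(" "))
```

The resulting array can be handed to a child process spawned with the rg binary and the workspace path as the working directory.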

Active File Prioritization

Files you’re actively working on appear first:
// From src/services/search/file-search.ts:101
async function getActiveFiles(): Promise<Set<string>> {
  const request = GetOpenTabsRequest.create({})
  const response = await HostProvider.window.getOpenTabs(request)
  return new Set(response.paths)
}
Active files are:
  • Currently open in tabs
  • Recently edited
  • Visible in the editor
  • Part of the current diff view
The AI sees your open files first, making it context-aware of what you’re working on.
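One way to put active files first is a stable partition over the indexed items. This is a sketch under assumptions: prioritizeActiveFiles and FileItem are illustrative names, not part of the HAI Build source.

```typescript
interface FileItem {
  path: string
  type: "file" | "folder"
}

// Stable partition: active files move to the front, and both groups
// keep the relative order they had in the input.
function prioritizeActiveFiles(items: FileItem[], active: Set<string>): FileItem[] {
  const open = items.filter((item) => active.has(item.path))
  const rest = items.filter((item) => !active.has(item.path))
  return [...open, ...rest]
}

const ordered = prioritizeActiveFiles(
  [
    { path: "src/a.ts", type: "file" },
    { path: "src/b.ts", type: "file" },
  ],
  new Set(["src/b.ts"]),
)
console.log(ordered.map((item) => item.path))
```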

Fuzzy Search and Ranking

File Identification uses the fzf library for fuzzy matching and ranks files by several factors:
// From src/services/search/file-search.ts:159
const fzf = new fzfModule.Fzf(combinedItems, {
  selector: (item) => `${item.label} ${item.label} ${item.path}`,
  tiebreakers: [OrderbyMatchScore, fzfModule.byLengthAsc],
  limit: limit * 2,
})
Scoring criteria:
  1. Filename matches (weighted 2x)
  2. Full path matches
  3. Fewer gaps between matched characters
  4. Shorter paths preferred
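The scoring criteria above can be approximated without any library. The sketch below is not fzf's algorithm: it uses a greedy subsequence match and three sort keys (filename hit, gap count, path length) purely to illustrate the ordering. All names in it are hypothetical.

```typescript
interface Candidate {
  label: string // file name
  path: string // path relative to the workspace root
}

// Number of characters skipped between matched characters, or null when
// `query` is not a subsequence of `text`. Greedy leftmost matching, so the
// count is an upper bound rather than the true minimum.
function gapCount(query: string, text: string): number | null {
  const haystack = text.toLowerCase()
  let gaps = 0
  let last = -1
  for (const ch of query.toLowerCase()) {
    const idx = haystack.indexOf(ch, last + 1)
    if (idx === -1) return null
    if (last !== -1) gaps += idx - last - 1
    last = idx
  }
  return gaps
}

function rankCandidates(query: string, items: Candidate[]): Candidate[] {
  const scored: { item: Candidate; nameHit: boolean; gaps: number }[] = []
  for (const item of items) {
    const nameGaps = gapCount(query, item.label)
    const gaps = nameGaps ?? gapCount(query, item.path)
    if (gaps === null) continue // no match in name or path
    scored.push({ item, nameHit: nameGaps !== null, gaps })
  }
  return scored
    .sort(
      (a, b) =>
        Number(b.nameHit) - Number(a.nameHit) || // filename matches first
        a.gaps - b.gaps || // then fewer gaps
        a.item.path.length - b.item.path.length, // then shorter paths
    )
    .map((entry) => entry.item)
}
```

In the real snippet the filename weighting comes from repeating item.label in the selector string; here it is modelled as an explicit sort key instead.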

Multi-Root Workspace Support

File Identification seamlessly handles multi-root workspaces:
// From src/services/search/file-search.ts:213
export async function searchWorkspaceFilesMultiroot(
  query: string,
  workspaceManager: WorkspaceRootManager,
  limit: number = 20,
  selectedType?: "file" | "folder",
  workspaceHint?: string,
): Promise<Result[]>

Workspace Hints

Search a specific workspace root by prefixing the query with its name:
@frontend:/components
Searches only the frontend workspace.
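A hint of the form name:/query can be split from the rest of the mention with a small parser. This is a sketch; parseWorkspaceHint is a hypothetical helper, and the real parsing rules may differ.

```typescript
interface ParsedMention {
  workspaceHint?: string
  query: string
}

// Splits "@frontend:/components" into a workspace hint and a query.
// Mentions without a "name:/" prefix are returned as a plain query.
// Note: a Windows-style "C:/..." path would be read as a hint by this sketch.
function parseWorkspaceHint(mention: string): ParsedMention {
  const text = mention.startsWith("@") ? mention.slice(1) : mention
  const match = /^([^:/\s]+):\/(.*)$/.exec(text)
  if (!match) return { query: text }
  return { workspaceHint: match[1], query: match[2] }
}

console.log(parseWorkspaceHint("@frontend:/components"))
console.log(parseWorkspaceHint("@src/components/Button"))
```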

Parallel Search

Searches all roots simultaneously:
const searchPromises = workspacesToSearch.map(
  async (workspace) => {
    return await searchWorkspaceFiles(
      query,
      workspace.path,
      limit,
      selectedType,
      workspace.name
    )
  }
)
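The promises created above still need to be awaited and merged. A minimal sketch of that step, assuming a per-root search function is injected (SearchFn and searchAllRoots are illustrative names):

```typescript
interface Workspace {
  name: string
  path: string
}

interface SearchResult {
  path: string
  type: "file" | "folder"
  workspaceName: string
}

type SearchFn = (query: string, workspace: Workspace, limit: number) => Promise<SearchResult[]>

// Runs one search per root concurrently, then flattens the per-root
// result arrays in root order and trims to the requested limit.
async function searchAllRoots(
  query: string,
  workspaces: Workspace[],
  limit: number,
  search: SearchFn,
): Promise<SearchResult[]> {
  const perRoot = await Promise.all(workspaces.map((ws) => search(query, ws, limit)))
  const merged = ([] as SearchResult[]).concat(...perRoot)
  return merged.slice(0, limit)
}
```

Promise.all rejects as soon as any root fails; Promise.allSettled would be the alternative if one unreadable root should not discard results from the others.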

Deduplication

When the same relative path appears in more than one root, the label is prefixed with the workspace name:
if (pathCounts.get(result.path)! > 1) {
  return {
    ...result,
    label: `${result.workspaceName}:/${result.path}`
  }
}
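The fragment above relies on a pathCounts map. A self-contained sketch of how that map might be built and applied follows; labelDuplicates and RootedResult are assumed names.

```typescript
interface RootedResult {
  path: string
  workspaceName: string
  label?: string
}

// Counts how often each relative path occurs across roots, then prefixes
// the label with the workspace name only for paths that occur more than once.
function labelDuplicates(results: RootedResult[]): RootedResult[] {
  const pathCounts = new Map<string, number>()
  for (const result of results) {
    pathCounts.set(result.path, (pathCounts.get(result.path) ?? 0) + 1)
  }
  return results.map((result) =>
    (pathCounts.get(result.path) ?? 0) > 1
      ? { ...result, label: `${result.workspaceName}:/${result.path}` }
      : result,
  )
}
```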

Workspace Labels

Each result includes its workspace name:
{
  path: "src/components/Button.tsx",
  workspaceName: "frontend",
  type: "file"
}
See: src/services/search/file-search.ts:209

File Context Tracking

HAI Build tracks which files are in the AI’s context:

Context Status Indicators

  1. Added to Context: Files explicitly added via mentions or tool use.
  2. In Active Use: Files currently being read or edited by the AI.
  3. Removed from Context: Files removed during context compaction.

File Mention Tracking

Every file operation is tracked:
// From src/core/context/context-tracking/FileContextTracker.ts
export class FileContextTracker {
  async trackFileAccess(filePath: string, operation: 'read' | 'write' | 'list')
  async getAccessedFiles(): Promise<Set<string>>
  async clearAccessHistory(): Promise<void>
}
Tracked operations:
  • read_file: File content read
  • write_to_file: File created or modified
  • replace_in_file: File edited
  • list_files: Directory listing
  • search_files: File search results
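To make the tracking idea concrete, here is a minimal in-memory sketch that mirrors the three method names listed above. It is not the real FileContextTracker: InMemoryFileTracker is hypothetical, synchronous, and keeps nothing on disk.

```typescript
type Operation = "read" | "write" | "list"

// Hypothetical in-memory stand-in for a file access tracker.
class InMemoryFileTracker {
  private readonly accessed = new Map<string, Set<Operation>>()

  // Records that `operation` touched `filePath`.
  trackFileAccess(filePath: string, operation: Operation): void {
    const operations = this.accessed.get(filePath) ?? new Set<Operation>()
    operations.add(operation)
    this.accessed.set(filePath, operations)
  }

  // Returns every path seen so far, regardless of operation.
  getAccessedFiles(): Set<string> {
    return new Set(this.accessed.keys())
  }

  clearAccessHistory(): void {
    this.accessed.clear()
  }
}

const tracker = new InMemoryFileTracker()
tracker.trackFileAccess("src/App.tsx", "read")
tracker.trackFileAccess("src/App.tsx", "write")
console.log(tracker.getAccessedFiles().size)
```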

Context Warnings

When files are removed from context, a warning is appended to the next message:
// From src/core/task/index.ts:1248
const pendingContextWarning = 
  await this.fileContextTracker.retrieveAndClearPendingFileContextWarning()

if (pendingContextWarning && pendingContextWarning.length > 0) {
  const fileContextWarning = formatResponse.fileContextWarning(pendingContextWarning)
  newUserContent.push({
    type: "text",
    text: fileContextWarning,
  })
}
When context is compacted, the AI receives a warning about removed files to maintain awareness.
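The retrieve-and-clear pattern used above guarantees that each warning is delivered once. A minimal sketch of such a queue, with the assumed name PendingWarnings:

```typescript
// Hypothetical queue of file paths dropped from context.
class PendingWarnings {
  private removed: string[] = []

  // Remembers a removed file, ignoring repeats.
  noteRemoved(filePath: string): void {
    if (this.removed.indexOf(filePath) === -1) this.removed.push(filePath)
  }

  // Hands back the pending paths and resets the queue, so a second
  // call returns an empty array until new removals are noted.
  retrieveAndClear(): string[] {
    const pending = this.removed
    this.removed = []
    return pending
  }
}

const warnings = new PendingWarnings()
warnings.noteRemoved("src/old.ts")
console.log(warnings.retrieveAndClear())
console.log(warnings.retrieveAndClear())
```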

File Type Detection

File types are detected automatically. Each result's type is verified with fs.promises.lstat():
// From src/services/search/file-search.ts:168
const verifiedResultsPromises = filteredResults.map(
  async ({ item }) => {
    const fullPath = path.join(workspacePath, item.path)
    let type = item.type
    
    try {
      const stats = await fs.promises.lstat(fullPath)
      type = stats.isDirectory() ? "folder" : "file"
    } catch {
      // Keep original type if path doesn't exist
    }
    
    return { ...item, type }
  },
)
Parent directories are collected from each file path to build the directory tree:
// From src/services/search/file-search.ts:61
let dirPath = path.dirname(relativePath)
while (dirPath && dirPath !== "." && dirPath !== "/") {
  dirSet.add(dirPath)
  dirPath = path.dirname(dirPath)
}
Enables folder-level operations and navigation.
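The loop above walks upward with path.dirname. An equivalent result for forward-slash relative paths can be sketched without the path module by adding every prefix of each path; collectParentDirs is a hypothetical name.

```typescript
// Collects every ancestor directory of the given relative paths.
// Assumes "/" separators; Windows-style paths would need normalising first.
function collectParentDirs(relativePaths: string[]): Set<string> {
  const dirSet = new Set<string>()
  for (const relativePath of relativePaths) {
    const segments = relativePath.split("/")
    // Stop before the last segment, which is the file name itself.
    for (let i = 1; i < segments.length; i++) {
      const dir = segments.slice(0, i).join("/")
      if (dir) dirSet.add(dir)
    }
  }
  return dirSet
}

console.log(collectParentDirs(["src/components/Button.tsx", "README.md"]))
```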

Search Strategies

Different search approaches for different scenarios:

Exact Match

When you know the filename:
UserService.ts
Returns exact matches first.

Partial Match

When you remember part of the name:
user serv
Fuzzy matches across path segments.

Path-Based

When you know the directory:
components/auth
Searches full paths.

Extension Filter

When you need specific file types:
*.test.ts
Filters by extension pattern.
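How HAI Build evaluates an extension pattern is not shown here. One plausible reading is a glob applied to the file name, which can be sketched as follows; globToRegExp and filterByPattern are hypothetical helpers.

```typescript
// Converts a simple glob with "*" wildcards into an anchored RegExp.
// Only "*" is treated as a wildcard; every other character is literal.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
  return new RegExp("^" + escaped.replace(/\*/g, "[^/]*") + "$")
}

// Keeps the paths whose file name (last segment) matches the pattern.
function filterByPattern(pattern: string, paths: string[]): string[] {
  const matcher = globToRegExp(pattern)
  return paths.filter((p) => matcher.test(p.slice(p.lastIndexOf("/") + 1)))
}

console.log(filterByPattern("*.test.ts", ["src/Button.test.ts", "src/Button.ts"]))
```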

Integration with AI

Automatic File Suggestions

The AI uses File Identification to:
  1. Find relevant files for a task
  2. Suggest imports based on available modules
  3. Locate configuration files automatically
  4. Discover test files for testing tasks
  5. Navigate project structure efficiently

Context-Aware Mentions

File Identification powers the @ mention feature:
// Type @ in chat to trigger file search
@src/components/Button
// AI adds file to context
Use @ mentions to explicitly add files to context before describing your task.

Performance Optimization

Lazy Loading

Results are loaded incrementally:
  • Initial 20 results displayed immediately
  • More results loaded on scroll
  • Prevents UI blocking on large workspaces
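Incremental loading can be modelled as slicing an already ranked result list into pages. A sketch, assuming a page size of 20 to match the initial batch described above; getPage is an illustrative name.

```typescript
// Returns one page of results; an out-of-range page yields an empty array.
function getPage<T>(results: T[], pageIndex: number, pageSize: number = 20): T[] {
  const start = pageIndex * pageSize
  return results.slice(start, start + pageSize)
}

const allResults = Array.from({ length: 45 }, (_, i) => `file-${i}.ts`)
console.log(getPage(allResults, 0).length) // first batch shown immediately
console.log(getPage(allResults, 2).length) // final, partial batch
```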

Caching Strategy

File index is cached per workspace:
  • Cache invalidated on file system changes
  • Active files cache updated every 500ms
  • Search results cached for 5 seconds
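A time-based cache like the one described for search results can be sketched with a Map and an expiry timestamp. TtlCache is a hypothetical class; the clock is injectable so the expiry logic can be exercised without waiting.

```typescript
// Minimal time-to-live cache. Entries expire ttlMs after they are set.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>()
  private readonly ttlMs: number
  private readonly now: () => number

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key) // drop stale entries lazily on read
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }

  // Call on file system changes to invalidate everything at once.
  clear(): void {
    this.entries.clear()
  }
}

const searchCache = new TtlCache<string[]>(5000) // 5 seconds, as described above
searchCache.set("button", ["src/components/Button.tsx"])
console.log(searchCache.get("button"))
```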

Limits and Throttling

// Default limits (src/services/search/file-search.ts:18)
const DEFAULT_FILE_LIMIT = 5000
const DEFAULT_SEARCH_RESULTS = 20
const EXTENDED_RESULTS = 40  // For multi-root
Prevents memory issues on massive codebases.

Configuration

Custom Exclusions

Add workspace-specific exclusions:
// .vscode/settings.json
{
  "hai.fileSearch.exclude": [
    "**/build/**",
    "**/coverage/**",
    "**/*.log"
  ]
}

Search Behavior

Customize search parameters:
{
  "hai.fileSearch.limit": 10000,
  "hai.fileSearch.fuzzyThreshold": 0.3,
  "hai.fileSearch.followSymlinks": true
}
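Settings such as these are usually validated and merged with defaults before use. The sketch below shows one way to do that. resolveFileSearchConfig is hypothetical, the limit default of 5000 comes from DEFAULT_FILE_LIMIT, and the fallback fuzzy threshold is a placeholder assumption, not a documented default.

```typescript
interface FileSearchConfig {
  exclude: string[]
  limit: number
  fuzzyThreshold: number
  followSymlinks: boolean
}

// Placeholder only: the actual default threshold is not documented here.
const ASSUMED_FUZZY_THRESHOLD = 0.3

// Reads the hai.fileSearch.* keys from a raw settings object and falls
// back to defaults when a value is missing or has the wrong type.
function resolveFileSearchConfig(settings: Record<string, unknown>): FileSearchConfig {
  const exclude = settings["hai.fileSearch.exclude"]
  const limit = settings["hai.fileSearch.limit"]
  const threshold = settings["hai.fileSearch.fuzzyThreshold"]
  const follow = settings["hai.fileSearch.followSymlinks"]
  return {
    exclude: Array.isArray(exclude)
      ? exclude.filter((p): p is string => typeof p === "string")
      : [],
    limit: typeof limit === "number" && limit > 0 ? Math.floor(limit) : 5000,
    fuzzyThreshold:
      typeof threshold === "number"
        ? Math.min(1, Math.max(0, threshold)) // clamp into [0, 1]
        : ASSUMED_FUZZY_THRESHOLD,
    followSymlinks: typeof follow === "boolean" ? follow : true,
  }
}

console.log(resolveFileSearchConfig({ "hai.fileSearch.limit": 10000 }))
```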

Best Practices

File naming:
  • Clear names improve search accuracy
  • Include feature/component in filename
  • Use consistent naming conventions
  • Avoid generic names like utils.ts
Directory structure:
  • Group related files together
  • Use feature-based folder structure
  • Keep directory depth reasonable (3-5 levels)
  • Mirror logical application structure
Active files:
  • Open relevant files before starting AI tasks
  • Keep related files in tabs
  • Use split editor for context files
  • Close unrelated files to reduce noise
Large workspaces:
  • Use workspace hints in multi-root setups
  • Add irrelevant directories to exclusions
  • Increase file limit if needed
  • Consider breaking into multiple workspaces

Troubleshooting

If search does not return the file you expect:
  • Use more specific search terms
  • Include path segments in query
  • Adjust fuzzy search threshold
  • Try exact filename match

Next Steps

Inline Editing

Make quick edits to discovered files

AI-Powered Coding

See how the AI uses file context

Focus Chain

Track which files are being worked on

Multi-Root Workspaces

Work with multiple project roots
