Voice Commands provide quick, context-aware actions you can trigger through voice input, complementing the full Voice Agent experience.

Overview

While the Voice Agent provides conversational AI interaction, Voice Commands offer instant execution of predefined actions through simple voice triggers.

Global Voice Shortcuts

Primary Voice Controls

Voice Agent

Ctrl+Alt+J: Launch the full Voice Agent for conversational AI interaction

Voice Transcription

Ctrl+Alt+T: Toggle voice-to-text transcription mode

Cycle Transcribe Modes

Ctrl+Shift+T: Switch between paste, typewriter, and buffer modes

Stop Auto-typing

Ctrl+Shift+X: Immediately halt any ongoing AI typing
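As a rough sketch of how these global shortcuts map to actions, a keydown handler can resolve a key combination to one of the four controls. The action names below are illustrative, not Tabby's internal identifiers:

```typescript
// Hypothetical shortcut resolver; action names are illustrative.
type VoiceAction = "voiceAgent" | "transcription" | "cycleModes" | "stopTyping" | null;

interface KeyCombo {
  ctrl: boolean;
  alt: boolean;
  shift: boolean;
  key: string; // lowercase key value, e.g. "j"
}

function resolveShortcut(e: KeyCombo): VoiceAction {
  if (e.ctrl && e.alt && !e.shift && e.key === "j") return "voiceAgent";    // Ctrl+Alt+J
  if (e.ctrl && e.alt && !e.shift && e.key === "t") return "transcription"; // Ctrl+Alt+T
  if (e.ctrl && !e.alt && e.shift && e.key === "t") return "cycleModes";    // Ctrl+Shift+T
  if (e.ctrl && !e.alt && e.shift && e.key === "x") return "stopTyping";    // Ctrl+Shift+X
  return null; // not a voice shortcut
}
```

In a browser context this would be wired to a `keydown` listener that reads `event.ctrlKey`, `event.altKey`, `event.shiftKey`, and `event.key`.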

Built-in Voice Actions

These actions are available when the Voice Agent is active:

System Commands

Voice trigger: “Change the theme to [light/dark]”
Switches the Tabby interface between light and dark mode.
{
  name: 'changeBrowserTheme',
  type: 'function',
  description: 'Change the browser theme',
  parameters: {
    type: 'object',
    properties: {
      theme: {
        type: 'string',
        enum: ['light', 'dark']
      }
    },
    required: ['theme']
  }
}
Voice trigger: “End conversation” / “Hang up” / “Goodbye”
Gracefully ends the voice session and closes the agent.
{
  name: 'endConversation',
  type: 'function',
  description: 'End the current voice conversation',
  parameters: {
    type: 'object',
    properties: {},
    required: []
  }
}
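When the model emits one of these tool calls, the client needs to dispatch on the tool name and validate the arguments against the schema. The dispatcher below is a sketch under assumed types; only the tool names and the `theme` enum come from the definitions above:

```typescript
// Hypothetical dispatcher for the two built-in tools defined above.
// The surrounding result type is illustrative, not Tabby's actual API.
type ToolResult = { ok: boolean; message: string };

function handleToolCall(name: string, args: Record<string, unknown>): ToolResult {
  switch (name) {
    case "changeBrowserTheme": {
      const theme = args.theme;
      // Enforce the enum from the tool schema: only "light" or "dark".
      if (theme !== "light" && theme !== "dark") {
        return { ok: false, message: `Unsupported theme: ${String(theme)}` };
      }
      // In the real handler this would toggle the UI theme.
      return { ok: true, message: `Theme changed to ${theme}` };
    }
    case "endConversation":
      // In the real handler this would close the audio session.
      return { ok: true, message: "Conversation ended" };
    default:
      return { ok: false, message: `Unknown tool: ${name}` };
  }
}
```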

Desktop Automation Commands

When Windows MCP is enabled, these voice commands are available:
Open Applications
“Open Chrome” / “Launch VS Code” / “Start Notepad”
Uses Powershell-Tool to launch applications quickly.

Switch Applications
“Switch to Chrome” / “Focus on VS Code”
Uses App-Tool to activate running applications.

Close Applications
“Close Chrome” / “Quit Notepad”
Uses process management to terminate applications.
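As an illustration, a recognized desktop intent could be routed to the MCP tools named above. The tool names (Powershell-Tool, App-Tool) come from this page; the argument shapes are assumptions, not the actual Windows MCP API:

```typescript
// Illustrative routing only; argument shapes are assumed, not the real MCP schema.
type DesktopIntent = { action: "open" | "switch" | "close"; app: string };

function toMcpCall(intent: DesktopIntent): { tool: string; args: Record<string, string> } {
  switch (intent.action) {
    case "open":
      // Launch via PowerShell, per the docs' Powershell-Tool.
      return { tool: "Powershell-Tool", args: { command: `Start-Process "${intent.app}"` } };
    case "switch":
      // Focus a running app, per the docs' App-Tool.
      return { tool: "App-Tool", args: { name: intent.app } };
    case "close":
      // Terminate via process management (sketched here as a PowerShell call).
      return { tool: "Powershell-Tool", args: { command: `Stop-Process -Name "${intent.app}"` } };
  }
}
```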

Workflow Automation

Create custom voice-triggered workflows by training the Voice Agent:
1

Describe Your Workflow

Tell the agent about your routine:
“Every morning I open Slack, Gmail in Chrome, and VS Code with my project folder.”
2

Agent Stores as Memory

The agent saves this as a procedural memory with type PROCEDURAL.
3

Trigger Workflow

Next time, just say:
“Prep my workspace” / “Start my day” / “Set up my environment”
The agent executes the stored workflow automatically.

Example Workflows

User: "When I say 'dev mode', open VS Code in my projects folder, 
       start the terminal, and open Chrome with localhost:3000"

[Agent stores workflow]

User: "Dev mode"
[Agent executes all steps]
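Conceptually, a stored workflow is matched against the spoken trigger before executing its steps. The memory shape below is a sketch; only the PROCEDURAL type name appears on this page:

```typescript
// Illustrative shape for a stored workflow; only "PROCEDURAL" comes from the docs.
interface ProceduralMemory {
  type: "PROCEDURAL";
  triggers: string[]; // phrases that fire the workflow, e.g. ["dev mode"]
  steps: string[];    // ordered actions to execute
}

// Return the first workflow whose trigger phrase occurs in the utterance.
function findWorkflow(utterance: string, memories: ProceduralMemory[]): ProceduralMemory | null {
  const spoken = utterance.trim().toLowerCase();
  return memories.find(m => m.triggers.some(t => spoken.includes(t.toLowerCase()))) ?? null;
}
```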

Memory-Based Commands

The Voice Agent remembers context across sessions:
Example: After telling the agent your name is Sarah:
User: “What’s my name?”
Agent: “Your name is Sarah.”
The agent searches memories first before responding to any query.

Example: After expressing a preference:
User: “I prefer dark mode”
[Agent stores preference]
Later: “Set my theme”
[Agent applies dark mode based on stored preference]

Example: Sharing your tech stack:
User: “I work with React, TypeScript, and Tailwind CSS”
[Agent stores technical preferences]
Later: “Create a new component”
[Agent generates a React + TypeScript + Tailwind component]
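The "search memories first" behavior can be sketched as a lookup that runs before the model is queried; a null result means fall through to the LLM. Record shapes and names here are assumptions:

```typescript
// Illustrative memory-first lookup; record shape is assumed.
interface MemoryRecord {
  key: string;   // topic the fact is about, e.g. "name"
  value: string; // the stored fact, e.g. "Sarah"
}

// Return a stored fact if the query mentions its topic; otherwise null
// (signalling the caller to fall back to the language model).
function answerFromMemory(query: string, memories: MemoryRecord[]): string | null {
  const q = query.toLowerCase();
  const hit = memories.find(m => q.includes(m.key.toLowerCase()));
  return hit ? hit.value : null;
}
```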

Creating Custom Commands

You can extend voice commands by adding custom tools to the MCP server or creating new functions in the Voice Agent API.

Custom Tool Template

{
  type: 'function',
  name: 'customAction',
  description: 'Description of what this action does',
  parameters: {
    type: 'object',
    properties: {
      param1: {
        type: 'string',
        description: 'Description of parameter'
      }
    },
    required: ['param1']
  }
}
Add your tool to the DEFAULT_VOICE_TOOLS array in /lib/ai/voice/index.ts.

Tool Execution Handler

Implement the execution logic in the Voice Agent hook:
case "customAction":
  // Your custom logic here
  console.log(toolArgs.param1);
  // Perform action
  break;
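For context, the case above would typically sit inside a dispatcher along these lines. The function name and argument shape are illustrative, not the actual hook API:

```typescript
// Sketch of the surrounding dispatcher; names are illustrative.
function executeVoiceTool(toolName: string, toolArgs: { param1?: string }): string {
  switch (toolName) {
    case "customAction":
      // Your custom logic here; toolArgs carries the parameters
      // declared in the tool's JSON schema.
      return `customAction ran with ${toolArgs.param1}`;
    default:
      return `Unhandled tool: ${toolName}`;
  }
}
```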

Command Best Practices

For reliable voice command recognition:
  1. Be specific: “Open Chrome” vs “Open browser”
  2. Use natural language: “What’s the weather?” vs “Weather query”
  3. Avoid ambiguity: “Open my main project” (after teaching the agent which project)
  4. Build context: Let the agent learn your preferences over time
  5. Confirm actions: The agent will acknowledge and execute

Keyboard Shortcut Reference

Voice Agent: Ctrl+Alt+J
Transcription: Ctrl+Alt+T
Cycle Modes: Ctrl+Shift+T
Stop Typing: Ctrl+Shift+X
Actions Menu: Ctrl+\
Brain Panel: Ctrl+Shift+B

Troubleshooting

Microphone issues:
  • Check microphone permissions in browser settings
  • Verify the microphone is selected as the default input device
  • Test microphone access in the browser console: navigator.mediaDevices.getUserMedia({ audio: true })
  • Reduce background noise

Commands not executing:
  • Ensure the Windows MCP server is running (for desktop automation)
  • Check that API keys are properly configured
  • Verify the memory backend is accessible at localhost:8000
  • Review the browser console for error messages

Memory not persisting:
  • Confirm the memory backend is running
  • Check the Supabase connection for vector storage
  • Verify the Neo4j connection (if using the knowledge graph)
  • Try explicitly saying “Remember that” after sharing information

Requirements

Voice Commands require:
  • OpenAI API key (for Voice Agent)
  • Groq API key (for Transcription)
  • Microphone permissions
  • (Optional) Windows MCP server for desktop automation
  • (Optional) Memory backend for persistent context

Related Pages

Voice Agent: Full conversational AI voice assistant
Voice Transcription: Speech-to-text for typing in any app
Action Menu: Quick AI actions triggered by keyboard
Brain Panel: Memory dashboard and management
