Presentation Engine

The presentation engine is the heart of Handhold’s playback system. It synchronizes TTS audio, trigger execution, scene state transitions, and animations to create a cohesive learning experience.

Overview

The presentation engine coordinates:

Audio playback: TTS narration with word-level timing
Trigger execution: Commands fire at word positions
Scene state management: What’s visible on stage
Animation scheduling: Enter/exit effects for blocks
User controls: Play, pause, seek, skip, playback rate

Location: src/presentation/

Architecture

┌──────────────────────────────────┐
│      Presentation Component      │
└────────────────┬─────────────────┘
                 │
      ┌──────────┴─────────┐
      │                    │
      ▼                    ▼
┌───────────┐       ┌──────────────┐
│ Playback  │       │ State Store  │
│   Hook    │──────▶│  (Zustand)   │
└─────┬─────┘       └──────┬───────┘
      │                    │
      ▼                    ▼
┌───────────┐       ┌──────────────┐
│    TTS    │       │    Stage     │
│ (Howler)  │       │  (Renderer)  │
└───────────┘       └──────────────┘

Presentation Store

File: src/presentation/store.ts

A Zustand store holds all presentation state:

type PresentationState = {
  readonly lesson: ParsedLesson | null;
  readonly steps: readonly LessonStep[];
  readonly currentStepIndex: number;
  readonly status: 'idle' | 'playing' | 'paused';
  readonly playbackRate: number;
  readonly currentWordIndex: number;
  readonly sceneIndex: number;
  readonly completedStepIds: ReadonlySet<string>;
};

Key State Fields

lesson: The parsed lesson (from parser)
currentStepIndex: Current step (chapter) being played
status: Playback state (idle, playing, paused)
playbackRate: Speed multiplier (0.5x, 1x, 1.5x, 2x)
currentWordIndex: Current word in narration (synced to audio)
sceneIndex: Current scene (frame) in the step’s scene sequence
completedStepIds: Set of completed step IDs for progress tracking

Actions

type PresentationActions = {
  loadLesson: (opts: LoadLessonOpts) => void;
  play: () => void;
  pause: () => void;
  togglePlayPause: () => void;
  nextStep: () => void;
  prevStep: () => void;
  goToStep: (index: number) => void;
  setWordIndex: (index: number) => void;
  advanceScene: () => void;
  setSceneIndex: (index: number) => void;
  setPlaybackRate: (rate: number) => void;
  markStepComplete: (stepId: string) => void;
  reset: () => void;
};

Key actions:

loadLesson: Initializes the store with a parsed lesson
play / pause: Control playback
setWordIndex: Called by audio player to sync word position
advanceScene: Transition to next scene (triggered by word index)
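
The transitions behind a few of these actions can be sketched as pure functions over a trimmed-down state. This is a minimal sketch, not the store's actual code: `MiniState`, `stepCount`, the clamping at the first and last step, and the reset of the word and scene cursors are all assumptions made for illustration.

```typescript
// Hypothetical, trimmed-down view of the presentation state.
type MiniState = {
  readonly stepCount: number; // assumed stand-in for steps.length
  readonly currentStepIndex: number;
  readonly status: 'idle' | 'playing' | 'paused';
  readonly currentWordIndex: number;
  readonly sceneIndex: number;
};

// Jump to a step, clamping to the valid range and resetting per-step cursors.
function goToStep(state: MiniState, index: number): MiniState {
  const last = Math.max(0, state.stepCount - 1);
  const clamped = Math.min(Math.max(index, 0), last);
  return { ...state, currentStepIndex: clamped, currentWordIndex: 0, sceneIndex: 0 };
}

// Next and previous are just relative jumps.
const nextStep = (s: MiniState): MiniState => goToStep(s, s.currentStepIndex + 1);
const prevStep = (s: MiniState): MiniState => goToStep(s, s.currentStepIndex - 1);

// Flip between playing and paused; an idle presentation starts playing.
function togglePlayPause(state: MiniState): MiniState {
  const status: MiniState['status'] = state.status === 'playing' ? 'paused' : 'playing';
  return { ...state, status };
}
```

Keeping transitions pure like this makes them easy to unit test without mounting a component or creating a store.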

Playback Orchestration

File: src/presentation/use-playback.ts

The usePlayback hook manages audio playback and trigger execution.

Playback Flow

Load lesson

When a lesson is loaded, loadLesson() is called with the parsed lesson and options:

usePresentationStore.getState().loadLesson({
  lesson,
  initialStepIndex: 0,
  completedSlideIds: new Set(),
  onStepChange: (index) => {},
  onSlideComplete: (slideId) => {},
  onLessonComplete: () => {},
});

Prefetch TTS audio

React Query prefetches TTS audio for all narration blocks:

usePrefetchAllAudio(lesson);

This ensures smooth playback without waiting for synthesis.

User presses play

The play() action is dispatched:

usePresentationStore.getState().play();

Status changes to 'playing'.

Fetch current narration audio

The playback hook fetches TTS audio for the current step:

const { data } = useQuery({
  queryKey: ['tts', narrationText],
  queryFn: () => synthesize(narrationText),
});

Returns:

{
  audioBase64: string,
  wordTimings: Array<{
    word: string,
    wordIndex: number,
    startMs: number,
    endMs: number,
  }>,
}

Play audio

Audio is played using Howler.js:

const player = createAudioPlayer(audioBase64, {
  onPlay: () => {},
  onSeek: (position) => updateWordIndex(position),
});

player.play();

Track word position

A requestAnimationFrame loop polls the audio position and updates word index:

function updateWordIndex(positionMs: number) {
  const wordIndex = findWordAtPosition(positionMs, wordTimings);
  usePresentationStore.getState().setWordIndex(wordIndex);
}
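
The lookup inside `findWordAtPosition` is not shown in this document. One plausible way to write it is a binary search over the timings, sketched below. It assumes the timings are sorted by `startMs`, that a position in the gap between two words keeps the earlier word active, and that `-1` means playback has not reached the first word; none of these are confirmed by the source.

```typescript
type WordTiming = {
  readonly word: string;
  readonly wordIndex: number;
  readonly startMs: number;
  readonly endMs: number;
};

// Returns the wordIndex of the last word whose startMs is <= positionMs,
// or -1 if the position is before the first word. Runs in O(log n).
function findWordAtPosition(positionMs: number, timings: readonly WordTiming[]): number {
  let lo = 0;
  let hi = timings.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const timing = timings[mid];
    if (timing !== undefined && timing.startMs <= positionMs) {
      found = timing.wordIndex; // candidate; keep looking to the right
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}
```

Because the polling loop calls this on every frame, a logarithmic lookup keeps the per-frame cost small even for long narrations.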

Execute triggers

The event scheduler watches currentWordIndex and fires triggers:

useEffect(() => {
  const trigger = getNextTrigger(currentWordIndex);
  if (trigger) {
    executeTrigger(trigger);
  }
}, [currentWordIndex]);

Update scene state

Triggers update the scene state (visible blocks, focus, split mode, etc.):

function executeTrigger(trigger: TriggerVerb) {
  switch (trigger.verb) {
    case 'show':
      addBlockToStage(trigger.target);
      break;
    case 'hide':
      removeBlockFromStage(trigger.target);
      break;
    // ...
  }
}

Render stage

The Stage component reads scene state from the store and renders blocks:

<Stage>
  {sceneState.visible.map(blockName => (
    <VisualizationBlock key={blockName} name={blockName} />
  ))}
</Stage>

Narration ends

When audio completes, the playback hook auto-advances to the next step (if any):

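const player = createAudioPlayer(audioBase64, {
  // Fires once the narration clip has finished playing.
  onEnd: () => usePresentationStore.getState().nextStep(),
});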

Event Scheduling

File: src/presentation/event-scheduler.ts

Maps each trigger to a word position and schedules its execution.

Data Structure

type ScheduledEvent = {
  readonly wordIndex: number;
  readonly trigger: TriggerVerb;
};

Events are sorted by wordIndex.

Scheduling Algorithm

1. For each trigger in the narration:
   - Compute its word position (the number of spoken words that precede the trigger)
   - Store { wordIndex, trigger }
2. Sort events by word index
3. As playback progresses, check whether the current word index has passed any events
4. If so, execute their triggers
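
The first two steps above can be sketched as a single pass over the narration text. This is an illustrative sketch only: `buildSchedule`, `countWords`, `SimpleTrigger`, and the regular expression for `{{verb: target}}` markers are assumptions, and the real parser may tokenize words differently.

```typescript
// Hypothetical simplified trigger shape for this sketch.
type SimpleTrigger = { readonly verb: string; readonly target: string };
type SimpleEvent = { readonly wordIndex: number; readonly trigger: SimpleTrigger };

// Counts whitespace-separated words in a stretch of plain narration text.
function countWords(text: string): number {
  return text.split(/\s+/).filter((w) => w.length > 0).length;
}

// Walks the narration left to right, counting spoken words between markers,
// so each trigger is pinned to the index of the word that follows it.
function buildSchedule(narration: string): SimpleEvent[] {
  const marker = /\{\{\s*([\w-]+)\s*:\s*([^}]*?)\s*\}\}/g;
  const events: SimpleEvent[] = [];
  let wordCount = 0;
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = marker.exec(narration)) !== null) {
    wordCount += countWords(narration.slice(cursor, match.index));
    events.push({
      wordIndex: wordCount,
      trigger: { verb: match[1] ?? '', target: match[2] ?? '' },
    });
    cursor = match.index + match[0].length;
  }
  // A left-to-right scan already yields ascending order; the sort mirrors step 2.
  return events.sort((a, b) => a.wordIndex - b.wordIndex);
}
```

For a narration such as `{{show: code-a}} Here is some code. {{focus: line-1}} This line is important.`, this sketch yields one event at word index 0 and one at word index 4.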

Example

Narration:

{{show: code-a}} Here is some code. {{focus: line-1}} This line is important.

Word timings (by word index):

0: "Here"
1: "is"
2: "some"
3: "code"
4: "This"
5: "line"
6: "is"
7: "important"

Scheduled events:

[
  { wordIndex: 0, trigger: { verb: 'show', target: 'code-a' } },
  { wordIndex: 4, trigger: { verb: 'focus', target: 'line-1' } },
]

As word index advances:

Word 0: Execute show: code-a
Word 4: Execute focus: line-1
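
Because the word index is polled, it can jump over several words between two frames, so a scheduler should fire every event it has passed rather than only an exact match. The sketch below shows one way to do that with a cursor into the sorted event list. `createScheduler`, `advanceTo`, and `reset` are hypothetical names, not the scheduler's confirmed API.

```typescript
type Scheduled<T> = { readonly wordIndex: number; readonly trigger: T };

// Keeps a cursor into the sorted events so each one fires exactly once,
// even when the polled word index skips over intermediate words.
function createScheduler<T>(events: readonly Scheduled<T>[], fire: (trigger: T) => void) {
  let cursor = 0;
  return {
    // Fire every pending event whose wordIndex is <= the current word.
    advanceTo(wordIndex: number): void {
      while (cursor < events.length) {
        const next = events[cursor];
        if (next === undefined || next.wordIndex > wordIndex) break;
        fire(next.trigger);
        cursor += 1;
      }
    },
    // Rewind the cursor, for example after a backwards seek (an assumption here).
    reset(): void {
      cursor = 0;
    },
  };
}
```

With the two events from the example above, calling `advanceTo(0)` fires the show trigger, `advanceTo(2)` fires nothing, and `advanceTo(6)` fires the focus trigger even though word 4 was never observed directly.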

Scene State Management

File: src/presentation/resolve-scene-at.ts

Computes the scene state at a given word index.

Scene State Structure

type SceneState = {
  readonly visible: readonly string[];  // Block names
  readonly focused: string | null;  // Focused block/region
  readonly split: boolean;  // Split-screen mode
  readonly zoom: number;  // Zoom level
  readonly enterEffects: readonly SlotEnterEffect[];  // Animations
};

Resolving Scene State

Given a word index, replay all triggers up to that point:

function resolveSceneAt(step: LessonStep, wordIndex: number): SceneState {
  let state: SceneState = INITIAL_STATE;

  for (const event of getEventsUpTo(wordIndex)) {
    state = applyTrigger(state, event.trigger);
  }

  return state;
}

This enables seeking: jump to any word, resolve scene state, and render.
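
To make the replay idea concrete, here is a self-contained sketch with a reduced scene shape and only the show, hide, and focus verbs. `MiniScene`, `SceneEvent`, `EMPTY_SCENE`, and the exact effect of each verb (for example, that hiding a focused block clears the focus) are assumptions for illustration, not the engine's real rules.

```typescript
type Verb = 'show' | 'hide' | 'focus';
type SceneEvent = { readonly wordIndex: number; readonly verb: Verb; readonly target: string };
type MiniScene = { readonly visible: readonly string[]; readonly focused: string | null };

const EMPTY_SCENE: MiniScene = { visible: [], focused: null };

// Pure reducer: returns a new scene instead of mutating the previous one.
function applyTrigger(scene: MiniScene, ev: SceneEvent): MiniScene {
  switch (ev.verb) {
    case 'show':
      return scene.visible.indexOf(ev.target) !== -1
        ? scene
        : { ...scene, visible: [...scene.visible, ev.target] };
    case 'hide':
      return {
        visible: scene.visible.filter((b) => b !== ev.target),
        focused: scene.focused === ev.target ? null : scene.focused,
      };
    case 'focus':
      return { ...scene, focused: ev.target };
    default:
      return scene;
  }
}

// Replays every event at or before wordIndex, starting from the empty scene.
function resolveSceneAt(events: readonly SceneEvent[], wordIndex: number): MiniScene {
  return events
    .filter((ev) => ev.wordIndex <= wordIndex)
    .reduce(applyTrigger, EMPTY_SCENE);
}
```

Since the result depends only on the event list and the word index, seeking backwards and forwards always produces the same scene for the same position.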

Animation System

File: src/presentation/animation-variants.ts

Defines Motion animation variants for enter/exit effects.

Animation Variants

Each effect has enter/exit variants:

const fadeVariants = {
  enter: { opacity: 1, transition: { duration: 0.3 } },
  exit: { opacity: 0, transition: { duration: 0.2 } },
};

const slideVariants = {
  enter: { x: 0, transition: { type: 'spring', stiffness: 300 } },
  exit: { x: -100, transition: { duration: 0.2 } },
};
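
Variants like these can also be produced by a small builder that takes an effect name, a duration, and an easing. The sketch below is hypothetical: `buildVariants`, `AnimationSpec`, and the fallback easing are assumptions. The only library fact it relies on is that Motion names its built-in easings in camelCase (for example `easeOut`), so CSS-style names need mapping.

```typescript
type AnimationSpec = {
  readonly effect: 'fade' | 'slide';
  readonly durationS: number;
  readonly easing: string; // CSS-style name such as 'ease-out'
};

// CSS-style easing names mapped to Motion's named easings.
const EASINGS: Record<string, string> = {
  'linear': 'linear',
  'ease-in': 'easeIn',
  'ease-out': 'easeOut',
  'ease-in-out': 'easeInOut',
};

// Builds enter/exit variants sharing one transition.
// Unknown easing names fall back to 'easeOut' (an arbitrary choice for this sketch).
function buildVariants(spec: AnimationSpec) {
  const transition = {
    duration: spec.durationS,
    ease: EASINGS[spec.easing] ?? 'easeOut',
  };
  const shown = spec.effect === 'slide' ? { x: 0 } : { opacity: 1 };
  const hidden = spec.effect === 'slide' ? { x: -100 } : { opacity: 0 };
  return {
    enter: { ...shown, transition },
    exit: { ...hidden, transition },
  };
}
```

The returned object has the same enter/exit shape as the hand-written variants, so it could be passed to a `variants` prop in the same way.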

Custom Animations

Triggers can override default animations:

{{show: code-block slide 0.5s ease-out}}

Parsed to:

{
  verb: 'show',
  target: 'code-block',
  animation: {
    kind: 'custom',
    effect: 'slide',
    durationS: 0.5,
    easing: 'ease-out',
  },
}
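
A parser for this form could look like the sketch below. The grammar is inferred from the single example above and is an assumption: tokens after the target are taken to be effect, duration, and easing in that order, durations end in `s` or `ms`, and the fallback values (0.3 seconds, `ease-out`) are invented for illustration.

```typescript
type CustomAnimation = {
  readonly kind: 'custom';
  readonly effect: string;
  readonly durationS: number;
  readonly easing: string;
};
type ParsedTrigger = {
  readonly verb: string;
  readonly target: string;
  readonly animation?: CustomAnimation;
};

// Converts "0.5s" or "300ms" to seconds; returns the fallback if unparseable.
function toSeconds(token: string | undefined, fallback: number): number {
  const m = /^(\d*\.?\d+)(ms|s)$/.exec(token ?? '');
  if (m === null) return fallback;
  const value = parseFloat(m[1] ?? '0');
  return m[2] === 'ms' ? value / 1000 : value;
}

// Parses "{{verb: target [effect] [duration] [easing]}}" (assumed grammar).
function parseTrigger(source: string): ParsedTrigger | null {
  const match = /^\{\{\s*([\w-]+)\s*:\s*(.+?)\s*\}\}$/.exec(source.trim());
  if (match === null) return null;
  const verb = match[1] ?? '';
  const [target = '', effect, duration, easing] = (match[2] ?? '').split(/\s+/);
  if (effect === undefined) return { verb, target }; // no animation override
  return {
    verb,
    target,
    animation: {
      kind: 'custom',
      effect,
      durationS: toSeconds(duration, 0.3),
      easing: easing ?? 'ease-out',
    },
  };
}
```

Applied to `{{show: code-block slide 0.5s ease-out}}`, the sketch produces the object shown above; a trigger without extra tokens, such as `{{hide: code-a}}`, comes back with no `animation` field.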

The stage applies this animation:

<motion.div
  initial="exit"
  animate="enter"
  exit="exit"
  variants={getVariants(animation)}
>
  <VisualizationBlock />
</motion.div>

Stage Rendering

File: src/presentation/Stage.tsx

The stage is the main rendering surface for visualizations.

Stage Layout

Default mode: Single block centered
Split mode: Multiple blocks side-by-side

Rendering Logic

export function Stage() {
  const sceneState = usePresentationStore((s) => getCurrentScene(s));

  return (
    <div className="stage">
      <AnimatePresence mode="popLayout">
        {sceneState.visible.map((blockName) => (
          <motion.div
            key={blockName}
            variants={getEnterEffect(blockName)}
            initial="exit"
            animate="enter"
            exit="exit"
          >
            <VisualizationBlock name={blockName} />
          </motion.div>
        ))}
      </AnimatePresence>
    </div>
  );
}

AnimatePresence: Handles enter/exit animations for blocks.
VisualizationBlock: Renders the appropriate component (Code, Data, Diagram, etc.) based on block type.

User Controls

File: src/presentation/Controls.tsx

Provides playback controls:

Play/Pause: Toggle playback
Skip forward/backward: Jump to next/previous step
Seek bar: Scrub through narration
Playback rate: 0.5x, 1x, 1.5x, 2x
Step navigation: Jump to specific step

Seek Behavior

When user seeks:

1. Pause audio
2. Compute word index for new position
3. Update currentWordIndex
4. Resolve scene state at new word index
5. Render new scene
6. Resume playback at new position

function handleSeek(positionMs: number) {
  const wordIndex = findWordAtPosition(positionMs, wordTimings);
  usePresentationStore.getState().setWordIndex(wordIndex);
  audioPlayer.seek(positionMs / 1000);
}
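
The snippet above covers only the word-index update and the audio seek. A fuller sketch of the listed sequence, with the collaborators injected so it can be tested without audio, might look like this. `createSeekHandler` and `SeekDeps` are hypothetical, and clamping the position and resuming only if playback was active before the seek are assumptions.

```typescript
// Hypothetical collaborators injected into the seek handler.
type SeekDeps = {
  readonly pause: () => void;
  readonly play: () => void;
  readonly seekAudio: (positionS: number) => void; // Howler seeks in seconds
  readonly findWord: (positionMs: number) => number;
  readonly setWordIndex: (index: number) => void; // store update drives scene resolve + render
  readonly wasPlaying: () => boolean;
  readonly durationMs: number;
};

function createSeekHandler(deps: SeekDeps) {
  return function handleSeek(positionMs: number): void {
    // Keep the requested position inside the clip.
    const clamped = Math.min(Math.max(positionMs, 0), deps.durationMs);
    const resume = deps.wasPlaying();
    deps.pause();
    deps.setWordIndex(deps.findWord(clamped));
    deps.seekAudio(clamped / 1000);
    if (resume) deps.play();
  };
}
```

Injecting the player and store functions keeps the ordering logic in one place and lets a test assert the exact call sequence.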

Narration Text Display

File: src/presentation/NarrationText.tsx

Displays narration text with word-level highlighting.

Highlighting Current Word

export function NarrationText() {
  const narration = usePresentationStore((s) => getCurrentNarration(s));
  const currentWordIndex = usePresentationStore((s) => s.currentWordIndex);

  return (
    <div>
      {narration.words.map((word, index) => (
        <span
          key={index}
          className={index === currentWordIndex ? 'highlight' : ''}
        >
          {word}{' '}
        </span>
      ))}
    </div>
  );
}

The highlighted word updates in sync with audio.

TTS Integration

Location: src/tts/

Synthesis

File: src/tts/synthesize.ts

Invokes the backend TTS command:

export async function synthesize(text: string) {
  return await invoke<{
    audioBase64: string;
    wordTimings: WordTiming[];
  }>('synthesize', { text });
}

Audio Playback

File: src/tts/audio-player.ts

Wraps Howler.js:

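import { Howl } from 'howler';

// Wraps a Howl instance so callers never touch Howler directly.
export function createAudioPlayer(audioBase64: string, callbacks: Callbacks) {
  const sound = new Howl({
    // The synthesized WAV is passed inline as a base64 data URI.
    src: [`data:audio/wav;base64,${audioBase64}`],
    onplay: callbacks.onPlay,
    onpause: callbacks.onPause,
    onend: callbacks.onEnd,
  });

  return {
    play: () => sound.play(),
    pause: () => sound.pause(),
    // Howler seeks in seconds, not milliseconds.
    seek: (positionS: number) => sound.seek(positionS),
    rate: (rate: number) => sound.rate(rate),
  };
}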

Prefetching

File: src/tts/use-prefetch-tts.ts

Prefetches TTS audio for all narration blocks using React Query:

export function usePrefetchAllAudio(lesson: ParsedLesson) {
  const queryClient = useQueryClient();

  useEffect(() => {
    // Warm the cache for every narration block, keyed by its text.
    lesson.steps.forEach((step) => {
      step.narration.forEach((block) => {
        queryClient.prefetchQuery({
          queryKey: ['tts', block.text],
          queryFn: () => synthesize(block.text),
        });
      });
    });
  }, [lesson, queryClient]);
}

This ensures smooth playback without synthesis delays.
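
The reason prefetching pays off is that requests are keyed by narration text, so the later fetch during playback reuses the prefetched result instead of synthesizing again. The sketch below imitates that behaviour with a plain promise cache. It is a stand-in to illustrate the idea, not React Query's implementation, and `createTtsCache` is a hypothetical name.

```typescript
// Caches one in-flight or settled promise per narration text.
function createTtsCache<T>(synthesizeFn: (text: string) => Promise<T>) {
  const cache = new Map<string, Promise<T>>();
  return {
    fetch(text: string): Promise<T> {
      const hit = cache.get(text);
      if (hit !== undefined) return hit; // reuse prefetched or in-flight request
      const pending = synthesizeFn(text);
      cache.set(text, pending);
      // Drop failed requests so a later call can retry.
      pending.catch(() => cache.delete(text));
      return pending;
    },
    size: (): number => cache.size,
  };
}
```

Calling `fetch` twice with the same text returns the same promise and invokes the synthesizer once, which is the property the prefetch step relies on.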

Performance Optimizations

React Query caching: TTS audio cached per narration text
Lazy rendering: Visualization blocks only render when visible
RequestAnimationFrame: Word position updates run at most once per rendered frame (typically 60fps)
Debounced seeks: Avoid rapid scene recomputation during scrubbing

Future Enhancements

Subtitle track: Display narration text with word highlighting (already implemented)
Keyboard shortcuts: Space to play/pause, arrow keys to skip
Chapter markers: Visual markers for each step on seek bar
Auto-advance: Option to auto-advance to next lesson
Accessibility: Keyboard navigation, screen reader support

Next Steps

Parser: Understand how markdown becomes IR
Frontend: Explore the React frontend
Backend: Dive into the Rust backend
Architecture: High-level system overview
