POST /memory/add_image

Add an image-based memory with optional text context. This endpoint is designed for screen capture and visual memory storage. The API can automatically classify the memory type based on the provided context.

Request Body

image_url (string, required)
Base64-encoded image data URL or remote image URL. Used for storing visual context like screenshots.

user_id (string, required)
Unique identifier for the user. All memories are associated with this user ID.

context (string)
Optional text context describing the image. Used for classification and search relevance. For example: “Working on authentication bug in React component”.

metadata (object)
Optional metadata to attach to the memory. Defaults include {"source": "screen_capture"}. You can add custom fields or manually specify memory_type.

auto_classify (boolean, default: true)
Enable automatic memory type classification based on the context. When enabled and context is provided, the API uses GPT-4.1-nano to classify the memory.
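
The request body can be assembled in a few lines of Python. This is a minimal sketch, not the project's own client code: `to_data_url` and `build_add_image_payload` are hypothetical helper names, and the PNG signature bytes stand in for a real screenshot. Only the field names documented in this section are used.

```python
import base64
import json
from typing import Any, Optional


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a Base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_add_image_payload(
    image_url: str,
    user_id: str,
    context: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
    auto_classify: bool = True,
) -> dict[str, Any]:
    """Build the JSON body for POST /memory/add_image (hypothetical helper)."""
    if not image_url or not user_id:
        raise ValueError("image_url and user_id are required")
    payload: dict[str, Any] = {
        "image_url": image_url,
        "user_id": user_id,
        "auto_classify": auto_classify,
    }
    # Optional fields are only sent when the caller provides them.
    if context is not None:
        payload["context"] = context
    if metadata is not None:
        payload["metadata"] = metadata
    return payload


# The first 8 bytes of every PNG file, used here as stand-in image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

payload = build_add_image_payload(
    image_url=to_data_url(PNG_SIGNATURE),
    user_id="user_123",
    context="Working on authentication bug in React component",
)
print(json.dumps(payload, indent=2))
```

Leaving `metadata` out of the payload lets the server apply its documented default rather than having the client guess at it.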

Response

success (boolean)
Indicates whether the memory was successfully added.

result (object)
Details about the created memory.

classified_type (string)
The automatically classified memory type if auto_classify was enabled. One of: LONG_TERM, SHORT_TERM, EPISODIC, SEMANTIC, or PROCEDURAL.

Example Request

curl -X POST http://localhost:8000/memory/add_image \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg...",
    "user_id": "user_123",
    "context": "Screenshot of VS Code showing authentication implementation",
    "metadata": {
      "source": "screen_capture",
      "timestamp": "2025-03-03T10:30:00Z"
    },
    "auto_classify": true
  }'
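
For callers who prefer Python over curl, the same request can be prepared with the standard library. This is a sketch under stated assumptions: the base URL is taken from the curl example, `make_add_image_request` and `send` are hypothetical helper names, and the remote image URL is a placeholder. The request is built but not sent, since sending requires a running server.

```python
import json
import urllib.request

# Assumption: the server runs locally on port 8000, as in the curl example.
BASE_URL = "http://localhost:8000"


def make_add_image_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST /memory/add_image request."""
    return urllib.request.Request(
        url=f"{base_url}/memory/add_image",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = make_add_image_request({
    "image_url": "https://example.com/screenshot.png",  # placeholder remote image URL
    "user_id": "user_123",
    "context": "Screenshot of VS Code showing authentication implementation",
    "metadata": {"source": "screen_capture", "timestamp": "2025-03-03T10:30:00Z"},
    "auto_classify": True,
})
print(request.get_method(), request.full_url)

# Sending needs a live server, so the call is left commented out here.
# response_body = send(request)
```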

Example Response

{
  "success": true,
  "result": {
    "id": "mem_abc123",
    "memory": "User is working on authentication implementation in VS Code",
    "event": "ADD"
  },
  "classified_type": "SHORT_TERM"
}
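
A client will usually want to check the response before using it. The sketch below parses the example response shown above and validates `classified_type` against the five documented values. `AddImageResult` and `parse_add_image_response` are hypothetical names. The keys inside `result` (`id`, `memory`, `event`) come from the example only, so they are treated as optional, and the code assumes `classified_type` may be missing when classification did not run.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The five memory types listed in the Response section.
MEMORY_TYPES = {"LONG_TERM", "SHORT_TERM", "EPISODIC", "SEMANTIC", "PROCEDURAL"}


@dataclass
class AddImageResult:
    """Hypothetical container for the fields shown in the example response."""
    success: bool
    memory_id: Optional[str]
    memory: Optional[str]
    event: Optional[str]
    classified_type: Optional[str]


def parse_add_image_response(body: str) -> AddImageResult:
    """Parse a JSON response body and validate classified_type if present."""
    data = json.loads(body)
    result = data.get("result") or {}
    classified = data.get("classified_type")
    # Assumption: classified_type can be absent when classification did not run.
    if classified is not None and classified not in MEMORY_TYPES:
        raise ValueError(f"unexpected classified_type: {classified!r}")
    return AddImageResult(
        success=bool(data.get("success")),
        memory_id=result.get("id"),
        memory=result.get("memory"),
        event=result.get("event"),
        classified_type=classified,
    )


example_body = """
{
  "success": true,
  "result": {
    "id": "mem_abc123",
    "memory": "User is working on authentication implementation in VS Code",
    "event": "ADD"
  },
  "classified_type": "SHORT_TERM"
}
"""

parsed = parse_add_image_response(example_body)
print(parsed.memory_id, parsed.classified_type)
```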

Use Cases

Screen Capture

Automatically capture and store screenshots with visual context during coding sessions or interviews.

Visual Documentation

Store images of diagrams, whiteboards, or UI mockups with searchable context.

Bug Tracking

Capture error screenshots with context for future debugging reference.

Learning Journal

Save visual examples from tutorials or documentation with annotations.

Implementation Details

The endpoint processes images using GPT-4.1-nano with vision capabilities enabled. The image is analyzed alongside any provided text context to extract meaningful facts that can be searched later.

Source: backend/main.py:263-305
