
read_file

Read document content from a DOCX file or Google Doc. Output is token-limited to approximately 14k tokens by default and includes pagination metadata (has_more, next_offset). Use offset and limit to page through long documents.
The default toon format is compact and agent-friendly. It renders inline formatting as short tags — <b>bold</b>, <i>italic</i>, <u>underline</u>, <highlighting>highlighted</highlighting>, <a href="...">link</a> — so you can see both content and structure without parsing raw XML. Switch to json for structured output or simple for plain paragraph text.
Hints: readOnly — does not modify the document.
file_path
string
Path to the DOCX file. Provide either file_path or google_doc_id.
google_doc_id
string
Google Doc ID or URL. Extract the ID from the URL: docs.google.com/document/d/{ID}/edit.
offset
number
1-based paragraph offset for pagination. Negative values count from the end of the document.
limit
number
Maximum number of paragraphs to return. When omitted, output is token-limited to approximately 14k tokens and includes has_more and next_offset in the response.
node_ids
string[]
Fetch specific paragraphs by their _bk_* IDs instead of a contiguous range.
format
string
Output format. One of toon (default, compact with inline tags), json (structured), or simple (plain text, no tags).
show_formatting
boolean
When true (default), includes inline formatting tags (<b>, <i>, <u>, <highlighting>, <a>). Set to false to emit plain text with no inline tags.
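Because google_doc_id accepts either a bare ID or a full URL, extracting the ID yourself is optional. When a normalized ID is useful (for logging or caching, say), the URL shape described above can be parsed as in this minimal sketch. The helper name extract_google_doc_id is hypothetical and not part of the tool.

```python
import re

# Follows the URL shape documented above: docs.google.com/document/d/{ID}/edit
_DOC_URL = re.compile(r"docs\.google\.com/document/d/([^/?#]+)")


def extract_google_doc_id(value: str) -> str:
    """Return the ID from a Google Docs URL, or the input itself if it is not such a URL.

    Hypothetical helper for illustration; the tool accepts either form directly.
    """
    match = _DOC_URL.search(value)
    return match.group(1) if match else value.strip()
```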

Pagination example

When the response contains "has_more": true, pass the returned next_offset value as the offset parameter in your next call to continue reading.
{
  "has_more": true,
  "next_offset": 201
}
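
The paging loop described above can be sketched as follows. This is an illustration under assumptions: call_tool is a hypothetical callable standing in for whatever client you use to invoke tools, and the response is assumed to arrive as a parsed dictionary. Only has_more, next_offset, file_path, and offset come from this page.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def read_all_pages(call_tool: ToolCaller, file_path: str,
                   max_pages: int = 100) -> List[Dict[str, Any]]:
    """Collect read_file responses until has_more is false or max_pages is reached.

    call_tool is a hypothetical client function: (tool_name, arguments) -> response dict.
    """
    pages: List[Dict[str, Any]] = []
    args: Dict[str, Any] = {"file_path": file_path}
    for _ in range(max_pages):  # hard cap so a misbehaving response cannot loop forever
        response = call_tool("read_file", args)
        pages.append(response)
        if not response.get("has_more"):
            break
        # Continue from where the previous page stopped.
        args = {"file_path": file_path, "offset": response["next_offset"]}
    return pages
```

The sketch keeps each raw response rather than merging content, since the shape of the content field depends on the chosen format.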

grep

Search paragraphs using regular expressions. Use file_path for session-based search, file_paths for stateless multi-file search, or google_doc_id for Google Docs.
Hints: readOnly — does not modify any documents.
file_path
string
Path to the DOCX file for session-based search.
google_doc_id
string
Google Doc ID or URL. Extract the ID from the URL: docs.google.com/document/d/{ID}/edit.
file_paths
string[]
Multiple file paths for stateless multi-file search. No session is created when using this parameter.
pattern
string
A single regex pattern to search for.
patterns
string[]
Multiple regex patterns. Matches any paragraph that satisfies at least one pattern.
case_sensitive
boolean
When true, the search is case-sensitive. Defaults to false.
whole_word
boolean
When true, matches only whole words.
max_results
number
Maximum number of results to return.
context_chars
number
Number of surrounding characters to include in each result snippet.
dedupe_by_paragraph
boolean
When true, returns at most one result per paragraph even if multiple matches exist.
search_xml
boolean
When true, searches the raw XML (word/document.xml) instead of paragraph text. Useful for inspecting markup.
include_context
boolean
When false, skips document view context such as list labels and headers, giving faster results. Defaults to true.
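
To make the matching options concrete, here is a small local emulation of how patterns, case_sensitive, and whole_word could combine. It is not the tool's implementation: the function name is hypothetical, and treating whole_word as regex \b word boundaries is an assumption.

```python
import re
from typing import List, Sequence


def matching_paragraphs(paragraphs: Sequence[str], patterns: Sequence[str],
                        case_sensitive: bool = False,
                        whole_word: bool = False) -> List[int]:
    """Return indices of paragraphs that satisfy at least one pattern.

    Illustrative emulation only; whole_word is modeled with \\b boundaries (an assumption).
    """
    flags = 0 if case_sensitive else re.IGNORECASE  # search is case-insensitive by default
    compiled = []
    for pattern in patterns:
        source = rf"\b(?:{pattern})\b" if whole_word else pattern
        compiled.append(re.compile(source, flags))
    # Each paragraph is reported once, however many patterns or matches it has.
    return [i for i, text in enumerate(paragraphs)
            if any(rx.search(text) for rx in compiled)]
```

With the defaults, a pattern such as seller would also hit "Resellers"; setting whole_word to true narrows it to the standalone word.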

has_tracked_changes

Check whether the document body contains tracked-change markers, including insertions, deletions, moves, and property-change records.
Hints: readOnly — does not modify the document.
file_path
string
required
Path to the DOCX file.
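
For intuition about what such a check involves, the sketch below scans the body of word/document.xml for common WordprocessingML revision elements. It is an independent illustration, not the tool's implementation. The function names are hypothetical and the element list is a reasonable subset rather than an exhaustive one.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Common revision markers: insertions, deletions, moves, property changes.
# Assumed subset for illustration; the tool may check more (or fewer) elements.
TRACKED_TAGS = {
    W + name
    for name in (
        "ins", "del", "moveFrom", "moveTo",
        "rPrChange", "pPrChange", "sectPrChange",
        "tblPrChange", "trPrChange", "tcPrChange",
    )
}


def xml_has_tracked_changes(document_xml: bytes) -> bool:
    """Return True if the w:body subtree contains any known revision element."""
    root = ET.fromstring(document_xml)
    body = root.find(W + "body")
    if body is None:
        return False
    return any(element.tag in TRACKED_TAGS for element in body.iter())


def docx_has_tracked_changes(path: str) -> bool:
    """Open a DOCX (a zip archive) and inspect its main document part."""
    with zipfile.ZipFile(path) as archive:
        return xml_has_tracked_changes(archive.read("word/document.xml"))
```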

get_file_status

Return file and session metadata for an open document, including edit count, normalization stats, and cache information.
Hints: readOnly — does not modify any documents.
file_path
string
Path to the DOCX file. Provide either file_path or google_doc_id.
google_doc_id
string
Google Doc ID or URL. Extract the ID from the URL: docs.google.com/document/d/{ID}/edit.
