Overview
The Notion Download module is a build-time script that fetches content from Notion databases and generates MDX files with YAML frontmatter. It implements incremental sync by comparing timestamps to avoid unnecessary re-downloads.
Main Function
downloadPostsAsMdx()
Fetches Notion posts and writes them as MDX files to the content directory.
async function downloadPostsAsMdx(
collection: 'projects' | 'blog'
): Promise<void[]>
Parameters:
collection ('projects' | 'blog', required): The content collection to sync (determines which database to query)
Returns:
A promise that resolves to an array once every post has been processed
Database Mapping:
'blog' → NOTION_BLOG_DB_ID
'projects' → NOTION_PROJECTS_DB_ID
Query Behavior:
The function queries Notion with these filters:
- Only pages where the public checkbox is true
- Sorted by published date (descending)
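The two query rules above could be expressed as a Notion filter and sort along the following lines. This is a minimal sketch: the property names public and published come from the rules above, while buildPostsQuery and the PostsQuery type are hypothetical, and the module's actual query code may differ.

```typescript
// Hypothetical helper: builds the filter and sort portion of a Notion query.
// Property names 'public' and 'published' are taken from the rules above.
interface PostsQuery {
  filter: { property: string; checkbox: { equals: boolean } };
  sorts: { property: string; direction: 'ascending' | 'descending' }[];
}

function buildPostsQuery(): PostsQuery {
  return {
    // Keep only pages whose "public" checkbox is ticked
    filter: { property: 'public', checkbox: { equals: true } },
    // Newest "published" date first
    sorts: [{ property: 'published', direction: 'descending' }],
  };
}

console.log(JSON.stringify(buildPostsQuery(), null, 2));
```

Building the query as a plain object keeps it easy to inspect or unit test before it is handed to the Notion client.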
Output Location:
src/content/blog/{page-id}.mdx
src/content/projects/{page-id}.mdx
Usage:
import { downloadPostsAsMdx } from '@lib/notion-download';
// Sync blog posts
await downloadPostsAsMdx('blog');
// Writing to file: src/content/blog/abc123.mdx
// Writing to file: src/content/blog/def456.mdx
// Sync projects
await downloadPostsAsMdx('projects');
Incremental Sync
shouldUpdateLocalFile()
Determines if a local MDX file needs to be updated based on lastEditedTime.
async function shouldUpdateLocalFile(
serverLastEditedTime: string,
srcContentPath: string,
postId: string
): Promise<boolean>
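Parameters:
serverLastEditedTime (string): The last_edited_time from the Notion API
srcContentPath (string): Content collection path (e.g., 'blog' or 'projects')
postId (string): ID of the Notion page, which is also the local file name
Returns:
true if the file should be updated, false if it is up to date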
Logic:
- Reads the local MDX file (if it exists)
- Extracts lastEditedTime from frontmatter
- Compares server time vs. local time
- Returns true if the server is newer OR the file doesn’t exist
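The comparison step in the logic above reduces to a small function. The sketch below assumes both timestamps are ISO 8601 strings, as in the examples in this document, and that the local value may still carry the quotes used in the frontmatter. isServerNewer is a hypothetical name, not the module's own function.

```typescript
// Hypothetical helper illustrating the comparison step only.
// localLastEditedTime is undefined when the file or the frontmatter key is missing.
function isServerNewer(
  serverLastEditedTime: string,
  localLastEditedTime: string | undefined
): boolean {
  if (localLastEditedTime === undefined) return true; // nothing local yet

  // Strip surrounding quotes such as '2025-01-01T10:00:00.000Z'
  const cleaned = localLastEditedTime.trim().replace(/^['"]|['"]$/g, '');
  const local = Date.parse(cleaned);
  const server = Date.parse(serverLastEditedTime);

  // Treat an unparseable local timestamp as stale (an assumption of this sketch)
  if (Number.isNaN(local)) return true;
  return server > local;
}

console.log(isServerNewer('2025-01-02T12:00:00.000Z', "'2025-01-01T10:00:00.000Z'"));
console.log(isServerNewer('2025-01-01T10:00:00.000Z', "'2025-01-01T10:00:00.000Z'"));
```

Parsing both values with Date.parse avoids relying on string comparison, which would break if the two timestamps ever used different formats or quoting.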
Example:
// Local file frontmatter:
// ---
// lastEditedTime: '2025-01-01T10:00:00.000Z'
// ---
const shouldUpdate = await shouldUpdateLocalFile(
'2025-01-02T12:00:00.000Z', // Server is newer
'blog',
'abc123'
);
// Returns: true (file will be updated)
Frontmatter Generation
pagePropertiesToFrontmatter()
Converts Notion page properties to YAML frontmatter.
function pagePropertiesToFrontmatter(
pageProperties: any,
lastEditedTime?: string
): string
Parameters:
pageProperties (any): Object of page properties (from getPageProperties())
lastEditedTime (string, optional): Timestamp to include in frontmatter
Returns:
YAML frontmatter string with --- delimiters
Example:
const properties = {
title: 'My Blog Post',
published: '2025-01-01',
description: 'A great post',
tags: 'javascript,typescript',
public: 'true'
};
const frontmatter = pagePropertiesToFrontmatter(
properties,
'2025-01-01T10:00:00.000Z'
);
// Output:
// ---
// lastEditedTime: '2025-01-01T10:00:00.000Z'
// title: 'My Blog Post'
// published: '2025-01-01'
// description: 'A great post'
// tags: 'javascript,typescript'
// public: 'true'
// ---
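To make the conversion concrete, here is a minimal sketch that produces output in the same shape as the example above. It is not the module's implementation: toFrontmatter is a hypothetical name, and the sketch assumes every property value is a single-line string.

```typescript
// Hypothetical re-implementation for illustration; the real function may differ.
function toFrontmatter(
  properties: Record<string, string>,
  lastEditedTime?: string
): string {
  // In a single-quoted YAML scalar, a literal single quote is written as ''
  const quote = (value: string): string => `'${value.replace(/'/g, "''")}'`;

  const lines: string[] = [];
  if (lastEditedTime !== undefined) {
    // lastEditedTime goes first, matching the example output above
    lines.push(`lastEditedTime: ${quote(lastEditedTime)}`);
  }
  for (const [key, value] of Object.entries(properties)) {
    lines.push(`${key}: ${quote(value)}`);
  }
  return ['---', ...lines, '---'].join('\n');
}

console.log(toFrontmatter({ title: "It's done" }, '2025-01-01T10:00:00.000Z'));
```

Escaping embedded single quotes matters for titles such as "It's done", which would otherwise end the quoted value early.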
Generated MDX Structure
Each MDX file follows this structure:
---
lastEditedTime: '2025-01-01T10:00:00.000Z'
title: 'My Blog Post'
published: '2025-01-01'
description: 'A great post about web development'
path: '/blog/my-post'
tags: 'javascript,typescript,webdev'
public: 'true'
---
import { Image } from 'astro:assets';
# My Blog Post
This is the first paragraph.
<Image src={import("@assets/file.abc123.png")} width="1200" height="800" format="webp" alt="Screenshot" />
## Subheading
- List item 1
- List item 2
Build Process
Integration with Astro
The download script runs before every build via scripts/index.ts:
// scripts/index.ts
import { downloadPostsAsMdx } from '../src/lib/notion-download';
await downloadPostsAsMdx('blog');
await downloadPostsAsMdx('projects');
Package.json scripts:
{
"scripts": {
"prebuild": "jiti scripts/index.ts",
"build": "astro build",
"predev": "jiti scripts/index.ts",
"dev": "astro dev"
}
}
This ensures content is always fresh before starting the dev server or building for production.
Parallel Processing
All posts are processed concurrently using Promise.all():
return Promise.all(
posts.map(async (post) => {
// Download and write each post
})
);
Skip Unchanged Files
Files are only written if:
- The file doesn’t exist locally, OR
- The server’s lastEditedTime is newer than the local timestamp
Unchanged posts are skipped, which shortens builds when little or no content has changed.
Early Stream Termination
When reading local files, the stream is closed as soon as lastEditedTime is found:
const lineListener = (line) => {
if (line.includes('lastEditedTime')) {
lastEditedTime = line.substring(line.indexOf(': ') + 2);
rl.close(); // Stop reading
rl.removeListener('line', lineListener);
readStream.destroy();
}
};
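For context, the listener above can be wrapped in a self-contained function such as the following. This is a sketch under assumptions rather than the module's code: readLastEditedTime is a hypothetical name, it matches only lines that start with the key, and it resolves with undefined when the file cannot be read.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import * as readline from 'node:readline';

// Hypothetical helper: resolves with the raw frontmatter value, or undefined
// if the file cannot be read or contains no lastEditedTime line.
function readLastEditedTime(filePath: string): Promise<string | undefined> {
  return new Promise((resolve) => {
    const readStream = fs.createReadStream(filePath);
    const rl = readline.createInterface({ input: readStream, crlfDelay: Infinity });
    let found: string | undefined;

    // A missing file surfaces as an 'error' event, not a synchronous throw
    readStream.on('error', () => resolve(undefined));
    rl.on('error', () => resolve(undefined));

    rl.on('line', (line: string) => {
      if (found === undefined && line.startsWith('lastEditedTime: ')) {
        found = line.slice('lastEditedTime: '.length);
        rl.close(); // stop emitting lines
        readStream.destroy(); // stop reading from disk
      }
    });

    // 'close' fires after rl.close() or when the end of the file is reached
    rl.on('close', () => resolve(found));
  });
}

// Demo: write a small file, then read back only its timestamp line
const demoFile = path.join(os.tmpdir(), `demo-${process.pid}.mdx`);
fs.writeFileSync(demoFile, "---\nlastEditedTime: '2025-01-01T10:00:00.000Z'\n---\nBody\n");
readLastEditedTime(demoFile).then((value) => console.log(value));
```

The resolved value keeps the quotes from the frontmatter, so a caller would need to strip them before comparing timestamps.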
Error Handling
File Not Found
If a local file doesn’t exist, shouldUpdateLocalFile() catches the error and returns true:
try {
const readStream = fs.createReadStream(dest);
// ...
} catch (err) {
// File probably doesn't exist, so we should fetch it
return true;
}
Invalid Collection
Invalid collection names throw an error:
if (collection === 'projects') {
databaseId = import.meta.env.NOTION_PROJECTS_DB_ID;
} else if (collection === 'blog') {
databaseId = import.meta.env.NOTION_BLOG_DB_ID;
} else {
throw Error('invalid collection');
}
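The same validation can also be written as a lookup table, which keeps the collection-to-variable mapping in one place. This is an alternative sketch, not the module's code: resolveDatabaseId and its env parameter are hypothetical, and the environment is passed in explicitly so the function can be exercised without real credentials.

```typescript
type Collection = 'projects' | 'blog';

// Mapping taken from the Database Mapping list in this document
const DB_ENV_KEYS: Record<Collection, string> = {
  blog: 'NOTION_BLOG_DB_ID',
  projects: 'NOTION_PROJECTS_DB_ID',
};

// Hypothetical helper: env is any key/value source, e.g. process.env
function resolveDatabaseId(
  collection: string,
  env: Record<string, string | undefined>
): string {
  // Reject anything that is not a known collection name
  if (!Object.prototype.hasOwnProperty.call(DB_ENV_KEYS, collection)) {
    throw new Error('invalid collection');
  }
  const key = DB_ENV_KEYS[collection as Collection];
  const id = env[key];
  // Fail early if the variable is unset (an extra check added in this sketch)
  if (!id) throw new Error(`missing environment variable ${key}`);
  return id;
}

console.log(resolveDatabaseId('blog', { NOTION_BLOG_DB_ID: 'abc123' }));
```

Adding a collection then means adding one entry to the table instead of another else-if branch.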
Console Output
The module logs its progress during sync:
Fetching data_source_id for database: abc123...
Resolved data_source_id: xyz789... for database: abc123...
Fetching pages from Notion data source: xyz789...
Finished fetching pages from Notion data source: xyz789...
Downloading new asset: file.def456.png
Writing to file: src/content/blog/abc123.mdx
Writing to file: src/content/blog/ghi789.mdx
Environment Variables
Notion API integration token
NOTION_BLOG_DB_ID - Database ID for blog posts
NOTION_PROJECTS_DB_ID - Database ID for projects
File System Paths
Constants:
import { ASSET_SRC_PATH, ASSET_PUBLIC_PATH } from './constants';
// ASSET_SRC_PATH = 'src/assets/'
// ASSET_PUBLIC_PATH = 'public/assets/'
Output Paths:
const dest = path
.join('src', 'content', collection, post.id)
.concat('.mdx');
// Example: src/content/blog/abc123def456.mdx
Complete Example
Custom Sync Script
// scripts/sync-notion.ts
import { downloadPostsAsMdx } from '../src/lib/notion-download';
async function syncAll() {
console.log('Syncing blog posts...');
await downloadPostsAsMdx('blog');
console.log('Syncing projects...');
await downloadPostsAsMdx('projects');
console.log('Sync complete!');
}
syncAll().catch(console.error);
Run with:
jiti scripts/sync-notion.ts
Accessing Synced Content
// src/pages/blog/index.astro
import { getCollection } from 'astro:content';
const posts = await getCollection('blog');
const sortedPosts = posts.sort((a, b) =>
new Date(b.data.published).getTime() -
new Date(a.data.published).getTime()
);
Future Improvements
The code includes a TODO comment for potential optimization:
// TODO: optimize this for better build times -
// could store in json file every time we update
// instead of reading from file
Currently, lastEditedTime is read from each MDX file’s frontmatter. A future optimization could cache this in a JSON index file.
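One possible shape for such an index is sketched below. Everything here is an assumption for illustration: the SyncIndex type, the function names, and the choice to key the index by page ID do not come from the module.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical index shape: Notion page ID -> lastEditedTime (ISO 8601 string)
type SyncIndex = Record<string, string>;

function loadIndex(indexPath: string): SyncIndex {
  try {
    return JSON.parse(fs.readFileSync(indexPath, 'utf8')) as SyncIndex;
  } catch {
    return {}; // missing or unreadable index: treat every post as new
  }
}

function needsUpdate(index: SyncIndex, postId: string, serverTime: string): boolean {
  const localTime = index[postId];
  return localTime === undefined || Date.parse(serverTime) > Date.parse(localTime);
}

function saveIndex(indexPath: string, index: SyncIndex): void {
  fs.writeFileSync(indexPath, JSON.stringify(index, null, 2));
}

// Demo: one read at the start of a sync, one write at the end
const indexPath = path.join(os.tmpdir(), `notion-index-demo-${process.pid}.json`);
const index = loadIndex(indexPath);
if (needsUpdate(index, 'abc123', '2025-01-02T12:00:00.000Z')) {
  index['abc123'] = '2025-01-02T12:00:00.000Z'; // record after writing the MDX file
}
saveIndex(indexPath, index);
```

With this layout a sync would read one JSON file instead of opening every MDX file, at the cost of keeping the index consistent when files are deleted or edited by hand.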
Source Reference
File: src/lib/notion-download.ts:1-131
Key Dependencies:
fs/promises - Async file system operations
readline - Stream-based file reading
path - File path utilities
./notion-cms - Database queries and block retrieval
./notion-cms-page - Page property extraction
./notion-parse - Block to Markdown conversion