
Overview

The List Splitter divides a list into batches by one of three configurable criteria: items per group, a character limit per group, or a target group count. It supports multiple input delimiters, optional deduplication, and output templates for plain text, SQL IN clauses, quoted CSV, and JSON arrays.

Use Cases

  • Batch Processing: Split large lists for API rate limiting or batch operations
  • SQL Queries: Generate IN clause batches for database queries with parameter limits
  • Data Chunking: Divide datasets into manageable chunks for processing
  • Import Preparation: Split data for bulk import operations
  • Testing: Create test data batches of specific sizes
  • CSV Export: Convert lists to quoted CSV format

Input Format

Primary Input

List items (one per line, comma-separated, or tab-separated):
apple
banana
cherry
date
elderberry
fig
grape
Or comma-separated:
apple, banana, cherry, date, elderberry, fig, grape

Configuration (Second Input)

delimiter=newline
mode=items_per_group
value=3
dedupe=none
template=plain
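The key=value block above could be read into a config object along these lines. This is a minimal sketch, not the tool's actual parser: the `parseConfig` helper, the `DEFAULTS` and `ALLOWED` tables, and the rule that invalid or unknown entries fall back to defaults are all assumptions made for illustration. Only the option names and default values come from this page.

```typescript
// Hypothetical config shape, inferred from the options documented on this page.
type SplitterConfig = {
  delimiter: string;
  mode: string;
  value: number;
  dedupe: string;
  template: string;
};

// Defaults as documented under "Configuration Options".
const DEFAULTS: SplitterConfig = {
  delimiter: "newline",
  mode: "items_per_group",
  value: 5,
  dedupe: "none",
  template: "plain",
};

// Accepted values per option, as listed in this document.
const ALLOWED: { delimiter: string[]; mode: string[]; dedupe: string[]; template: string[] } = {
  delimiter: ["newline", "comma", "tab", "auto"],
  mode: ["items_per_group", "max_chars_per_group", "target_group_count"],
  dedupe: ["none", "case_sensitive", "case_insensitive"],
  template: ["plain", "sql_in", "quoted_csv", "json_array"],
};

// Hypothetical helper: reads key=value lines and ignores anything it cannot validate.
function parseConfig(text: string): SplitterConfig {
  const config: SplitterConfig = { ...DEFAULTS };
  for (const line of text.split(/\r?\n/)) {
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a key=value line
    const key = line.slice(0, eq).trim();
    const raw = line.slice(eq + 1).trim();
    if (key === "value") {
      const n = Number(raw);
      if (Number.isInteger(n) && n >= 1) config.value = n;
    } else if (key === "delimiter" || key === "mode" || key === "dedupe" || key === "template") {
      if (ALLOWED[key].includes(raw)) config[key] = raw;
    }
  }
  return config;
}
```

Falling back to defaults keeps a partially filled config usable; a stricter parser could reject unknown keys instead.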

Configuration Options

delimiter

  • newline (default): Split on line breaks
  • comma: Split on commas (line breaks are also treated as separators)
  • tab: Split on tabs
  • auto: Auto-detect based on content
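The delimiter options above might be applied roughly as follows. This is a sketch under stated assumptions: `parseItemsSketch` and `detectDelimiter` are hypothetical names, the auto-detection order (tabs, then commas, then line breaks) is a guess, and trimming whitespace and dropping empty items are assumed behaviors. The real `parseItems` in lib/tools/list-splitter.ts may differ.

```typescript
type Delimiter = "newline" | "comma" | "tab" | "auto";

// Assumed heuristic for "auto": prefer tabs, then commas, otherwise line breaks.
function detectDelimiter(input: string): Exclude<Delimiter, "auto"> {
  if (input.includes("\t")) return "tab";
  if (input.includes(",")) return "comma";
  return "newline";
}

// Hypothetical stand-in for parseItems: split, trim, and drop empty entries.
function parseItemsSketch(input: string, delimiter: Delimiter): string[] {
  const resolved = delimiter === "auto" ? detectDelimiter(input) : delimiter;
  const pattern =
    resolved === "comma"
      ? /[,\r\n]+/ // commas, with line breaks also acting as separators
      : resolved === "tab"
        ? /\t/
        : /\r?\n/;
  return input
    .split(pattern)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

console.log(parseItemsSketch("apple, banana, cherry", "comma"));
```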

mode

  • items_per_group (default): Fixed number of items per batch
  • max_chars_per_group: Character limit per batch
  • target_group_count: Split into N groups of near-equal size (sizes differ by at most one item)

value

  • Number interpreted according to mode: items per batch, maximum characters per batch, or number of groups
  • Default: 5
  • Range: 1 or greater, with no upper limit

dedupe

  • none (default): No deduplication
  • case_sensitive: Remove exact duplicates
  • case_insensitive: Remove case-insensitive duplicates
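The three dedupe modes can be illustrated with a small sketch. The function name `dedupeSketch` is hypothetical, and keeping the first occurrence of each duplicate (with its original casing) is an assumption. The return shape mirrors the `{ items, removed }` object that `runSplitter` destructures from `deduplicateItems`.

```typescript
type DedupeMode = "none" | "case_sensitive" | "case_insensitive";

// Hypothetical stand-in for deduplicateItems: keeps the first occurrence of each item.
function dedupeSketch(items: string[], mode: DedupeMode): { items: string[]; removed: number } {
  if (mode === "none") return { items: [...items], removed: 0 };
  const seen = new Set<string>();
  const kept: string[] = [];
  for (const item of items) {
    // Compare on a lowercased key only in case-insensitive mode.
    const key = mode === "case_insensitive" ? item.toLowerCase() : item;
    if (seen.has(key)) continue;
    seen.add(key);
    kept.push(item);
  }
  return { items: kept, removed: items.length - kept.length };
}

const sample = ["apple", "banana", "APPLE", "cherry", "Banana", "date", "apple"];
console.log(dedupeSketch(sample, "case_sensitive")); // removes the repeated "apple"
console.log(dedupeSketch(sample, "case_insensitive")); // also removes "APPLE" and "Banana"
```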

template

  • plain (default): One item per line
  • sql_in: SQL IN clause format ('item1', 'item2', ...)
  • quoted_csv: Quoted CSV format "item1","item2",...
  • json_array: JSON array format ["item1", "item2", ...]

Output Format

Batches with headers and counts:
--- Batch 1 (3 items) ---
apple
banana
cherry

--- Batch 2 (3 items) ---
date
elderberry
fig

--- Batch 3 (1 items) ---
grape
Metadata:
7 items → 3 batches | Mode: items per group, Value: 3

Examples

apple
banana
cherry
date
elderberry
fig
grape
kiwi

user123
user456
user789
user012
user345

This is a short sentence.
Another brief statement.
A slightly longer sentence here.
Short.
Medium length sentence.

item1
item2
item3
item4
item5
item6
item7
item8
item9
item10

apple
banana
APPLE
cherry
Banana
date
apple

John Doe
Jane Smith
Bob O'Connor
Alice "Ace" Johnson

Template Formats

plain

item1
item2
item3

sql_in

('item1', 'item2', 'item3')
Perfect for SQL queries:
SELECT * FROM users WHERE username IN ('user1', 'user2', 'user3');

quoted_csv

"item1","item2","item3"
Handles commas, quotes, and special characters properly.

json_array

[
  "item1",
  "item2",
  "item3"
]
Formatted JSON for readability.
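The four templates above could be produced by a formatter like the one sketched here. `formatBatchSketch` is a hypothetical stand-in for the tool's `formatBatch`, and the escaping rules are assumptions: single quotes are doubled for SQL string literals and double quotes are doubled for CSV, which are the conventional escapes but are not confirmed by this page.

```typescript
type Template = "plain" | "sql_in" | "quoted_csv" | "json_array";

// Hypothetical stand-in for formatBatch; escaping rules are assumed, not confirmed.
function formatBatchSketch(items: string[], template: Template): string {
  switch (template) {
    case "sql_in":
      // SQL string literals escape a single quote by doubling it.
      return `(${items.map((s) => `'${s.replace(/'/g, "''")}'`).join(", ")})`;
    case "quoted_csv":
      // CSV fields escape a double quote by doubling it.
      return items.map((s) => `"${s.replace(/"/g, '""')}"`).join(",");
    case "json_array":
      // Two-space indentation matches the json_array sample above.
      return JSON.stringify(items, null, 2);
    default:
      return items.join("\n");
  }
}

const names = ["Bob O'Connor", 'Alice "Ace" Johnson'];
console.log(formatBatchSketch(names, "sql_in"));
console.log(formatBatchSketch(names, "quoted_csv"));
```

With the two names above, the sql_in output doubles the apostrophe in O'Connor and the quoted_csv output doubles the quotes around Ace.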

Implementation Details

From lib/tools/list-splitter.ts:123-139:
export function runSplitter(input: string, config: SplitterConfig): { output: string; meta: string } {
  const items = parseItems(input, config.delimiter);
  const { items: deduped, removed } = deduplicateItems(items, config.dedupe);
  const groups = splitItems(deduped, config.mode, config.value);

  const formatted = groups.map(g => {
    const header = `--- ${g.label} ---`;
    const body = formatBatch(g.items, config.template);
    return `${header}\n${body}`;
  }).join('\n\n');

  const parts = [`${deduped.length} items → ${groups.length} batch${groups.length !== 1 ? 'es' : ''}`];
  if (removed > 0) parts.push(`${removed} duplicates removed`);
  parts.push(`Mode: ${config.mode.replace(/_/g, ' ')}, Value: ${config.value}`);

  return { output: formatted, meta: parts.join(' | ') };
}
Key algorithms:
Items Per Group (lib/tools/list-splitter.ts:48-55):
if (mode === 'items_per_group') {
  const chunks: string[][] = [];
  for (let i = 0; i < items.length; i += value) chunks.push(items.slice(i, i + value));
  return buildGroups(chunks);
}
Character Limit (lib/tools/list-splitter.ts:71-87):
const chunks: string[][] = [];
let current: string[] = [];
let len = 0;
for (const item of items) {
  const sep = current.length > 0 ? 1 : 0;
  if (current.length > 0 && len + sep + item.length > value) {
    chunks.push(current);
    current = [item];
    len = item.length;
  } else {
    current.push(item);
    len = current.length === 1 ? item.length : len + sep + item.length;
  }
}
if (current.length > 0) chunks.push(current);
return buildGroups(chunks);
Target Group Count (lib/tools/list-splitter.ts:57-69):
if (mode === 'target_group_count') {
  const count = Math.min(value, items.length);
  const base = Math.floor(items.length / count);
  const remainder = items.length % count;
  const chunks: string[][] = [];
  let cursor = 0;
  for (let i = 0; i < count; i++) {
    const size = base + (i < remainder ? 1 : 0);
    chunks.push(items.slice(cursor, cursor + size));
    cursor += size;
  }
  return buildGroups(chunks);
}
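To see how the target group count logic distributes items, the size calculation can be isolated. The helper name `groupSizes` is hypothetical; it repeats the same base-plus-remainder arithmetic as the excerpt above, with an explicit guard for an empty list added for the sketch.

```typescript
// Hypothetical helper: returns only the batch sizes the algorithm above would produce.
function groupSizes(total: number, groups: number): number[] {
  const count = Math.min(groups, total); // never more groups than items
  if (count <= 0) return [];
  const base = Math.floor(total / count);
  const remainder = total % count;
  // The first `remainder` groups each take one extra item.
  return Array.from({ length: count }, (_, i) => base + (i < remainder ? 1 : 0));
}

console.log(groupSizes(10, 3)); // [4, 3, 3]
console.log(groupSizes(7, 3)); // [3, 2, 2]
console.log(groupSizes(2, 5)); // [1, 1]
```

Group sizes differ by at most one item, and requesting more groups than there are items yields one item per group.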
The List Splitter was extracted from SplitBox and adapted for Kayston’s Forge. It handles edge cases like empty items, uneven splits, and special characters in values.
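For SQL IN clauses with large lists, use items_per_group mode with a value of 500 to 1,000 to stay within database limits. For example, Oracle has traditionally rejected IN lists with more than 1,000 expressions (ORA-01795), while SQL Server sets no fixed cap on literal IN lists but allows at most 2,100 parameters per request and can fail on very large lists.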
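As an illustration of the batching advice above, the following sketch emits one query per batch. `buildInQueries` is a hypothetical helper, and the `users` table and `username` column are borrowed from the sql_in example on this page. It inlines escaped literals for readability; parameterized queries are generally the safer choice for untrusted input.

```typescript
// Hypothetical helper: one SELECT per batch, each with at most `batchSize` literals.
function buildInQueries(values: string[], batchSize: number): string[] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be an integer of 1 or greater");
  }
  const queries: string[] = [];
  for (let i = 0; i < values.length; i += batchSize) {
    const literals = values
      .slice(i, i + batchSize)
      .map((v) => `'${v.replace(/'/g, "''")}'`) // double single quotes inside literals
      .join(", ");
    queries.push(`SELECT * FROM users WHERE username IN (${literals});`);
  }
  return queries;
}

for (const q of buildInQueries(["user1", "user2", "user3"], 2)) {
  console.log(q);
}
```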
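When using max_chars_per_group mode, the character count includes one character for each separator between items. Items are never split, so an item that is longer than the limit on its own is placed in a batch by itself, and that batch exceeds the limit.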
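The behavior of max_chars_per_group can be checked with a standalone version of the character-limit loop. `chunkByChars` is a hypothetical name; the body follows the same logic as the Character Limit excerpt and returns raw chunks rather than labeled groups.

```typescript
// Hypothetical standalone version of the character-limit loop shown earlier.
function chunkByChars(items: string[], limit: number): string[][] {
  const chunks: string[][] = [];
  let current: string[] = [];
  let len = 0;
  for (const item of items) {
    const sep = current.length > 0 ? 1 : 0; // one character per separator
    if (current.length > 0 && len + sep + item.length > limit) {
      chunks.push(current); // close the batch before it would overflow
      current = [item];
      len = item.length;
    } else {
      current.push(item);
      len += sep + item.length;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// With a limit of 6, "aa" + separator + "bbb" fits exactly (2 + 1 + 3 = 6).
// The 10-character item cannot fit anywhere, so it ends up alone in its own batch.
console.log(chunkByChars(["aa", "bbb", "c", "dddddddddd", "ee"], 6));
// [["aa", "bbb"], ["c"], ["dddddddddd"], ["ee"]]
```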
