
Overview

Fenic provides two types of tools for MCP integration:
  1. UserDefinedTool - Created from parameterized DataFrame views using df.create_tool()
  2. SystemTool - Created by wrapping Python functions that return LogicalPlans
Both tool types can be registered with FenicMCPServer and exposed to LLM agents.

Tool Types

UserDefinedTool

A tool created from a parameterized DataFrame view. Parameters are defined using F.tool_param() in the DataFrame query.
from fenic.core.mcp.types import UserDefinedTool, BoundToolParam

# Created automatically via df.create_tool()
tool = df.create_tool(
    name="Get Sales by Region",
    description="Retrieve sales data filtered by region",
    params=[...],
    result_limit=100
)
Attributes:
name (str, required): Tool name displayed to MCP clients. Automatically converted to snake_case for the function name.
description (str, required): Description of what the tool does. Shown to LLM agents.
params (list[BoundToolParam], required): List of bound parameters. Each parameter corresponds to an F.tool_param() usage in the DataFrame.
max_result_limit (int, required): Maximum number of rows to return. Automatically enforced even if the agent requests more.
_parameterized_view (LogicalPlan, required): Internal logical plan with unresolved parameters. Bound at execution time.
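The snake_case conversion described for name can be pictured with a short sketch. The helper to_snake_case below is hypothetical and not part of Fenic; the exact normalization rules Fenic applies are not stated here, so this assumes runs of non-alphanumeric characters become a single underscore and the result is lowercased.

```python
import re


def to_snake_case(display_name: str) -> str:
    """Hypothetical helper: derive a snake_case identifier from a display name."""
    # Assumption: every run of non-alphanumeric characters becomes one underscore.
    collapsed = re.sub(r"[^0-9A-Za-z]+", "_", display_name.strip())
    # Trim stray underscores at the ends and lowercase the result.
    return collapsed.strip("_").lower()


print(to_snake_case("Get Sales by Region"))  # get_sales_by_region
```

Under these assumptions, the display name stays readable for MCP clients while the derived identifier is safe to use as a function name.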

SystemTool

A tool implemented as a regular Python function with explicit parameters. The function must return a LogicalPlan.
from fenic.core.mcp.types import SystemTool

tool = SystemTool(
    name="Get Top Products",
    description="Retrieve the top selling products",
    func=get_top_products,
    max_result_limit=100,
    add_limit_parameter=True,
    default_table_format="markdown",
    read_only=True,
    idempotent=True,
    destructive=False,
    open_world=False
)
Attributes:
name (str, required): Tool name displayed to MCP clients. Automatically converted to snake_case for the function name.
description (str, required): Description of what the tool does. Shown to LLM agents.
func (Callable[..., LogicalPlan], required): Python function that returns a LogicalPlan. Function signature and annotations are preserved for schema generation.
max_result_limit (int): Maximum number of rows to return. If None, no limit is enforced.
add_limit_parameter (bool, default: True): Whether to expose a limit parameter to agents. Only applies if max_result_limit is set.
default_table_format (TableFormat, default: "markdown"): Default format for results. Options: "structured" (JSON) or "markdown" (table).
read_only (bool, default: True): Hint to MCP clients that the tool only reads data and does not modify it.
idempotent (bool, default: True): Hint that repeated calls with the same parameters produce the same result.
destructive (bool, default: False): Hint that the tool deletes or modifies data.
open_world (bool, default: False): Hint that the tool reaches out to external endpoints or knowledge bases.
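The four hint fields correspond naturally to the tool annotation keys defined by the MCP specification (readOnlyHint, idempotentHint, destructiveHint, openWorldHint). The sketch below shows one plausible mapping; ToolHints and to_mcp_annotations are hypothetical names, and how Fenic actually forwards these hints to clients is an assumption, not something this page confirms.

```python
from dataclasses import dataclass


@dataclass
class ToolHints:
    """Hypothetical stand-in for the four hint fields on SystemTool."""
    read_only: bool = True
    idempotent: bool = True
    destructive: bool = False
    open_world: bool = False


def to_mcp_annotations(hints: ToolHints) -> dict:
    """Map the hint fields to MCP tool annotation keys (assumed mapping)."""
    return {
        "readOnlyHint": hints.read_only,
        "idempotentHint": hints.idempotent,
        "destructiveHint": hints.destructive,
        "openWorldHint": hints.open_world,
    }


print(to_mcp_annotations(ToolHints()))
```

These values are advisory: a client may use them to decide whether to ask for confirmation, but they do not restrict what the tool function does.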

Tool Parameters

ToolParam

Defines a parameter for a user-defined tool. Matched to F.tool_param() expressions in the DataFrame.
from fenic.core.mcp.types import ToolParam

param = ToolParam(
    name="region",
    description="Geographic region to filter by",
    allowed_values=["North", "South", "East", "West"],
    has_default=False,
    default_value=None
)
Attributes:
name (str, required): Parameter name. Must match the name used in F.tool_param().
description (str, required): Description of the parameter shown to agents.
allowed_values (list[ToolParameterType]): Constrains the parameter to specific values. Creates a Literal type in the schema. Supported types: str, int, float, bool, list, dict.
has_default (bool, default: False): Whether the parameter has a default value. Automatically set to True if default_value is provided.
default_value (ToolParameterType): Default value for the parameter. Makes the parameter optional.
Properties:
required (bool): Whether the parameter is required. Returns not has_default.
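The relationship between default_value, has_default, and required can be sketched with a small stand-in class. ToolParamSketch is hypothetical and only mirrors the behavior described above; it assumes "provided" means default_value is not None, which the real ToolParam may implement differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolParamSketch:
    """Hypothetical stand-in mirroring the documented ToolParam fields."""
    name: str
    description: str
    allowed_values: Optional[list] = None
    has_default: bool = False
    default_value: Any = None

    def __post_init__(self) -> None:
        # Assumption: supplying a non-None default flips has_default to True.
        if self.default_value is not None:
            self.has_default = True

    @property
    def required(self) -> bool:
        # Documented rule: required is simply the negation of has_default.
        return not self.has_default


region = ToolParamSketch(name="region", description="Sales region")
start = ToolParamSketch(
    name="start_date",
    description="Start date (YYYY-MM-DD)",
    default_value="2024-01-01",
)
print(region.required, start.required)  # True False
```

In this sketch, passing has_default=True with default_value=None yields an optional parameter whose default is None, which is one reason to keep the two fields separate.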

BoundToolParam

A parameter that has been bound to a specific typed usage within a DataFrame. Created automatically when calling df.create_tool().
from fenic.core.mcp.types import BoundToolParam
from fenic.core.types.datatypes import DataType

bound_param = BoundToolParam(
    name="region",
    description="Geographic region",
    data_type=DataType.STRING,
    required=True,
    has_default=False,
    default_value=None,
    allowed_values=["North", "South", "East", "West"]
)
Attributes:
name (str, required): Parameter name.
description (str, required): Parameter description.
data_type (DataType, required): Fenic data type inferred from the DataFrame context.
required (bool, required): Whether the parameter is required.
has_default (bool, required): Whether the parameter has a default value.
default_value (ToolParameterType): Default value for the parameter.
allowed_values (list[ToolParameterType]): Allowed values constraint.

Table Formats

All tools support two output formats:

Structured Format

Returns a JSON object containing the table schema, the rows as a list of JSON objects, and the result counts:
{
  "table_schema": [
    {"name": "product", "type": "String"},
    {"name": "sales", "type": "Int64"}
  ],
  "rows": [
    {"product": "Widget", "sales": 100},
    {"product": "Gadget", "sales": 150}
  ],
  "returned_result_count": 2,
  "total_result_count": 2
}

Markdown Format

Returns data as a markdown table:
| product | sales |
| --- | --- |
| Widget | 100 |
| Gadget | 150 |
Agents can specify the format using the table_format parameter (automatically added to all tools).
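The two formats carry the same rows, so one can be derived from the other. The following is a minimal sketch, not Fenic's renderer: rows_to_markdown is a hypothetical helper that turns a structured payload shaped like the example above into the markdown table.

```python
def rows_to_markdown(payload: dict) -> str:
    """Hypothetical helper: render a structured payload as a markdown table."""
    # Column order follows table_schema, not the key order of each row.
    columns = [col["name"] for col in payload["table_schema"]]
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(str(row[name]) for name in columns) + " |"
        for row in payload["rows"]
    ]
    return "\n".join([header, divider, *body])


payload = {
    "table_schema": [
        {"name": "product", "type": "String"},
        {"name": "sales", "type": "Int64"},
    ],
    "rows": [
        {"product": "Widget", "sales": 100},
        {"product": "Gadget", "sales": 150},
    ],
    "returned_result_count": 2,
    "total_result_count": 2,
}
print(rows_to_markdown(payload))
```

The markdown form drops the column types and result counts, so the structured format is the better choice when an agent needs to process results programmatically.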

Creating Tools

User-Defined Tools

Create from parameterized DataFrame views:
from fenic import Session, F
from fenic.core.mcp.types import ToolParam

session = Session.get_or_create(config=...)
df = session.read.csv("sales.csv")

# Create parameterized view
filtered_df = df.filter(
    (F.col("region") == F.tool_param(
        "region",
        description="Sales region",
        allowed_values=["North", "South", "East", "West"]
    )) &
    (F.col("date") >= F.tool_param(
        "start_date",
        description="Start date (YYYY-MM-DD)"
    ))
)

# Create tool
sales_tool = filtered_df.create_tool(
    name="Get Sales by Region",
    description="Retrieve sales data filtered by region and date",
    params=[
        ToolParam(
            name="region",
            description="Sales region",
            allowed_values=["North", "South", "East", "West"]
        ),
        ToolParam(
            name="start_date",
            description="Start date (YYYY-MM-DD)",
            has_default=True,
            default_value="2024-01-01"
        ),
    ],
    result_limit=100
)

System Tools

Create from Python functions:
from fenic import Session, F
from fenic.core.mcp.types import SystemTool
from fenic.core._logical_plan import LogicalPlan
from typing import Optional

session = Session.get_or_create(config=...)

def get_customer_summary(min_orders: int = 1) -> LogicalPlan:
    """Get customer order summary.
    
    Args:
        min_orders: Minimum number of orders to include customer
        
    Returns:
        LogicalPlan with customer summary
    """
    df = session.read.csv("orders.csv")
    return (
        df.group_by("customer_id")
        .agg(
            F.count("*").alias("total_orders"),
            F.sum("amount").alias("total_spent")
        )
        .filter(F.col("total_orders") >= min_orders)
        .order_by(F.col("total_spent").desc())
        .plan
    )

# Wrap as SystemTool
customer_tool = SystemTool(
    name="Get Customer Summary",
    description="Retrieve customer order summaries with minimum order filter",
    func=get_customer_summary,
    max_result_limit=50,
    read_only=True,
    idempotent=True
)

System Tools with Complex Parameters

Use type annotations for rich parameter schemas:
from typing import List, Optional
from fenic import F
from fenic.core.mcp.types import SystemTool
from fenic.core._logical_plan import LogicalPlan

# Reuses the `session` created in the previous example

def search_products(
    categories: List[str],
    min_price: Optional[float] = None,
    max_price: Optional[float] = None,
    in_stock: bool = True
) -> LogicalPlan:
    """Search products with filters.
    
    Args:
        categories: Product categories to search
        min_price: Minimum price filter
        max_price: Maximum price filter
        in_stock: Only show in-stock products
        
    Returns:
        LogicalPlan with filtered products
    """
    df = session.read.csv("products.csv")
    
    # Apply filters
    result = df.filter(F.col("category").isin(categories))
    
    if min_price is not None:
        result = result.filter(F.col("price") >= min_price)
    
    if max_price is not None:
        result = result.filter(F.col("price") <= max_price)
    
    if in_stock:
        result = result.filter(F.col("stock") > 0)
    
    return result.plan

# Create tool (parameters inferred from function signature)
product_tool = SystemTool(
    name="Search Products",
    description="Search products with category and price filters",
    func=search_products,
    max_result_limit=100
)

Parameter Validation

When creating user-defined tools, parameters are validated:
  1. All F.tool_param() usages must have corresponding ToolParam definitions
  2. Default values must be in allowed_values if both are specified
  3. All values in allowed_values must be the same type as default_value
  4. Multiple usages of the same parameter name must have identical configurations
Example Error:
# This will raise PlanError
tool = df.create_tool(
    name="Get Sales",
    description="...",
    params=[
        ToolParam(
            name="region",
            description="Sales region",  # description is a required field
            allowed_values=["North", "South"],
            default_value="West"  # ERROR: not in allowed_values
        )
    ],
    result_limit=100
)
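Rules 2 and 3 can be illustrated without Fenic. The sketch below is an assumption about how such checks might look: validate_param and ParamValidationError are hypothetical, and the real library raises PlanError rather than this stand-in exception.

```python
class ParamValidationError(ValueError):
    """Stand-in for the PlanError that Fenic raises."""


def validate_param(name, allowed_values=None, default_value=None):
    """Hypothetical check mirroring rules 2 and 3 from the list above."""
    # Both rules only apply when allowed_values and a default are given together.
    if allowed_values is None or default_value is None:
        return
    # Rule 3: every allowed value must share the default value's type.
    mismatched = [v for v in allowed_values if type(v) is not type(default_value)]
    if mismatched:
        raise ParamValidationError(
            f"{name}: allowed_values {mismatched!r} differ in type from the default"
        )
    # Rule 2: the default itself must be one of the allowed values.
    if default_value not in allowed_values:
        raise ParamValidationError(
            f"{name}: default {default_value!r} is not in allowed_values"
        )


validate_param("region", ["North", "South"], "North")  # passes silently
try:
    validate_param("region", ["North", "South"], "West")
except ParamValidationError as exc:
    print(exc)
```

Failing at tool creation time, rather than when an agent first calls the tool, keeps an invalid schema from ever being published.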

Result Limiting

User-defined tools always enforce a result limit; system tools enforce one when max_result_limit is set:
# Tool with 100 row limit
tool = df.create_tool(
    name="Get Data",
    description="...",
    params=[],
    result_limit=100  # Max 100 rows
)

# Agent can request fewer rows
# tool(limit=50) -> returns 50 rows

# Agent cannot exceed tool limit
# tool(limit=200) -> returns 100 rows (capped by tool)
For system tools:
tool = SystemTool(
    name="Get Data",
    description="...",
    func=get_data,
    max_result_limit=100,       # Tool enforces max 100 rows
    add_limit_parameter=True    # Expose 'limit' param to agent
)
Set add_limit_parameter=False to hide the limit parameter from agents.
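The capping behavior amounts to taking the smaller of the requested limit and the tool's maximum. This is a sketch of that rule only; effective_limit is a hypothetical helper, and the real enforcement happens inside Fenic when the plan is executed.

```python
from typing import Optional


def effective_limit(
    requested: Optional[int], max_result_limit: Optional[int]
) -> Optional[int]:
    """Hypothetical helper: the row cap applied for one tool call."""
    if max_result_limit is None:
        # No tool-level cap configured, so there is nothing to clamp against.
        return requested
    if requested is None:
        # The agent did not ask for a limit; fall back to the tool's maximum.
        return max_result_limit
    # The agent may ask for fewer rows but can never exceed the tool's maximum.
    return min(requested, max_result_limit)


print(effective_limit(50, 100))   # 50
print(effective_limit(200, 100))  # 100
```

A return value of None in this sketch stands for "no limit", matching the documented behavior of a system tool created without max_result_limit.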

Type Inference

Fenic automatically infers Python types for tool parameters.
From DataFrame Context:
# String parameter inferred from String column
df.filter(F.col("name") == F.tool_param("name_filter"))

# Integer parameter inferred from Int64 column
df.filter(F.col("age") > F.tool_param("min_age"))

# Array parameter inferred from Array column
df.filter(F.col("tags").array_contains(F.tool_param("tag")))
From Allowed Values:
# Literal["A", "B", "C"] inferred from allowed_values
ToolParam(
    name="category",
    allowed_values=["A", "B", "C"]
)
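Building a Literal type from a list of allowed values is possible with the standard typing module alone. The helper literal_from_allowed_values below is hypothetical and only shows the general technique; Fenic's own schema generation may differ.

```python
from typing import Literal, get_args


def literal_from_allowed_values(allowed_values: list):
    """Hypothetical helper: build a Literal type from runtime values."""
    # Subscripting Literal with a tuple is equivalent to Literal[v1, v2, ...].
    return Literal[tuple(allowed_values)]


category_type = literal_from_allowed_values(["A", "B", "C"])
print(get_args(category_type))  # ('A', 'B', 'C')
```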
From Function Annotations:
def get_data(
    category: str,           # str parameter
    min_value: int,          # int parameter
    tags: List[str],         # list[str] parameter
    optional: Optional[str]  # Optional[str] parameter
) -> LogicalPlan:
    ...
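Reading a parameter schema from a function signature can be done with the standard inspect and typing modules. The sketch below is not Fenic's implementation: describe_parameters is a hypothetical helper, the return annotation is omitted so the example runs without Fenic installed, and a default of None is added to the last parameter to show how optional parameters appear.

```python
import inspect
from typing import List, Optional, get_type_hints


def get_data(
    category: str,
    min_value: int,
    tags: List[str],
    optional: Optional[str] = None,  # default added for this sketch
):
    ...


def describe_parameters(func) -> dict:
    """Hypothetical helper: collect each parameter's annotation and required flag."""
    hints = get_type_hints(func)
    described = {}
    for name, param in inspect.signature(func).parameters.items():
        described[name] = {
            "annotation": hints.get(name),
            # A parameter without a default must be supplied by the caller.
            "required": param.default is inspect.Parameter.empty,
        }
    return described


for name, info in describe_parameters(get_data).items():
    print(name, info)
```

Because the schema is derived from the signature, accurate annotations and defaults on the tool function are what give agents a precise description of each argument.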
