
Overview

HAI Build provides specialized commands for working with Jupyter notebooks in VS Code. Get AI assistance to generate, explain, and improve notebook cells while maintaining the context of your data science or analysis workflow.

Features

Generate Cells

Create new notebook cells with AI-generated code

Explain Cells

Get detailed explanations of what cells do

Improve Cells

Optimize and enhance existing cells

Setup

Ensure you have the necessary extensions:
1

Install Required Extensions

Make sure you have installed:
  1. HAI Build Code Generator (this extension)
  2. Jupyter Extension for VS Code (by Microsoft)
If the Jupyter extension is missing, VS Code typically prompts you to install it when you open a .ipynb file.
2

Open a Jupyter Notebook

Open any .ipynb file in VS Code. The notebook interface will activate with HAI Build commands available.
3

Verify HAI Build is Active

Look for HAI Build icons in:
  • Notebook toolbar (top of notebook)
  • Individual cell toolbars (when hovering over cells)

Generating Notebook Cells

Create new cells with AI-generated code based on your prompts.
1

Access Generate Command

Click the Generate Jupyter Cell icon (sparkle icon) in the notebook toolbar at the top.
Or use the Command Palette:
  1. Press Cmd+Shift+P (Mac) or Ctrl+Shift+P (Windows/Linux)
  2. Type Generate Jupyter Cell with HAI
  3. Press Enter
2

Enter Your Prompt

A prompt input box appears. Describe what you want the cell to do:
Load the CSV file 'sales_data.csv' and create a pandas DataFrame. 
Show the first 5 rows and basic statistics.
Press Enter to confirm or Esc to cancel.
3

Review Generated Cell

HAI Build:
  1. Analyzes your notebook context (existing cells, variables, imports)
  2. Generates appropriate code
  3. Inserts a new cell above or below the current cell
  4. Populates it with the generated code
The AI considers your existing notebook context, including imported libraries and defined variables.
4

Execute and Refine

  • Run the cell to test the generated code
  • Request improvements if needed
  • Iterate by generating additional cells

Generation Examples

Prompt:
Load the JSON file 'config.json' and extract the database connection settings
Generated Cell:
import json

with open('config.json', 'r') as f:
    config = json.load(f)

db_settings = config.get('database', {})
print(f"Database: {db_settings.get('host')}:{db_settings.get('port')}")
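
The generated cell above relies on `dict.get` with defaults, so a config file without a `database` section produces `None` values instead of raising. A minimal, self-contained sketch of that behavior follows; the helper name `load_db_settings`, the temporary files, and the host and port values are illustrative assumptions, not output from HAI Build.

```python
import json
import os
import tempfile


def load_db_settings(path):
    """Return the 'database' section of a JSON config, or {} if it is absent."""
    with open(path, "r") as f:
        config = json.load(f)
    return config.get("database", {})


# Write throwaway config files so the sketch runs anywhere (values are made up).
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w") as f:
        json.dump({"database": {"host": "localhost", "port": 5432}}, f)
    db_settings = load_db_settings(config_path)

    empty_path = os.path.join(tmp, "empty.json")
    with open(empty_path, "w") as f:
        json.dump({}, f)
    missing_settings = load_db_settings(empty_path)

print(f"Database: {db_settings.get('host')}:{db_settings.get('port')}")
print(f"Missing section: {missing_settings.get('host')}")
```

If silent `None` values are not acceptable in your notebook, ask for a version that raises when the section is missing.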

Explaining Notebook Cells

Get detailed explanations of what existing cells do.
1

Select a Cell

Click on the cell you want to understand.
2

Trigger Explain

Click the Explain Jupyter Cell icon (question mark icon) in the cell toolbar.
Or use the Command Palette:
  1. Press Cmd+Shift+P / Ctrl+Shift+P
  2. Type Explain Jupyter Cell with HAI
  3. Press Enter
3

View Explanation

The HAI Build chat opens with a detailed explanation including:
  • What the code does
  • How it works step-by-step
  • Purpose of key functions or methods
  • Expected inputs and outputs
  • Potential issues or considerations

Explanation Example

Cell Code:
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
AI Explanation:
This cell performs dimensionality reduction using Principal Component Analysis (PCA):

1. **StandardScaler**: Standardizes features by removing the mean and scaling 
   to unit variance. This is crucial because PCA is sensitive to the 
   scale of features.

2. **fit_transform**: Fits the scaler on X and transforms it in one step, 
   resulting in X_scaled where each feature has mean=0 and std=1.

3. **PCA(n_components=2)**: Creates a PCA instance that will reduce the 
   dimensionality to 2 components (principal components).

4. **fit_transform on PCA**: Finds the principal components from X_scaled 
   and transforms the data to the new 2-dimensional space.

Result: X_pca contains the data projected onto 2 principal components, 
making it suitable for 2D visualization while retaining as much variance as two components can capture.
Use Explain when reviewing notebooks from colleagues or learning new techniques.
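
To check the standardization claim in the explanation by hand, the sketch below scales a single feature using only the standard library. It assumes the population standard deviation, which is what `StandardScaler` uses; the function name `standardize` and the sample values are illustrative.

```python
import statistics


def standardize(values):
    """Scale one feature to mean 0 and unit variance (population std)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature has no spread; map it to zeros rather than divide by zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative data
scaled = standardize(feature)

print(round(statistics.fmean(scaled), 6))   # mean of the scaled feature
print(round(statistics.pstdev(scaled), 6))  # spread of the scaled feature
```

Running a check like this next to an AI explanation is a quick way to confirm that the explanation matches what the code does.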

Improving Notebook Cells

Optimize, enhance, or fix existing cells with AI assistance.
1

Select Cell to Improve

Click on the cell you want to enhance.
2

Trigger Improve Command

Click the Improve Jupyter Cell icon (lightbulb icon) in the cell toolbar.
Or use the Command Palette:
  1. Press Cmd+Shift+P / Ctrl+Shift+P
  2. Type Improve Jupyter Cell with HAI
  3. Press Enter
3

Describe Improvements

Enter what you want to improve:
Optimize this for better performance with large datasets
4

Review Improvements

HAI provides improved code in the chat. You can:
  • Review the suggested changes
  • Ask for different approaches
  • Accept and apply to your cell

Improvement Examples

Original Cell:
df = pd.read_csv('data.csv')
df['date'] = pd.to_datetime(df['date'])
result = df.groupby('category')['sales'].sum()
Improved Cell:
import os

import pandas as pd

# Fail early with a clear message if the input file is missing
if not os.path.exists('data.csv'):
    raise FileNotFoundError("data.csv not found")

try:
    df = pd.read_csv('data.csv')
    
    # Validate required columns and report only the ones that are absent
    required_cols = ['date', 'category', 'sales']
    missing_cols = [col for col in required_cols if col not in df.columns]
    if missing_cols:
        raise ValueError(f"Missing required columns: {missing_cols}")
    
    # Convert dates; unparseable values become NaT instead of raising
    df['date'] = pd.to_datetime(df['date'], errors='coerce')
    if df['date'].isna().any():
        print(f"Warning: {df['date'].isna().sum()} invalid dates found")
    
    # Total sales per category
    result = df.groupby('category')['sales'].sum()
    
except Exception as e:
    print(f"Error processing data: {e}")
    raise
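
The sample prompt in this walkthrough asks for better performance with large datasets. One direction such an improvement could take is to aggregate while streaming rows instead of loading the whole file first. The sketch below is an assumption about what that might look like, not actual HAI Build output: it uses the standard-library `csv` module and an in-memory sample with made-up rows, keeping the `category` and `sales` column names from the original cell. With pandas, the `chunksize` parameter of `read_csv` supports a similar pattern.

```python
import csv
import io
from collections import defaultdict


def sum_sales_by_category(lines):
    """Sum 'sales' per 'category' one row at a time, without holding all rows in memory."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row["category"]] += float(row["sales"])
    return dict(totals)


# Made-up sample standing in for data.csv; pass an open file object for real data.
sample = io.StringIO(
    "date,category,sales\n"
    "2024-01-01,books,10.5\n"
    "2024-01-02,games,20.0\n"
    "2024-01-03,books,4.5\n"
)
result = sum_sales_by_category(sample)
print(result)
```

Memory use here grows with the number of categories rather than the number of rows, which is the property to ask for when a file does not fit in memory.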

Best Practices for Notebook AI Assistance

Help HAI understand your notebook’s context:
Good prompts:
  • ✅ “Create a function to preprocess the text column removing special characters”
  • ✅ “Calculate correlation between features X1-X10 and the target variable”
  • ✅ “Generate a confusion matrix for the classifier stored in the ‘model’ variable”
Vague prompts:
  • ❌ “Make a chart”
  • ❌ “Process the data”
  • ❌ “Add some analysis”
Keep your notebook well-structured:
  1. Imports at top: Generate import cells first
  2. Data loading: Load data before processing
  3. Exploration: Analysis cells in logical order
  4. Modeling: Train/test/evaluate sequentially
  5. Visualization: Display results clearly
This helps HAI understand the notebook flow.
HAI works better when cells are executed:
  • Execute cells to define variables and imports
  • HAI can reference executed variables in generations
  • Kernel state helps determine what’s available
Run your notebook sequentially before generating new cells for best results.
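
Before writing a prompt, it can help to see which variables the kernel currently holds so you can name them exactly. The helper below is a hypothetical convenience, not part of HAI Build: it summarizes a namespace dictionary and skips private names, modules, callables, and IPython's `In`/`Out` history.

```python
import types


def describe_namespace(namespace):
    """Map each user-defined data variable to its type name."""
    skipped = {"In", "Out"}  # IPython input/output history
    summary = {}
    for name, value in namespace.items():
        if name.startswith("_") or name in skipped:
            continue
        if isinstance(value, types.ModuleType) or callable(value):
            continue
        summary[name] = type(value).__name__
    return summary


# Illustrative namespace; in a notebook cell, pass globals() instead.
example_namespace = {
    "df_rows": [1, 2, 3],
    "threshold": 0.5,
    "_tmp": None,
    "types": types,
    "len": len,
}
summary = describe_namespace(example_namespace)
print(summary)
```

The resulting names can be pasted into a prompt, for example when asking for a cell that operates on a specific DataFrame.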
Refine generated cells through conversation:
Workflow
1. Generate initial cell
2. Run and observe results
3. Use "Improve" to refine
4. Repeat until satisfied
Example:
"Generate a bar chart of sales by region"
→ Review generated chart
→ "Make it horizontal and sort by value descending"
→ Review improvements
→ "Add data labels on each bar"
→ Final version
For complex notebook tasks:
  1. Use Jupyter commands for quick cell operations
  2. Use HAI chat for multi-cell workflows or complex refactoring
  3. Select multiple cells and add to HAI chat for broader context

Common Notebook Workflows

Data Science Pipeline

1

Data Loading

Prompt
Load the dataset from 'experiment_data.csv' and display basic info
2

Exploration

Prompt
Create a summary of missing values and data types for each column
3

Visualization

Prompt
Generate distribution plots for all numerical features in a grid layout
4

Preprocessing

Prompt
Create a preprocessing pipeline that handles missing values, 
encodes categorical variables, and scales numerical features
5

Modeling

Prompt
Train a logistic regression model with cross-validation 
and display accuracy scores

Data Analysis Report

1

Import Libraries

Prompt
Import pandas, numpy, matplotlib, and seaborn with standard aliases
2

Load Data

Prompt
Load the quarterly sales data and parse date columns
3

Calculate Metrics

Prompt
Calculate year-over-year growth rates for each product category
4

Visualize Trends

Prompt
Create a multi-line chart showing sales trends for top 5 products
5

Summary Statistics

Prompt
Generate a formatted summary table of key metrics by region
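
For the year-over-year prompt in step 3, it helps to know the formula you expect the generated cell to implement: growth = (current year - previous year) / previous year. The sketch below spells that out in plain Python so the arithmetic is visible; the function name `yoy_growth`, the category names, and the sales totals are made up for illustration. In pandas, `pct_change` computes the same ratio between consecutive rows.

```python
def yoy_growth(sales_by_year):
    """Return {year: growth rate vs. the previous year} for consecutive years."""
    growth = {}
    years = sorted(sales_by_year)
    for prev, curr in zip(years, years[1:]):
        previous_sales = sales_by_year[prev]
        if curr - prev != 1 or previous_sales == 0:
            continue  # skip gaps in the data and undefined (divide-by-zero) rates
        growth[curr] = (sales_by_year[curr] - previous_sales) / previous_sales
    return growth


# Made-up totals per category and year.
sales = {
    "hardware": {2022: 100.0, 2023: 125.0, 2024: 150.0},
    "software": {2022: 200.0, 2023: 180.0},
}
growth_by_category = {cat: yoy_growth(by_year) for cat, by_year in sales.items()}
print(growth_by_category)
```

Stating how gaps and zero baselines should be handled in the prompt avoids surprises in the generated cell.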

Tips for Effective Jupyter AI Usage

Specify Libraries

Mention preferred libraries in your prompts:
  • “Use seaborn to create…”
  • “With scikit-learn, implement…”

Request Comments

Ask for documented code:
  • “Add comments explaining each step”
  • “Include docstrings”

Define Variables

Reference existing variables:
  • “Using the ‘df’ DataFrame…”
  • “Apply this to ‘X_train’ and ‘y_train’”

Set Expectations

Be clear about outputs:
  • “Return a pandas Series”
  • “Display results in a table format”

Troubleshooting

Issue: HAI Build icons don’t appear in the notebook
Solutions:
  1. Ensure the HAI Build extension is installed and enabled
  2. Verify that the Jupyter extension is active
  3. Reload the VS Code window: Cmd+Shift+P / Ctrl+Shift+P → “Reload Window”
  4. Check that the notebook type is jupyter-notebook
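
If the icons still do not appear, it can be worth ruling out a malformed file. The sketch below checks that the text of a .ipynb file is valid JSON with the top-level `nbformat` and `cells` keys used by the nbformat 4 layout. It does not inspect VS Code's notebook type, and the function name `check_notebook_json` is a hypothetical helper, not part of HAI Build or Jupyter.

```python
import json


def check_notebook_json(text):
    """Return a list of problems found in the text of a .ipynb file (empty if none)."""
    try:
        nb = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(nb, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    if not isinstance(nb.get("nbformat"), int):
        problems.append("missing or non-integer 'nbformat'")
    if not isinstance(nb.get("cells"), list):
        problems.append("missing 'cells' list")
    return problems


# A minimal empty notebook in the nbformat 4 layout.
minimal = json.dumps({"nbformat": 4, "nbformat_minor": 5, "metadata": {}, "cells": []})
print(check_notebook_json(minimal))
print(check_notebook_json("{not json"))
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to the function.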

Next Steps

Code Generation

Learn more about AI-powered code generation

Task Execution

Execute larger notebook development tasks

Settings

Configure LLM providers for notebook assistance

MCP Integration

Connect to data sources via Model Context Protocol
Data Science Tip: Use HAI Build to quickly prototype analysis workflows, then refine the code for production use. The AI excels at generating exploratory code and visualizations.
