Test Data Management

Why Separate Test Data?

Separating test data from test logic offers several key benefits:

Maintainability

Update data in one place without touching test code

Reusability

Share data across multiple tests and test suites

Security

Keep sensitive credentials out of source code

Scalability

Easily add new test scenarios by adding data

Test Data Approaches

Our test suite demonstrates three common approaches to test data management:

JSON Files
Environment Variables
Hardcoded (Not Recommended)

JSON Files

Best for: Structured, static test data that’s reused across tests

testData/users.json

{
  "validUser": { "username": "standard_user", "password": "secret_sauce" },
  "invalidUser": { "username": "locked_out_user", "password": "secret_sauce" }
}

JSON files are ideal for maintaining multiple test scenarios with different data sets.

Environment Variables

Best for: Sensitive data like credentials, API keys, or environment-specific URLs

.env

USERNAME=standard_user
PASSWORD=secret_sauce
BASE_URL=https://www.saucedemo.com

Always add .env to .gitignore to prevent committing sensitive data!

Hardcoded (Not Recommended)

Avoid when possible: Data embedded directly in test code

def test_login(page):
    # Data hardcoded in test
    username = "standard_user"
    password = "secret_sauce"
    page.goto("https://www.saucedemo.com/")

Problems:

Hard to maintain
Data duplicated across tests
Can’t easily test different scenarios

JSON Test Data

Structure and Organization

Let’s examine our actual test data file:

testData/users.json

{
  "validUser": { "username": "standard_user", "password": "secret_sauce" },
  "invalidUser": { "username": "locked_out_user", "password": "secret_sauce" }
}

Data Structure

The JSON uses a keyed dictionary approach:

Top-level keys: validUser, invalidUser
Each key contains an object with related fields
Easy to access specific test scenarios: users["validUser"]

Benefits:

Self-documenting (key names explain the data)
Easy to extend with new user types
Direct key-based access in tests (a misspelled key fails immediately with KeyError)

Alternative Structures

You could also organize data as arrays:

{
  "users": [
    {"type": "valid", "username": "standard_user", "password": "secret_sauce"},
    {"type": "invalid", "username": "locked_out_user", "password": "secret_sauce"}
  ]
}

Or grouped by test type:

{
  "login": {
    "valid": {"username": "standard_user", "password": "secret_sauce"},
    "invalid": {"username": "locked_out_user", "password": "secret_sauce"}
  },
  "products": [
    {"id": "sauce-labs-backpack", "name": "Sauce Labs Backpack"},
    {"id": "sauce-labs-bike-light", "name": "Sauce Labs Bike Light"}
  ]
}
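With the array-based layout, a test cannot reach a record by name directly, so some lookup step is needed. The following is a minimal sketch of one way to do that; index_by_type is a hypothetical helper (not part of the test suite), and the JSON is inlined only to keep the example self-contained.

```python
import json

# Same records as the array-based layout, inlined for a self-contained sketch.
RAW = """
{
  "users": [
    {"type": "valid", "username": "standard_user", "password": "secret_sauce"},
    {"type": "invalid", "username": "locked_out_user", "password": "secret_sauce"}
  ]
}
"""


def index_by_type(users):
    """Hypothetical helper: turn a list of records into a dict keyed by "type"."""
    return {user["type"]: user for user in users}


users_by_type = index_by_type(json.loads(RAW)["users"])
print(users_by_type["valid"]["username"])
```

Indexing once keeps the tests as readable as with the keyed layout, at the cost of one extra step in the fixture.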

Loading JSON Data with Fixtures

Here’s how our test suite loads JSON data:

tests/conftest.py

import json
from pathlib import Path
import pytest

@pytest.fixture(scope="session")
def users():
    """
    Loads testData/users.json and returns a dict.
    Accessible in any test as the 'users' fixture.
    """
    root = Path(__file__).parent.parent  # go from /tests to project root
    data_path = root / "testData" / "users.json"
    with data_path.open(encoding="utf-8") as f:
        return json.load(f)

Locate the Data File

root = Path(__file__).parent.parent  # Navigate to project root
data_path = root / "testData" / "users.json"

Uses Path for cross-platform compatibility

Open and Parse

with data_path.open(encoding="utf-8") as f:
    return json.load(f)

Reads the file and parses the JSON into a Python dictionary

Use in Tests

def test_with_testdata(page, users):
    # users is automatically provided by pytest
    username = users["validUser"]["username"]
    password = users["validUser"]["password"]

Using JSON Data in Tests

Real example from our test suite:

tests/test_functionalities.py

from pages.login import LoginPage

def test_with_testdata(page, users):
    login_page = LoginPage(page)
    login_page.navigate()
    login_page.login(users["validUser"]["username"], users["validUser"]["password"])

    assert page.get_by_test_id("title").is_visible()

How it works:

Test declares users parameter
pytest calls the users fixture from conftest.py
Fixture loads and parses testData/users.json
Test accesses nested data: users["validUser"]["username"]
Values passed to page object methods

The fixture uses scope="session" so the JSON file is only loaded once for all tests, improving performance.
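One consequence of session scope is that every test receives the same dict object, so a test that modifies it would affect the tests that run afterwards. The sketch below shows one possible guard, assuming a hypothetical helper named fresh_users that a function-scoped fixture could call; neither the helper nor the stand-in data constant is part of the suite.

```python
import copy

# Stand-in for the dict returned by the session-scoped users fixture.
SESSION_USERS = {
    "validUser": {"username": "standard_user", "password": "secret_sauce"},
}


def fresh_users(shared):
    """Hypothetical helper: return an independent deep copy of the shared data."""
    return copy.deepcopy(shared)


# A test can now change its copy without touching the shared data.
mine = fresh_users(SESSION_USERS)
mine["validUser"]["password"] = "wrong"
```

If tests only ever read the data, the extra copy is unnecessary and the plain session-scoped fixture is enough.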

Environment Variables

Setting Up Environment Variables

.env File
CI/CD Environment
Shell Export

.env File

Create a .env file in your project root:

.env

USERNAME=standard_user
PASSWORD=secret_sauce
BASE_URL=https://www.saucedemo.com
API_KEY=your_secret_api_key_here

Security Best Practice: Always add .env to your .gitignore:

.gitignore

.env
.env.local
*.env

CI/CD Environment

In CI/CD pipelines, set environment variables through the platform.

GitHub Actions:

.github/workflows/test.yml

env:
  USERNAME: ${{ secrets.TEST_USERNAME }}
  PASSWORD: ${{ secrets.TEST_PASSWORD }}

GitLab CI:

.gitlab-ci.yml

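variables:
  # Acceptable for public demo accounts only. Store real secrets as CI/CD
  # variables in the project settings instead of committing them here.
  USERNAME: "standard_user"
  PASSWORD: "secret_sauce"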

Shell Export

For local testing without a .env file:

export USERNAME="standard_user"
export PASSWORD="secret_sauce"
pytest tests/

Or inline:

USERNAME=standard_user PASSWORD=secret_sauce pytest tests/

Loading Environment Variables

Our conftest.py demonstrates best practices:

tests/conftest.py

import os
from pathlib import Path
import pytest
from dotenv import load_dotenv

# Load .env file at module level
load_dotenv(dotenv_path=Path(__file__).parent.parent / ".env")

@pytest.fixture(scope="session")
def creds():
    """
    Provides credentials from environment variables.
    Fails early with a clear message if missing.
    """
    user = os.getenv("USERNAME")
    pwd = os.getenv("PASSWORD")
    if not user or not pwd:
        raise RuntimeError("Missing USERNAME/PASSWORD in environment.")
    return {"valid_user": user, "pwd": pwd}

Loading .env File

from dotenv import load_dotenv
load_dotenv(dotenv_path=Path(__file__).parent.parent / ".env")

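Called at module level (runs when conftest.py is imported)
Loads variables from .env into os.environ
Explicit path ensures the correct file is loaded regardless of the working directory
By default, load_dotenv does not override variables that are already set, so an existing USERNAME in the shell (Windows defines one for the logged-in user) takes precedence over the .env value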

Accessing Variables

user = os.getenv("USERNAME")
pwd = os.getenv("PASSWORD")

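os.getenv() returns None if the variable is not set
os.environ["KEY"] raises KeyError instead; getenv is used here so the fixture can raise its own, more descriptive error
A default can be supplied as the second argument: os.getenv("API_URL", "https://api.example.com")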
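The three behaviours listed above can be checked with a short, standard-library-only sketch. DEMO_MISSING_VAR is a made-up variable name used purely for illustration, and it is removed first so the example does not depend on the local environment.

```python
import os

# DEMO_MISSING_VAR is an illustrative name; make sure it is really unset.
os.environ.pop("DEMO_MISSING_VAR", None)

# 1. getenv without a default yields None for an unset variable.
missing = os.getenv("DEMO_MISSING_VAR")

# 2. getenv with a second argument yields that default instead.
fallback = os.getenv("DEMO_MISSING_VAR", "https://api.example.com")

# 3. Indexing os.environ raises KeyError for an unset variable.
try:
    os.environ["DEMO_MISSING_VAR"]
    raised = False
except KeyError:
    raised = True

print(missing, fallback, raised)
```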

Error Handling

if not user or not pwd:
    raise RuntimeError("Missing USERNAME/PASSWORD in environment.")

Why fail early?

Tests won’t run with invalid setup
Clear error message for developers
Prevents cryptic failures later in test execution
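As the number of required variables grows, the same fail-early idea can name exactly which ones are missing. The following is a sketch under assumptions, not the suite's implementation: require_env is a hypothetical helper, and its environ parameter exists only so the example can run without touching the real environment.

```python
import os


def require_env(*names, environ=None):
    """Hypothetical helper: return the named variables or raise one clear error."""
    env = os.environ if environ is None else environ
    # Treat unset and empty values the same way, as the creds fixture does.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Demo with an explicit mapping instead of the real environment.
demo_env = {"USERNAME": "standard_user", "PASSWORD": "secret_sauce"}
values = require_env("USERNAME", "PASSWORD", environ=demo_env)
print(values["USERNAME"])
```

Reporting every missing name at once saves a developer from fixing the setup one variable at a time.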

Return Structure

return {"valid_user": user, "pwd": pwd}

Returns a dictionary for consistent access pattern:

# In test
creds["valid_user"]  # Present whenever the fixture returns (it raises otherwise)
creds["pwd"]         # Same guarantee for the password

Using Environment Variables in Tests

tests/test_functionalities.py

from pages.login import LoginPage

def test_login_with_env_vars(page, creds):
    login_page = LoginPage(page)
    login_page.navigate()
    login_page.login(creds["valid_user"], creds["pwd"])

    assert page.get_by_test_id("title").is_visible()

Flow:

Test requests creds fixture
Fixture reads from environment variables
If missing, raises RuntimeError immediately
If found, returns dictionary with credentials
Test uses credentials without knowing their source

This approach allows the same test to work in different environments (local, CI/CD, staging) by simply changing environment variables.

Comparison: JSON vs Environment Variables

JSON Test Data
Environment Variables

JSON Test Data

def test_with_testdata(page, users):
    login_page = LoginPage(page)
    login_page.navigate()
    login_page.login(
        users["validUser"]["username"],
        users["validUser"]["password"]
    )
    assert page.get_by_test_id("title").is_visible()

Pros:

Structured data with multiple scenarios
Easy to add new test cases
Version controlled (safe to commit)
Great for data-driven testing

Cons:

Not suitable for secrets
Less flexible across environments

Environment Variables

def test_login_with_env_vars(page, creds):
    login_page = LoginPage(page)
    login_page.navigate()
    login_page.login(creds["valid_user"], creds["pwd"])
    assert page.get_by_test_id("title").is_visible()

Pros:

Secure (not in source control)
Environment-specific configuration
Easy to change without code changes
Standard practice for secrets

Cons:

Harder to manage many values
Less structured than JSON
Requires setup in each environment

Advanced Patterns

Parametrized Tests with Data
Dynamic Product Data
Combining Both Approaches
Test Data Builders

Parametrized Tests with Data

Use @pytest.mark.parametrize for data-driven testing:

tests/test_login.py

import pytest
from playwright.sync_api import Page

# Each tuple is one (username, password) pair; the test runs once per tuple
@pytest.mark.parametrize("username, password", [
    ("error_user", "secret_sauce"),
    ("performance_glitch_user", "secret_sauce"),
    ("visual_user", "secret_sauce"),
])
def test_multiple_users(page: Page, username, password):
    page.goto("https://www.saucedemo.com/")

    username_input = page.get_by_placeholder("Username")
    username_input.fill(username)

    password_input = page.get_by_placeholder("Password")
    password_input.fill(password)

    login_button = page.locator("input#login-button")
    login_button.click()

    # is_visible is a method; without the call the assertion is always true
    assert page.get_by_test_id("title").is_visible()

The test runs three times, once for each parameter set.
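The parameter sets do not have to be typed inline; they could also be derived from the JSON data. Because parametrize arguments are evaluated when the module is imported, fixtures are not available at that point, so the data has to be loaded at module level. The sketch below assumes a hypothetical helper login_cases and inlines the JSON to stay self-contained; only the standard library is needed to build the cases.

```python
import json

# Inlined copy of testData/users.json so the sketch runs on its own.
RAW = """
{
  "validUser": { "username": "standard_user", "password": "secret_sauce" },
  "invalidUser": { "username": "locked_out_user", "password": "secret_sauce" }
}
"""


def login_cases(users):
    """Hypothetical helper: build (ids, argvalues) for pytest.mark.parametrize."""
    ids = list(users)  # scenario names become readable test ids
    argvalues = [(u["username"], u["password"]) for u in users.values()]
    return ids, argvalues


CASE_IDS, CASES = login_cases(json.loads(RAW))

# In a test module this could be used as:
# @pytest.mark.parametrize("username, password", CASES, ids=CASE_IDS)
# def test_login_from_json(page, username, password): ...
```

Adding a new scenario then only requires a new entry in the JSON file.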

Dynamic Product Data

In our CartPage, product names are dynamic:

pages/cart_page.py

def add_product(self, product_name: str):
    add_button = self.page.locator(f"#add-to-cart-{product_name}")
    add_button.click()

def remove_product(self, product_name: str):
    remove_button = self.page.locator(f"#remove-{product_name}")
    remove_button.click()

Used in tests:

tests/test_functionalities.py

cart_page = CartPage(page)
cart_page.add_product("sauce-labs-backpack")
cart_page.add_product("sauce-labs-bike-light")

This could be improved by moving the product IDs into JSON data:

{
  "products": [
    {"id": "sauce-labs-backpack", "name": "Sauce Labs Backpack"},
    {"id": "sauce-labs-bike-light", "name": "Sauce Labs Bike Light"}
  ]
}
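A rough sketch of how a test might consume that product data follows. FakeCart is a stand-in invented for this example so it can run without a browser; it only mirrors the add_product signature of CartPage and records the calls.

```python
import json

# Inlined product data matching the JSON structure shown above.
RAW = """
{
  "products": [
    {"id": "sauce-labs-backpack", "name": "Sauce Labs Backpack"},
    {"id": "sauce-labs-bike-light", "name": "Sauce Labs Bike Light"}
  ]
}
"""


class FakeCart:
    """Illustrative stand-in for CartPage: records calls instead of clicking."""

    def __init__(self):
        self.added = []

    def add_product(self, product_name):
        self.added.append(product_name)


cart = FakeCart()
for product in json.loads(RAW)["products"]:
    # The "id" field is what the CartPage locator interpolates.
    cart.add_product(product["id"])

print(cart.added)
```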

Combining Both Approaches

Use JSON for data, environment variables for config:

import json
import os
from pathlib import Path

import pytest

@pytest.fixture(scope="session")
def test_config():
    # Each setting falls back to a default when the variable is unset
    return {
        "base_url": os.getenv("BASE_URL", "https://www.saucedemo.com"),
        "timeout": int(os.getenv("TIMEOUT", "30000")),
        "headless": os.getenv("HEADLESS", "true").lower() == "true"
    }

@pytest.fixture(scope="session")
def test_users():
    data_path = Path(__file__).parent.parent / "testData" / "users.json"
    with data_path.open(encoding="utf-8") as f:
        return json.load(f)

def test_configurable_login(page, test_config, test_users):
    page.goto(test_config["base_url"])
    # Use both fixtures together

Test Data Builders

Create helper functions to generate test data:

def build_user(username="standard_user", password="secret_sauce"):
    # Defaults live in the signature, so explicit values such as "" are kept
    return {
        "username": username,
        "password": password
    }

def test_with_builder(page):
    # Use default values
    user = build_user()
    
    # Or override specific fields
    invalid_user = build_user(password="wrong")
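The builder idea can also be expressed with a dataclass, which gives named fields and a copy-with-changes operation for free. This is an alternative sketch rather than the suite's approach; the User class is hypothetical.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class User:
    """Hypothetical test-data record with the suite's default credentials."""

    username: str = "standard_user"
    password: str = "secret_sauce"


default_user = User()

# replace() copies the record and overrides only the named fields.
wrong_password = replace(default_user, password="wrong")
empty_username = replace(default_user, username="")

print(asdict(wrong_password))
```

Because the record is frozen, a test cannot accidentally mutate a shared default; it has to create a modified copy.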

Best Practices

Choose the Right Storage Method

Data Type                          Storage Method
User credentials (test accounts)   JSON file (if not sensitive) or Environment Variables
API keys, tokens                   Environment Variables only
Product catalogs                   JSON file
Test URLs                          Environment Variables (environment-specific)
Form field values                  JSON file
Configuration flags                Environment Variables

Structure JSON Logically

Good: grouped by purpose
{
  "users": {
    "valid": {"username": "standard_user", "password": "secret_sauce"},
    "locked": {"username": "locked_out_user", "password": "secret_sauce"}
  },
  "products": {
    "backpack": {"id": "sauce-labs-backpack", "price": 29.99},
    "bikeLight": {"id": "sauce-labs-bike-light", "price": 9.99}
  }
}

Avoid: flat structure
{
  "validUsername": "standard_user",
  "validPassword": "secret_sauce",
  "backpackId": "sauce-labs-backpack",
  "backpackPrice": 29.99
}

Validate Data Early

@pytest.fixture(scope="session")
def users():
    root = Path(__file__).parent.parent
    data_path = root / "testData" / "users.json"
    
    if not data_path.exists():
        raise FileNotFoundError(f"Test data not found: {data_path}")
    
    with data_path.open(encoding="utf-8") as f:
        data = json.load(f)
    
    # Validate structure
    required_keys = ["validUser", "invalidUser"]
    for key in required_keys:
        if key not in data:
            raise ValueError(f"Missing required key in users.json: {key}")
    
    return data
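The check above stops at the top-level keys. A possible extension, sketched here under assumptions, also verifies that each entry carries non-empty username and password strings and collects every problem before reporting. validate_users and its parameters are hypothetical names, not part of the suite.

```python
def validate_users(
    data,
    required_keys=("validUser", "invalidUser"),
    required_fields=("username", "password"),
):
    """Hypothetical helper: return a list of problems; an empty list means valid."""
    problems = []
    for key in required_keys:
        entry = data.get(key)
        if not isinstance(entry, dict):
            problems.append(f"{key}: missing or not an object")
            continue
        for field in required_fields:
            value = entry.get(field)
            if not isinstance(value, str) or not value:
                problems.append(f"{key}.{field}: missing or empty")
    return problems


# Demo: one entry lacks a password and the other entry is absent entirely.
problems = validate_users({"validUser": {"username": "standard_user"}})
print(problems)
```

A fixture could raise ValueError with the joined messages whenever the returned list is not empty.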

Document Data Structure

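Standard JSON has no comment syntax (Python’s json module rejects comments), so document the structure in fixture docstrings, or switch to JSON5 with a JSON5-capable parser if you need inline comments: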

@pytest.fixture(scope="session")
def users():
    """
    Loads testData/users.json and returns a dict.
    
    Expected structure:
    {
        "validUser": {
            "username": str,
            "password": str
        },
        "invalidUser": {
            "username": str,
            "password": str
        }
    }
    """
    # implementation

Use Type Hints

import json
from pathlib import Path
from typing import Dict

import pytest

@pytest.fixture(scope="session")
def users() -> Dict[str, Dict[str, str]]:
    root = Path(__file__).parent.parent
    data_path = root / "testData" / "users.json"
    with data_path.open(encoding="utf-8") as f:
        return json.load(f)
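For a more precise annotation than a nested Dict, typing.TypedDict can describe the exact keys of the file. The sketch below uses hypothetical class names (UserRecord, UsersFile) and a hypothetical parse_users function. TypedDict is not enforced at runtime; it only helps static type checkers and editors.

```python
import json
from typing import TypedDict


class UserRecord(TypedDict):
    """Shape of one entry in users.json."""

    username: str
    password: str


class UsersFile(TypedDict):
    """Shape of the whole users.json document."""

    validUser: UserRecord
    invalidUser: UserRecord


def parse_users(text: str) -> UsersFile:
    # json.loads returns plain dicts; the annotation is a promise to the
    # type checker, not a runtime validation.
    return json.loads(text)


users = parse_users(
    '{"validUser": {"username": "standard_user", "password": "secret_sauce"},'
    ' "invalidUser": {"username": "locked_out_user", "password": "secret_sauce"}}'
)
print(users["validUser"]["username"])
```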

Project Structure Example

source/
├── .env                      # Environment variables (DO NOT COMMIT)
├── .env.example              # Template for .env
├── .gitignore                # Include .env here
├── pages/
│   ├── login.py
│   └── cart_page.py
├── testData/
│   ├── users.json           # User test data
│   ├── products.json        # Product catalog
│   └── forms.json           # Form field data
└── tests/
    ├── conftest.py          # Fixtures for loading data
    ├── test_login.py
    └── test_functionalities.py

Keep test data files in a dedicated testData/ directory separate from test code for better organization.
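With several files in testData/, one small loader can serve all of them instead of repeating the path logic in every fixture. This is a sketch only: load_test_data is a hypothetical helper, and the demo writes a throwaway file to a temporary directory rather than reading the real project tree.

```python
import json
import tempfile
from pathlib import Path


def load_test_data(name, data_dir):
    """Hypothetical helper: load <data_dir>/<name>.json and return the parsed data."""
    path = Path(data_dir) / f"{name}.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)


# Demo against a temporary directory standing in for testData/.
with tempfile.TemporaryDirectory() as tmp:
    sample = '{"products": [{"id": "sauce-labs-backpack"}]}'
    (Path(tmp) / "products.json").write_text(sample, encoding="utf-8")
    products = load_test_data("products", tmp)

print(products["products"][0]["id"])
```

In conftest.py, each fixture could then call the helper with the project's testData directory and the file name it needs.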

Next Steps

Page Object Model

Fixtures
