Working with Data Types

Overview

This guide provides recommendations for working with common data types in Metadb. Following these practices ensures consistency, accuracy, and compatibility across your analytics environment.

Text and Character Data

Use TEXT for Strings

The TEXT data type is recommended for strings of characters without length constraints.

CREATE TABLE maria.books (
    title TEXT,
    author TEXT,
    description TEXT
);

PostgreSQL’s TEXT type has no performance penalty compared to VARCHAR and provides maximum flexibility.

VARCHAR for Legacy Compatibility

Use VARCHAR(n) only when working with older database clients that require explicit length specifications:

CREATE TABLE legacy_app.users (
    username VARCHAR(50),
    email VARCHAR(255)
);

Comparison

Type	Use Case	Length Limit
`TEXT`	Recommended for all new development	Unlimited (up to 1GB)
`VARCHAR(n)`	Legacy applications requiring fixed length	Specified by `n`
`VARCHAR`	Equivalent to `TEXT`	Unlimited
`CHAR(n)`	Fixed-length fields (avoid if possible)	Exactly `n` characters

TEXT, VARCHAR, and CHAR all use the same internal storage mechanism in PostgreSQL, so there’s no performance difference.

Numeric and Monetary Data

Monetary Amounts

For monetary values, use NUMERIC(19, 4) to avoid floating-point rounding errors:

CREATE TABLE finance.transactions (
    transaction_id UUID PRIMARY KEY,
    amount NUMERIC(19, 4),
    currency_code CHAR(3)
);

Why NUMERIC(19, 4)?

19 total digits (sufficient for large amounts)
4 decimal places (handles sub-cent precision)
Exact decimal representation (no rounding errors)

Exchange Rates

For currency exchange rates, use NUMERIC(19, 14) for higher precision:

CREATE TABLE finance.exchange_rates (
    from_currency CHAR(3),
    to_currency CHAR(3),
    rate NUMERIC(19, 14),
    effective_date DATE
);

Example: Calculating with Monetary Data

SELECT 
    transaction_id,
    amount,
    amount * 1.0825 AS amount_with_tax,
    ROUND(amount * 1.0825, 2) AS amount_with_tax_rounded
FROM finance.transactions;

Never use REAL, FLOAT, or DOUBLE PRECISION for monetary data. These floating-point types introduce rounding errors that are unacceptable for financial calculations.

Numeric Type Comparison

Type	Precision	Use Case
`NUMERIC(19, 4)`	Exact, 4 decimal places	Monetary amounts
`NUMERIC(19, 14)`	Exact, 14 decimal places	Exchange rates
`INTEGER`	Exact, no decimals	Counts, IDs
`BIGINT`	Exact, larger range	Large counts
`REAL`	Approximate, 6 decimal digits	❌ Avoid for money
`DOUBLE PRECISION`	Approximate, 15 decimal digits	❌ Avoid for money

Date and Time Data

Querying Date Ranges

The most reliable method for querying within date ranges uses the inequality pattern start <= t < end:

SELECT *
FROM folio_circulation.loan__t
WHERE '2023-01-01' <= loan_date AND loan_date < '2024-01-01';

Use inclusive start with <=

The start date is included: '2023-01-01' <= loan_date

Use exclusive end with <

The end date is excluded: loan_date < '2024-01-01'

Combine with AND

Both conditions together: WHERE start <= date AND date < end

Why this pattern?

Handles timestamps precisely without worrying about time components
Prevents off-by-one errors at boundaries
Works correctly with DATE and TIMESTAMP columns
Aligns with interval semantics [start, end)

Date Range Examples

-- All loans in 2023
SELECT *
FROM folio_circulation.loan__t
WHERE '2023-01-01' <= loan_date AND loan_date < '2024-01-01';

Avoid Common Date Pitfalls

Don’t use BETWEEN for date ranges with timestamps:

-- ❌ Problematic: Includes all of 2024-01-01
WHERE loan_date BETWEEN '2023-01-01' AND '2024-01-01'

-- ✅ Correct: Excludes 2024-01-01
WHERE '2023-01-01' <= loan_date AND loan_date < '2024-01-01'

BETWEEN is inclusive on both ends, which can include timestamps from the excluded day.

Date Type Recommendations

Type	Use Case	Example
`DATE`	Calendar dates without time	`2023-01-01`
`TIMESTAMP`	Date and time without timezone	`2023-01-01 14:30:00`
`TIMESTAMPTZ`	Date and time with timezone (recommended)	`2023-01-01 14:30:00+00`
`TIME`	Time of day without date	`14:30:00`
`INTERVAL`	Duration or time span	`'2 days 3 hours'`

Prefer TIMESTAMPTZ (timestamp with time zone) over TIMESTAMP for columns that track when events occurred. This ensures correct handling across time zones.

UUID (Universally Unique Identifiers)

Metadb uses UUIDs extensively for identifiers.

Working with UUIDs

-- UUIDs in WHERE clauses
SELECT *
FROM library.patron
WHERE id = 'a1b2c3d4-e5f6-4789-0abc-def123456789'::UUID;

-- Generating UUIDs
INSERT INTO maria.custom_items (id, name)
VALUES (gen_random_uuid(), 'New Item');

The __id column in Metadb tables is a BIGINT surrogate key, not a UUID. Don’t confuse it with id columns from source data, which may be UUIDs.

Boolean Data

PostgreSQL supports native boolean types:

-- Querying boolean columns
SELECT *
FROM library.patrongroup__
WHERE __current;  -- Equivalent to: WHERE __current = true

-- Boolean expressions
SELECT 
    id,
    groupname,
    __current AS is_current,
    NOT __current AS is_historical
FROM library.patrongroup__;

Boolean Values

Literal	Type	Notes
`TRUE`, `true`, `'t'`, `'true'`, `'y'`, `'yes'`, `'1'`	True
`FALSE`, `false`, `'f'`, `'false'`, `'n'`, `'no'`, `'0'`	False
`NULL`	Unknown	Three-valued logic

JSON and JSONB

Metadb often works with JSON data from source systems.

JSONB vs JSON

Always use JSONB (binary JSON) rather than JSON:

Feature	JSONB	JSON
Storage	Binary, decomposed	Text, exact copy
Indexing	Supported	Limited
Query performance	Fast	Slower
Whitespace preservation	No	Yes
Key order preservation	No	Yes

Use JSONB unless you specifically need to preserve exact formatting (rare in analytics).

Querying JSON Data

-- Extract text value
SELECT jsonb_extract_path_text(jsondata, 'groupname') AS groupname
FROM library.patrongroup;

-- Extract JSON object
SELECT jsondata -> 'address' AS address
FROM library.patron;

-- Extract and cast
SELECT (jsondata ->> 'age')::INTEGER AS age
FROM library.patron;

-- Query nested values
SELECT jsondata -> 'address' ->> 'city' AS city
FROM library.patron;

Consider Transformed Tables

Instead of repeatedly extracting JSON fields, use transformed tables for better performance and simpler queries.

Array Types

PostgreSQL supports array types for any data type:

CREATE TABLE maria.reading_lists (
    list_id INTEGER,
    patron_id UUID,
    book_ids UUID[],
    tags TEXT[]
);

-- Querying arrays
SELECT *
FROM maria.reading_lists
WHERE 'fiction' = ANY(tags);

-- Array aggregation
SELECT 
    patron_id,
    array_agg(book_id) AS all_books
FROM maria.reading_history
GROUP BY patron_id;

Best Practices Summary

Text Data

Use TEXT for strings; VARCHAR(n) only for legacy compatibility

Money

Use NUMERIC(19, 4) for amounts; NUMERIC(19, 14) for exchange rates

Dates

Use start <= date AND date < end pattern for ranges

JSON

Prefer JSONB over JSON; consider transformed tables

Never use floating-point types (REAL, FLOAT, DOUBLE PRECISION) for monetary data. Always use NUMERIC with appropriate precision.

Additional Resources

Querying Data Guide - Learn about table types and metadata columns
PostgreSQL Data Types Documentation - Complete PostgreSQL type reference

Get Started

Core Concepts

User Guide

Administration

Working with Data Types

Overview

Text and Character Data

Use TEXT for Strings

VARCHAR for Legacy Compatibility

Comparison

Numeric and Monetary Data

Monetary Amounts

Exchange Rates

Example: Calculating with Monetary Data

Numeric Type Comparison

Date and Time Data

Querying Date Ranges

Date Range Examples

Avoid Common Date Pitfalls

Date Type Recommendations

UUID (Universally Unique Identifiers)

Working with UUIDs

Boolean Data

Boolean Values

JSON and JSONB

JSONB vs JSON

Querying JSON Data

Consider Transformed Tables

Array Types

Best Practices Summary

Text Data

Money

Dates

JSON

Additional Resources

Build docs developers (and LLMs) love

Get Started

Core Concepts

User Guide

Administration

​Overview

​Text and Character Data

​Use TEXT for Strings

​VARCHAR for Legacy Compatibility

​Comparison

​Numeric and Monetary Data

​Monetary Amounts

​Exchange Rates

​Example: Calculating with Monetary Data

​Numeric Type Comparison

​Date and Time Data

​Querying Date Ranges

​Date Range Examples

​Avoid Common Date Pitfalls

​Date Type Recommendations

​UUID (Universally Unique Identifiers)

​Working with UUIDs

​Boolean Data

​Boolean Values

​JSON and JSONB

​JSONB vs JSON

​Querying JSON Data

​Consider Transformed Tables

​Array Types

​Best Practices Summary

Text Data

Money

Dates

JSON

​Additional Resources

Build docs developers (and LLMs) love

Overview

Text and Character Data

Use TEXT for Strings

VARCHAR for Legacy Compatibility

Comparison

Numeric and Monetary Data

Monetary Amounts

Exchange Rates

Example: Calculating with Monetary Data

Numeric Type Comparison

Date and Time Data

Querying Date Ranges

Date Range Examples

Avoid Common Date Pitfalls

Date Type Recommendations

UUID (Universally Unique Identifiers)

Working with UUIDs

Boolean Data

Boolean Values

JSON and JSONB

JSONB vs JSON

Querying JSON Data

Consider Transformed Tables

Array Types

Best Practices Summary

Additional Resources