Overview
This page contains a complete, working example that demonstrates the entire workflow for retrieving data from the GSS StatsBank API, from initial setup through data retrieval and transformation.
Complete Script
Here is the full code from start to finish. You can copy and paste this directly into your R script to retrieve the data.
Expected Output
Step-by-Step Breakdown
Step 1: Load Required Packages
- httr2: Handles HTTP requests and responses
- tidyverse: Provides data manipulation tools (includes purrr, readr, tidyr, dplyr)
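A minimal sketch of this step, assuming both packages are already installed:

```r
library(httr2)      # HTTP requests and responses
library(tidyverse)  # attaches purrr, readr, tidyr, dplyr (among others)
```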
Step 2: Define Helper Function
- Automates navigation through the API hierarchy
- Detects whether you’ve reached a table or folder
- Displays available options at each level
- Returns a reusable request object
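The helper's actual code is not reproduced on this page, so the following is only a hypothetical stand-in that follows the four points above. The name build_url_sketch(), its arguments, and the assumed response shapes (a JSON array of entries with id and text for a folder, a JSON object with a variables array for a table, as in PxWeb-style APIs) are all assumptions.

```r
library(httr2)

# Hypothetical stand-in for the page's helper (its real body is not shown here).
# `fetch` is injectable so the logic can be tried without network access.
build_url_sketch <- function(req, segment = NULL,
                             fetch = function(r) resp_body_json(req_perform(r))) {
  # Step one level deeper when a folder or table name is supplied
  if (!is.null(segment)) req <- req_url_path_append(req, segment)

  node <- fetch(req)

  if (!is.null(node$variables)) {
    # Assumed table metadata: a JSON object with a `variables` array
    message("Table reached: ", node$title)
    for (v in node$variables) {
      message("  variable ", v$code, " (", length(v$values), " values)")
    }
  } else {
    # Assumed folder listing: a JSON array of child entries
    message("Folder with ", length(node), " entries:")
    for (child in node) {
      message("  ", child$id, " - ", child$text)
    }
  }

  invisible(req)  # return the request so calls can be chained
}
```

Because the function returns the request it was given (with the new path segment appended), successive calls can be piped together, one per level of the hierarchy.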
Step 3: Connect to API and Navigate
- Establishes base API URL
- Navigates to the specific table: Water Disposal data from the 2021 Population and Housing Census
- Saves the request object for later use
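As a rough sketch of this step, the request object can be built with httr2 directly. The base URL and path segments below are placeholders (assumptions), not the real StatsBank addresses; substitute the values that apply to the table you want.

```r
library(httr2)

# Placeholder values (assumptions): replace with the real base URL and path
base_url <- "https://example.org/api/v1/en"

table_req <- request(base_url) |>
  req_url_path_append("DATABASE_NAME", "FOLDER_NAME", "TABLE_NAME.px")

# The saved request can now be reused for metadata lookups and data queries
table_req$url
```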
Step 4: Construct Query
- Defines which data to retrieve
- Selects two water disposal categories
- Requests data for all geographic areas
- Specifies CSV format for response
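A query of this kind can be sketched as a nested R list that is later serialized to JSON. The layout assumes a PxWeb-style API (a query array of code/selection pairs plus a response format), and the variable codes and category names are placeholders rather than the real ones.

```r
# Placeholder variable codes and values (assumptions): take the real ones
# from the table metadata.
query <- list(
  query = list(
    list(
      code = "DISPOSAL_VARIABLE",
      selection = list(filter = "item", values = list("Category 1", "Category 2"))
    ),
    list(
      code = "AREA_VARIABLE",
      selection = list(filter = "all", values = list("*"))
    )
  ),
  response = list(format = "csv")
)

# Preview the JSON body that would be sent
cat(jsonlite::toJSON(query, auto_unbox = TRUE, pretty = TRUE), "\n")
```

The values are written with list() rather than c() so that a single value such as "*" still serializes as a JSON array when auto_unbox is on, which is how httr2's req_body_json() encodes bodies by default.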
Step 5: Fetch Data
- Attaches the query to the saved request
- Sends POST request to the API
- Retrieves raw response data
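In httr2 terms, this step might look like the sketch below. The function name attach_query() is hypothetical; req_body_json() and req_perform() are standard httr2 calls, and httr2 switches the method to POST on its own once a body is attached.

```r
library(httr2)

# Hypothetical wrapper: attach the query list as a JSON body.
# With a body present, httr2 sends the request as POST.
attach_query <- function(table_req, query) {
  req_body_json(table_req, query)
}

# Usage (needs network access and a real table URL):
# resp <- attach_query(table_req, query) |> req_perform()
```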
Step 6: Parse and Transform Data
- Extracts raw CSV bytes from response
- Parses CSV into a data frame
- Transforms from wide to tidy format
- Renames columns for easier analysis
- resp_body_raw(): Gets raw byte content
- read_csv(): Parses CSV into a tibble
- pivot_longer(): Converts geographic areas from columns to rows
- pivot_wider(): Converts disposal types from rows to columns
- rename(): Creates more convenient column names
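The reshaping can be tried offline on a small made-up CSV. The sketch below assumes the response has one row per disposal type and one column per geographic area; the sample bytes, column names, and the function name tidy_csv() are illustrative assumptions, not real StatsBank output.

```r
library(readr)
library(tidyr)
library(dplyr)

# Turn raw CSV bytes (e.g. from resp_body_raw(resp)) into a tidy tibble
tidy_csv <- function(raw_csv) {
  read_csv(raw_csv, show_col_types = FALSE) |>
    rename(disposal_type = 1) |>   # first column holds the categories
    pivot_longer(-disposal_type, names_to = "area", values_to = "value") |>
    pivot_wider(names_from = disposal_type, values_from = value)
}

# Made-up sample bytes standing in for a real response body
sample_raw <- charToRaw(
  '"Disposal type","Area 1","Area 2"\n"Type A",10,20\n"Type B",30,40\n'
)

tidy <- tidy_csv(sample_raw) |>
  rename(type_a = `Type A`, type_b = `Type B`)  # shorter names for analysis

print(tidy)
```

The result has one row per area and one column per disposal type, which is the shape the final rename() step expects.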
Step 7: View Results
- Displays the first 10 rows of the cleaned dataset
- Confirms successful data retrieval and transformation
Key Concepts Demonstrated
1. API Navigation
The example shows how to:
- Start with a base URL
- Navigate through database levels
- Reach a specific table
- Save the connection for reuse
2. Query Construction
Demonstrates:
- Variable selection with specific values
- Using “item” filter for specific choices
- Using “all” filter for comprehensive data
- Specifying response format
3. Data Retrieval
Shows:
- Attaching queries to requests
- Performing POST requests
- Handling API responses
4. Data Transformation
Illustrates:
- Parsing CSV responses
- Converting wide to tidy format
- Creating analysis-ready data frames
- Renaming columns for clarity
Modifying the Example
Different Table
To query a different table, change the navigation path.
Different Variables
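One way to swap the selected values is a small helper that rewrites the selection for a given variable code. This is a sketch only: set_values() is a hypothetical function, and the query list and variable code are placeholders in an assumed PxWeb-style layout.

```r
# Hypothetical helper: replace the selected values for one variable code
set_values <- function(query, code, values, filter = "item") {
  for (i in seq_along(query$query)) {
    if (identical(query$query[[i]]$code, code)) {
      query$query[[i]]$selection <- list(filter = filter, values = as.list(values))
    }
  }
  query
}

# Placeholder query (assumption) with a single variable
query <- list(
  query = list(
    list(code = "VARIABLE_CODE",
         selection = list(filter = "item", values = list("Old value")))
  ),
  response = list(format = "csv")
)

query <- set_values(query, "VARIABLE_CODE", c("New value 1", "New value 2"))
```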
Modify the query to select different values.
Different Format
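Switching formats should only require changing the response entry of the query, assuming the API accepts "json" as a format value. The query list here is a bare placeholder, not the page's real query.

```r
# Placeholder query (assumption); only the response format matters here
query <- list(query = list(), response = list(format = "csv"))

# Ask for JSON instead of CSV
query$response$format <- "json"

# After performing the request, parse with resp_body_json() rather than read_csv():
# data <- httr2::resp_body_json(resp, simplifyVector = TRUE)
```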
To request JSON instead of CSV, change the response format in the query.
Exploratory Navigation
Before running the full script, you may want to explore available options.
Discovering Variable Values
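A hedged sketch of looking up a variable's values: it assumes that a GET request to the table URL returns metadata with a variables array whose entries carry code, values, and valueTexts (the PxWeb convention). The function name variable_values() and the sample metadata are made up for illustration.

```r
library(purrr)
library(tibble)

# Hypothetical helper: list codes and labels for one variable
variable_values <- function(metadata, code) {
  var <- detect(metadata$variables, \(v) identical(v$code, code))
  if (is.null(var)) stop("Variable not found: ", code)
  tibble(value = unlist(var$values), label = unlist(var$valueTexts))
}

# Made-up metadata in the assumed shape; with a live connection it would come from
# metadata <- httr2::resp_body_json(httr2::req_perform(table_req))
metadata <- list(
  title = "Example table",
  variables = list(
    list(code = "AREA", text = "Area",
         values = list("01", "02"),
         valueTexts = list("Area 1", "Area 2"))
  )
)

print(variable_values(metadata, "AREA"))
```

The value column holds the codes to put in the query, while label holds the human-readable names.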
To see which values are available for a variable, check the table's metadata.
Prerequisites
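One possible way to install only what is missing is sketched below. install_if_missing() is a hypothetical convenience function; a plain install.packages(c("httr2", "tidyverse")) works just as well.

```r
# Hypothetical helper: install only the packages that are not yet available
install_if_missing <- function(pkgs) {
  missing <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
  if (length(missing) > 0) install.packages(missing)
  invisible(missing)
}

# Usage:
# install_if_missing(c("httr2", "tidyverse"))
```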
Ensure you have the required packages installed.
Troubleshooting
Connection Errors
If you receive connection errors:
- Check your internet connection
- Verify the API URL is correct and accessible
- Ensure no firewall is blocking HTTPS connections
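To make such failures easier to diagnose, the request can be wrapped so that connection problems and HTTP errors are reported separately. This is a sketch: safe_perform() is a hypothetical name, and the timeout and retry settings are arbitrary choices rather than recommended values.

```r
library(httr2)

# Hypothetical wrapper: returns the response, or NULL with a message on failure
safe_perform <- function(req) {
  tryCatch(
    req |>
      req_timeout(30) |>            # give up after 30 seconds
      req_retry(max_tries = 3) |>   # retry transient HTTP errors
      req_perform(),
    httr2_failure = function(e) {
      # No HTTP response at all (DNS, firewall, refused connection, ...)
      message("Connection problem: ", conditionMessage(e))
      NULL
    },
    httr2_http = function(e) {
      # The server answered with an error status (4xx or 5xx)
      message("HTTP error: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage:
# resp <- safe_perform(table_req)
# if (is.null(resp)) stop("Request failed; see messages above")
```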
Variable Not Found
If a variable code is not recognized:
- Use build_url() to verify available variables
- Check spelling and capitalization (case-sensitive)
- Ensure you’re querying the correct table
Empty Results
If the query returns no data:
- Verify variable values exist in the metadata
- Check filter type matches your intention
- Ensure all required variables are included in the query
Parse Errors
If data parsing fails:
- Verify response format matches parsing method
- Check that CSV format was requested if using read_csv()
- Inspect raw response with resp_body_string()
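A short sketch of that inspection step: print the status, the declared content type, and the first few hundred characters of the body before trying to parse it. The function name inspect_response() is hypothetical, and the fake response is built with httr2's response() constructor purely so the example runs offline.

```r
library(httr2)

# Hypothetical helper: show what the server actually sent back
inspect_response <- function(resp, n_chars = 300) {
  message("Status: ", resp_status(resp))
  message("Content type: ", resp_content_type(resp))
  preview <- substr(resp_body_string(resp), 1, n_chars)
  message(preview)
  invisible(preview)
}

# Offline stand-in for a real response (made-up body)
fake_resp <- response(
  status_code = 200,
  headers = "Content-Type: text/csv",
  body = charToRaw("a,b\n1,2\n")
)

inspect_response(fake_resp)
```

If the content type is JSON or HTML while the script calls read_csv(), the mismatch is the likely cause of the parse error.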
Next Steps
After successfully retrieving data:
- Explore other tables in different databases
- Combine multiple queries to build comprehensive datasets
- Automate data updates by scheduling R scripts
- Create visualizations using ggplot2 or other tools
- Build dashboards with Shiny or RMarkdown