Navigating the GSS StatsBank API requires building URLs incrementally by inspecting the API at each level. This guide shows you how to create a helper function to streamline this process and efficiently find the data you need.
```r
library(httr2)
library(purrr)

build_url = function(URL, ...) {
  path = list(...)
  req = request(URL)

  # Add each folder name to the URL path incrementally
  full_req = reduce(path, req_url_path_append, .init = req)

  # Check if the last part of the path ends in ".px" (indicating a table)
  is_table = FALSE
  if (length(path) > 0) {
    if (grepl("\\.px$", path[[length(path)]], ignore.case = TRUE)) {
      is_table = TRUE
    }
  }

  # Fetch the content to see what is inside (GET request)
  response = req_perform(full_req)
  body = resp_body_json(response)

  if (is_table) {
    # If it is a table, print the variable names (metadata)
    message("Endpoint reached: Table found.")
    message("Available variables:")
    print(map_chr(body$variables, "code"))
  } else {
    # If it is a folder/database, list the children IDs
    # (Note: the root level (databases) uses 'dbid'; sub-levels use 'id')
    key = if (length(path) == 0) "dbid" else "id"
    print(map_chr(body, key))
  }

  # Return the request object so it can be assigned to a variable
  return(full_req)
}
```
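With the helper defined, you can explore the API one level at a time. A minimal sketch of a navigation session follows; the base URL below is a placeholder (an assumption for illustration), so substitute the actual StatsBank API root:

```r
# Placeholder base URL -- replace with the actual StatsBank API root
URL = "https://statsbank.example.gov.gh/api/v1/en"

# Level 0: list the available databases (the root level uses 'dbid')
build_url(URL)

# Level 1: list the folders inside one database
db_req = build_url(URL, "PHC 2021 StatsBank")
```

Each call prints what sits at that level, so you can copy an ID from the output and append it as the next argument.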
The function automatically detects whether you’ve reached a table (files ending in .px) or are still in a folder level.
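That detection is nothing more than a suffix test on the last path element, as a quick check shows:

```r
# The table check is a case-insensitive suffix test on the last path element
grepl("\\.px$", "waterDisposal_table.px", ignore.case = TRUE)  # TRUE  -> table
grepl("\\.px$", "Water and Sanitation",  ignore.case = TRUE)   # FALSE -> folder
```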
Navigate to a specific table and save the connection:
```r
# This lists the variables inside the table and saves the connection
table_req = build_url(URL, "PHC 2021 StatsBank", "Water and Sanitation", "waterDisposal_table.px")
```
Output:
```
Endpoint reached: Table found.
Available variables:
[1] "WaterDisposal" "Locality" "Geographic_Area"
```
Notice that we assigned the result to table_req. Because build_url() returns the request object, this saves the connection so you can reuse it later without retyping the entire URL path.
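Since table_req stores the fully built request, you can re-fetch the table's metadata at any time without rebuilding the path. A small sketch:

```r
# Re-perform the saved request to pull the table metadata again
response = req_perform(table_req)
metadata = resp_body_json(response)

# Inspect the variable codes once more
purrr::map_chr(metadata$variables, "code")
```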
Now that you can navigate to specific tables, the next step is building queries against them. See the Constructing Queries guide to learn how to write JSON queries that retrieve the exact data you need.