The Document Download Frontend is a Flask-based web application that provides a secure user interface for downloading documents uploaded via the Document Download API. It implements a multi-step verification flow backed by a strict Content Security Policy, CSRF protection, and hardened HTTP response headers.
Application Structure
Flask Application Initialization
The application follows a modular Flask architecture with clear separation of concerns:
# application.py:20-29
application = Flask("app")
create_app(application)
application.wsgi_app = WhiteNoise(application.wsgi_app, STATIC_ROOT, STATIC_URL)
if using_eventlet:
    application.wsgi_app = EventletTimeoutMiddleware(
        application.wsgi_app,
        timeout_seconds=int(os.getenv("HTTP_SERVE_TIMEOUT_SECONDS", 30)),
    )
The create_app() function (app/__init__.py:55-82) orchestrates initialization:
Configuration loading - Environment-based config (Development, Test, Production)
URL converters - Custom Base64UUIDConverter for service/document IDs
Middleware stack - Metrics, logging, request helpers
Blueprint registration - Main blueprint with route handlers
Error handlers - Unified error response handling
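The URL converter step can be illustrated with a minimal sketch of the underlying encoding, assuming the IDs are URL-safe base64 without padding. The helper names uuid_to_base64 and base64_to_uuid are hypothetical and are not taken from the application; a real Base64UUIDConverter would wrap logic like this in a Werkzeug converter class.

```python
import base64
import uuid


def uuid_to_base64(value: uuid.UUID) -> str:
    """Encode a UUID as URL-safe base64 without '=' padding (22 characters)."""
    return base64.urlsafe_b64encode(value.bytes).decode("ascii").rstrip("=")


def base64_to_uuid(value: str) -> uuid.UUID:
    """Decode a URL-safe base64 string back into a UUID.

    Raises ValueError if the decoded payload is not exactly 16 bytes.
    """
    # Restore the padding that was stripped when the URL was built.
    padded = value + "=" * (-len(value) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))
```

Rejecting anything that does not decode to 16 bytes is what lets the route return 404 for malformed IDs before any view code runs.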
Core Components
WSGI Middleware
WhiteNoise: Static file serving with efficient caching
EventletTimeoutMiddleware: Request timeout protection (30s default)
GDSMetrics: Request/response metrics collection
Security Layer
CSP Headers: Strict Content Security Policy with nonce-based inline scripts
CSRF Protection: Flask-WTF CSRF tokens on forms
HTTP Security Headers: HSTS, X-Frame-Options, etc.
API Clients
ServiceApiClient: Fetches service metadata from Notify API
Document Download API: Metadata checks and authentication (via requests library)
Template System
Jinja2: Template rendering with GOV.UK Frontend components
govuk-frontend-jinja: GOV.UK Design System integration
Request Flow Architecture
The document download process follows a secure multi-step flow:
1. Landing Page
# app/main/views/index.py:52-99
@main.route("/d/<base64_uuid:service_id>/<base64_uuid:document_id>", methods=["GET"])
def landing(service_id, document_id):
    key = request.args.get("key", None)
    if not key:
        abort(404)
    service = _get_service_or_raise_error(service_id)
    metadata = _get_document_metadata(service_id, document_id, key)
    if metadata.get("confirm_email", False) is True:
        continue_url = url_for("main.confirm_email_address", ...)
    else:
        continue_url = url_for("main.download_document", ...)
Purpose: Initial entry point that validates the document exists and determines if email verification is required.
Key validations:
Service ID and document ID must be valid UUIDs (base64-encoded)
Decryption key must be present in query string
Document metadata must be retrievable from API
Document must not be expired or deleted (410 Gone)
2. Email Verification (Conditional)
# app/main/views/index.py:102-182
@main.route("/d/<base64_uuid:service_id>/<base64_uuid:document_id>/confirm-email-address")
def confirm_email_address(service_id, document_id):
    form = EmailAddressForm()
    if form.validate_on_submit():
        authentication_data = _authenticate_access_to_document(
            service_id, document_id, key, form.email_address.data
        )
        if authentication_data:
            response = redirect(url_for(".download_document", ...))
            response.set_cookie(
                key="document_access_signed_data",
                value=authentication_data["signed_data"],
                domain=cookie_domain,
                httponly=True,
                secure=True,
            )
            return response
Purpose: Verify the user’s email address matches the intended recipient.
Flow:
User enters email address
Frontend POSTs to Document Download API /authenticate endpoint
API validates email matches recipient
API returns signed authentication data
Frontend sets httponly cookie with signed data
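Cookie is scoped to the download URL path (the cookie_path derived from the API's direct_file_url) for minimal exposure
The authentication cookie is set by the frontend but read by the API. It works across subdomains because the domain attribute is set to the parent domain shared by the frontend and API hosts. That parent must be a registrable domain: browsers reject cookies whose domain is a public suffix such as .gov.uk itself.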
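The cookie attributes described above can be sketched with the standard library alone. This is an illustration of the resulting Set-Cookie header, not the application's code (which uses Flask's response.set_cookie); the function name and the example domain and path are hypothetical.

```python
from http.cookies import SimpleCookie


def build_access_cookie_header(signed_data: str, cookie_domain: str, cookie_path: str) -> str:
    """Return the value of a Set-Cookie header for the signed access data."""
    cookie = SimpleCookie()
    cookie["document_access_signed_data"] = signed_data
    morsel = cookie["document_access_signed_data"]
    morsel["domain"] = cookie_domain  # shared parent domain of frontend and API
    morsel["path"] = cookie_path      # only sent on requests for this document
    morsel["httponly"] = True         # not readable from JavaScript
    morsel["secure"] = True           # only sent over HTTPS
    return morsel.OutputString()


header = build_access_cookie_header(
    "abc123", ".example.test", "/services/s1/documents/d1"
)
```

Scoping by both domain and path means the browser attaches the cookie only to requests for that one document on the API host.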
3. Download Page
# app/main/views/index.py:185-218
@main.route("/d/<base64_uuid:service_id>/<base64_uuid:document_id>/download")
def download_document(service_id, document_id):
    metadata = _get_document_metadata(service_id, document_id, key)
    return render_template(
        "views/download.html",
        download_link=metadata["direct_file_url"],
        file_size=format_file_size(metadata["size_in_bytes"]),
        file_type=format_file_type(metadata["file_extension"]),
        file_expiry_date=_format_file_expiry_date(metadata["available_until"]),
    )
Purpose: Display download button with file metadata.
Key features:
Direct download URL from Document Download API
Human-readable file size and type
Expiry date display (with day of week if within 30 days)
The actual download happens on the Document Download API (not this frontend)
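The formatting behaviour listed above can be sketched as follows. These helpers are hypothetical stand-ins and do not reproduce the application's format_file_size or _format_file_expiry_date; the exact output wording and the inclusive 30-day threshold are assumptions.

```python
from datetime import date


def human_file_size(size_in_bytes: int) -> str:
    """Render a byte count using the largest unit that keeps the number under 1024."""
    units = ["bytes", "KB", "MB", "GB"]
    size = float(size_in_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            if unit == "bytes":
                return f"{int(size)} bytes"
            return f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}{units[-1]}"  # not reached; keeps type checkers satisfied


def format_expiry(available_until: date, today: date) -> str:
    """Prefix the day of the week only when the expiry is at most 30 days away."""
    text = f"{available_until.day} {available_until.strftime('%B %Y')}"
    if (available_until - today).days <= 30:
        text = f"{available_until.strftime('%A')} {text}"
    return text
```

For example, an expiry 14 days out is rendered with its weekday, while one several months away shows only the date.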
Security Architecture
Content Security Policy
The application implements a strict CSP with nonce-based script execution:
# app/__init__.py:106-134
def make_nonce_before_request():
    if not getattr(request, "csp_nonce", None):
        request.csp_nonce = secrets.token_urlsafe(16)

def useful_headers_after_request(response):
    response.headers.add(
        "Content-Security-Policy",
        (
            "default-src 'self';"
            "script-src 'self' 'nonce-{csp_nonce}';"
            "connect-src 'self';"
            "object-src 'self';"
            "font-src 'self' data:;"
            "img-src 'self' data:;"
            "style-src 'self' 'nonce-{csp_nonce}';"
            "frame-ancestors 'self';"
            "frame-src 'self';".format(csp_nonce=request.csp_nonce)
        ),
    )
Inline scripts and styles are blocked unless they carry the per-request CSP nonce from request.csp_nonce, so any inline or dynamically generated script or style tag must include a matching nonce attribute.
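A minimal, framework-free sketch of the nonce mechanism follows. It assumes only what the excerpt shows (a 16-byte URL-safe token reused in script-src and style-src); the function names are hypothetical and the directive list is abbreviated.

```python
import secrets


def make_csp_nonce() -> str:
    """Generate a fresh, unguessable nonce; call once per request."""
    return secrets.token_urlsafe(16)


def build_csp_header(nonce: str) -> str:
    """Build an abbreviated policy that only permits inline code carrying the nonce."""
    directives = [
        "default-src 'self'",
        f"script-src 'self' 'nonce-{nonce}'",
        f"style-src 'self' 'nonce-{nonce}'",
        "frame-ancestors 'self'",
    ]
    return "; ".join(directives) + ";"


def inline_script(nonce: str, body: str) -> str:
    """Render an inline script tag that the policy above will allow."""
    return f'<script nonce="{nonce}">{body}</script>'
```

Because the nonce changes on every request, markup injected by an attacker cannot carry a valid value and is refused by the browser.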
The application sets comprehensive security headers on every response (app/__init__.py:112-148):
| Header | Value | Purpose |
| X-Robots-Tag | noindex, nofollow | Prevent search engine indexing |
| X-Frame-Options | DENY | Prevent clickjacking |
| X-Content-Type-Options | nosniff | Prevent MIME sniffing |
| Referrer-Policy | no-referrer | Don’t leak URLs in referrer |
| Cache-Control | no-store, no-cache, private | Prevent document URL caching |
| Strict-Transport-Security | max-age=31536000; includeSubDomains | Force HTTPS |
| Cross-Origin-Embedder-Policy | require-corp | Isolate resources |
| Cross-Origin-Opener-Policy | same-origin | Process isolation |
| Permissions-Policy | Restrictive | Disable browser features |
CSRF Protection
All forms use Flask-WTF CSRF tokens:
# app/forms.py:47-53
class EmailAddressForm(Form):
    email_address = EmailAddressField(
        "Email address",
        validators=[DataRequired("Enter your email address"), ValidEmail()],
        filters=[strip_all_whitespace],
    )
CSRF errors are caught and handled gracefully (app/__init__.py:175-179).
API Integration
Service API Client
Thread-safe client for fetching service metadata:
# app/notify_client/service_api_client.py:19-34
class ServiceApiClient:
    def __init__(self, app):
        self.api_client = OnwardsRequestNotificationsAPIClient(
            "x" * 100,
            base_url=app.config["API_HOST_NAME"],
        )
        self.api_client.service_id = app.config["ADMIN_CLIENT_USER_NAME"]
        self.api_client.api_key = app.config["ADMIN_CLIENT_SECRET"]

    def get_service(self, service_id):
        return self.api_client.get(f"/service/{service_id}")
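The client uses context variables (ContextVar) for context-local storage. Unlike plain thread-local storage, a ContextVar keeps a separate value per execution context, which keeps state isolated between concurrent requests.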
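The isolation that a ContextVar gives can be demonstrated with a small standalone sketch. The variable and function names here are hypothetical and unrelated to how ServiceApiClient stores its state; the point is only that a value set while handling one request is not visible to another.

```python
import contextvars
import threading

# Hypothetical per-request value; the default is what code sees outside a request.
current_service_id = contextvars.ContextVar("current_service_id", default=None)


def handle_request(service_id: str, seen: dict) -> None:
    """Set the variable for this worker and record what this worker reads back."""
    current_service_id.set(service_id)
    seen[service_id] = current_service_id.get()


def run_concurrently(service_ids):
    """Handle each ID on its own thread and collect what each thread observed."""
    seen = {}
    threads = [
        threading.Thread(target=handle_request, args=(sid, seen))
        for sid in service_ids
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen
```

Each thread reads back only the value it set, and the calling context still sees the default afterwards.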
Document Download API Integration
The frontend communicates with two Document Download API endpoints:
1. Metadata Check (/services/{service_id}/documents/{document_id}/check)
# app/main/views/index.py:241-273
def _get_document_metadata(service_id, document_id, key):
    check_file_url = "{}/services/{}/documents/{}/check?key={}".format(
        current_app.config["DOCUMENT_DOWNLOAD_API_HOST_NAME_INTERNAL"],
        service_id, document_id, key
    )
    response = requests.get(check_file_url, headers=headers)
    match response.status_code:
        case 400:
            if "decryption key" in error_msg or "Forbidden" in error_msg:
                abort(404)
        case 404 | 403:
            abort(404)
        case 410:
            abort(410)
Returns metadata including:
direct_file_url: Pre-signed download URL
size_in_bytes: File size
file_extension: File type
available_until: Expiry timestamp
confirm_email: Whether email verification is required
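The status handling in the excerpt can be restated as a pure function, which makes the mapping easy to test in isolation. This is a sketch: the function name is hypothetical, and returning None for unmapped statuses is an assumption, since the excerpt does not show what happens in those cases.

```python
from typing import Optional


def frontend_status_for_check(status_code: int, error_msg: str = "") -> Optional[int]:
    """Map a /check response status to the status the frontend aborts with.

    Returns None when the excerpt shows no abort for the given response.
    """
    if status_code == 400 and ("decryption key" in error_msg or "Forbidden" in error_msg):
        return 404  # a bad key is reported as "not found"
    if status_code in (403, 404):
        return 404  # forbidden is also reported as "not found"
    if status_code == 410:
        return 410  # expired or deleted documents stay "gone"
    return None
```

Collapsing 400 and 403 into 404 means a caller cannot tell a wrong key from a missing document.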
2. Email Authentication (/services/{service_id}/documents/{document_id}/authenticate)
# app/main/views/index.py:276-308
def _authenticate_access_to_document(service_id, document_id, key, email_address):
    response = requests.post(
        auth_file_url,
        json={"key": key, "email_address": email_address},
        headers=headers,
    )
    if response.status_code == 429:
        raise TooManyRequests
    elif response.status_code in {400, 403}:
        return None  # Invalid email
    data = response.json()
    cookie_path = parse.urlsplit(data["direct_file_url"]).path
    return {
        "signed_data": data["signed_data"],
        "cookie_path": cookie_path,
    }
Returns:
signed_data: Cryptographically signed authentication token
cookie_path: Path component of the API's direct_file_url, used to scope the cookie
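The response handling above can be sketched without any HTTP call, which shows how the cookie path is derived. The exception class and function name are hypothetical stand-ins (the application raises Werkzeug's TooManyRequests), and the example URL is invented.

```python
from typing import Optional
from urllib.parse import urlsplit


class TooManyAttempts(Exception):
    """Hypothetical stand-in for the rate-limit error raised on HTTP 429."""


def interpret_auth_response(status_code: int, payload: dict) -> Optional[dict]:
    """Turn an /authenticate response into the data needed to set the cookie."""
    if status_code == 429:
        raise TooManyAttempts()
    if status_code in {400, 403}:
        return None  # the email address did not match
    return {
        "signed_data": payload["signed_data"],
        # Keep only the path, dropping scheme, host and query string.
        "cookie_path": urlsplit(payload["direct_file_url"]).path,
    }
```

Using only the path of direct_file_url keeps the decryption key, which travels in the query string, out of the cookie scope.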
Middleware Stack
WhiteNoise (Static Files)
# application.py:23
application.wsgi_app = WhiteNoise(application.wsgi_app, STATIC_ROOT, STATIC_URL)
Serves compiled frontend assets (CSS, JS, images) with:
Efficient caching headers
Compression (gzip/brotli)
Fingerprinted URLs via asset_fingerprinter
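The idea behind fingerprinted URLs can be shown with a short sketch: derive a token from the file contents and append it to the URL, so the URL changes whenever the asset changes. This is an assumption-laden illustration, not the asset_fingerprinter implementation; the hash choice, token length, and query-string placement are all hypothetical.

```python
import hashlib


def fingerprinted_url(asset_url: str, content: bytes) -> str:
    """Append a short content hash so browsers refetch only when the file changes."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{asset_url}?{digest}"
```

Because the token depends only on the bytes served, unchanged assets keep their URL and can be cached for a long time.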
EventletTimeoutMiddleware
# application.py:26-29
if using_eventlet:
    application.wsgi_app = EventletTimeoutMiddleware(
        application.wsgi_app,
        timeout_seconds=int(os.getenv("HTTP_SERVE_TIMEOUT_SECONDS", 30)),
    )
Prevents long-running requests from blocking workers:
Configurable timeout (default 30 seconds)
Raises EventletTimeout exception on timeout
Custom error handler responds with a 504 Gateway Timeout status (app/__init__.py:181-184)
Although the status code is 504, users are shown the same generic error page as for 500 errors, so no detail about the timeout is exposed.
GDSMetrics
# app/__init__.py:66
metrics.init_app(application)
Collects request/response metrics:
Request duration
Response status codes
Endpoint hit counts
Sends to StatsD for monitoring
Error Handling
Centralized error handling with user-friendly pages (app/__init__.py:151-184):
@application.errorhandler(410)
@application.errorhandler(404)
@application.errorhandler(403)
@application.errorhandler(401)
@application.errorhandler(400)
def handle_http_error(error):
    return _error_response(error.code)

@application.errorhandler(500)
@application.errorhandler(Exception)
def handle_bad_request(error):
    current_app.logger.exception(error)
    if current_app.config.get("DEBUG", None):
        raise error
    return _error_response(500)
Special error handling for document-specific errors:
# app/main/views/index.py:66-74
try:
    metadata = _get_document_metadata(service_id, document_id, key)
except (Gone, NotFound) as e:
    return render_template(
        "views/file-unavailable.html",
        status_code=e.code,
        service_name=service_name,
        service_contact_info=service_contact_info,
    ), e.code
Document not found (404) and expired (410) errors show a custom template with service contact information to help users resolve issues.
Configuration Management
Environment-based configuration (app/config.py):
class Config:
    # API endpoints
    API_HOST_NAME = os.environ.get("API_HOST_NAME")
    DOCUMENT_DOWNLOAD_API_HOST_NAME = os.environ.get("DOCUMENT_DOWNLOAD_API_HOST_NAME")
    DOCUMENT_DOWNLOAD_API_HOST_NAME_INTERNAL = os.environ.get("DOCUMENT_DOWNLOAD_API_HOST_NAME_INTERNAL")
    # Security
    SECRET_KEY = os.environ.get("SECRET_KEY")
    ADMIN_CLIENT_SECRET = os.environ.get("ADMIN_CLIENT_SECRET")
    # Environment
    NOTIFY_ENVIRONMENT = os.environ["NOTIFY_ENVIRONMENT"]
    HTTP_PROTOCOL = os.environ.get("HTTP_PROTOCOL", "http")

class Development(Config):
    DEBUG = True
    SERVER_NAME = os.getenv("SERVER_NAME")
    DOCUMENT_DOWNLOAD_API_HOST_NAME = "http://localhost:7000"

class Test(Development):
    TESTING = True
    WTF_CSRF_ENABLED = False
The application uses two Document Download API URLs: DOCUMENT_DOWNLOAD_API_HOST_NAME for redirects and DOCUMENT_DOWNLOAD_API_HOST_NAME_INTERNAL for backend API calls. This allows separate internal/external networking.
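How an environment name might select one of these classes can be sketched as below. The classes are trimmed to a few flags, and the CONFIGS mapping, the Production class body, and the config_for helper are hypothetical; the excerpt does not show how the application performs this lookup.

```python
class Config:
    DEBUG = False
    TESTING = False
    WTF_CSRF_ENABLED = True  # Flask-WTF's default, stated explicitly here


class Development(Config):
    DEBUG = True


class Test(Development):
    TESTING = True
    WTF_CSRF_ENABLED = False  # forms can be posted in tests without a token


class Production(Config):
    pass


# Hypothetical lookup table keyed by the NOTIFY_ENVIRONMENT value.
CONFIGS = {"development": Development, "test": Test, "production": Production}


def config_for(environment: str):
    """Return the config class for an environment name (case-insensitive)."""
    return CONFIGS[environment.lower()]
```

Since Test inherits from Development, it picks up DEBUG = True and overrides only the flags that differ.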
Sentry integration for error tracking and performance monitoring (app/performance.py:12-44):
def init_performance_monitoring():
    environment = os.getenv("NOTIFY_ENVIRONMENT").lower()
    sentry_enabled = bool(int(os.getenv("SENTRY_ENABLED", "0")))
    if environment and sentry_enabled and sentry_dsn:
        sentry_sdk.init(
            dsn=sentry_dsn,
            environment=environment,
            sample_rate=error_sample_rate,
            traces_sampler=traces_sampler,
        )
Features:
Configurable error and trace sampling rates
PII control via environment variables
Git commit-based release tracking
Custom trace sampler that respects parent spans
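A trace sampler that respects parent spans can be sketched as a plain function over the sampling context dictionary that Sentry passes to traces_sampler. The factory name and the default rate are hypothetical; only the "parent_sampled" key is taken from Sentry's documented sampling context, and this is not the application's own sampler.

```python
def make_traces_sampler(default_rate: float):
    """Build a sampler that follows the parent's decision when one exists."""

    def traces_sampler(sampling_context: dict) -> float:
        parent_sampled = sampling_context.get("parent_sampled")
        if parent_sampled is not None:
            # Keep distributed traces complete: sample if and only if the parent did.
            return 1.0 if parent_sampled else 0.0
        # No upstream decision, so fall back to the configured rate.
        return default_rate

    return traces_sampler
```

Following the parent avoids traces where the upstream service recorded a span but this service dropped its half.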