KaggleIngest uses standard HTTP status codes and structured error responses. This guide covers common errors and recommended handling strategies.
All errors follow FastAPI’s default format:
{
"detail" : "Human-readable error message"
}
Some errors may include additional context:
{
"detail" : "Signup failed: Email already registered"
}
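Because `detail` is a plain string for most errors but a list of entries for 422 validation errors, a client may want a single helper that turns either shape into one readable message. The sketch below is one way to do that; `extract_error_message` is a hypothetical name, not part of the KaggleIngest API.

```python
from typing import Any


def extract_error_message(body: dict[str, Any]) -> str:
    """Return a readable message from an error body (hypothetical helper)."""
    detail = body.get("detail")
    if isinstance(detail, str):
        # Most errors: {"detail": "Human-readable error message"}
        return detail
    if isinstance(detail, list):
        # Validation errors: a list of entries with "loc" and "msg" keys
        parts = []
        for item in detail:
            loc = ".".join(str(p) for p in item.get("loc", []))
            msg = item.get("msg", "unknown error")
            parts.append(f"{loc}: {msg}" if loc else msg)
        return "; ".join(parts)
    return "Unknown error"
```

Typical use would be `extract_error_message(resp.json())` after a non-200 response.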
Authentication errors
Missing API key
Status Code: 401 Unauthorized
Cause: Request missing the X-API-Key header
Response:
{
"detail" : "Missing X-API-Key header. Get your key at https://kaggleingest.com/dashboard"
}
Solution:
# ❌ Missing header
curl https://api.kaggleingest.com/competitions/titanic
# ✅ Include X-API-Key
curl https://api.kaggleingest.com/competitions/titanic \
-H "X-API-Key: ki_abc123xyz..."
Invalid API key
Status Code: 403 Forbidden
Cause: API key doesn’t exist in the database or has been revoked
Response:
{
"detail" : "Invalid API key"
}
Solutions:
Verify the API key is correct (check for typos)
Generate a new API key if the old one was lost
Ensure you’re using the production key for production API
import requests

resp = requests.get(
    "https://api.kaggleingest.com/competitions/titanic",
    headers={"X-API-Key": "ki_abc123xyz..."},
)
if resp.status_code == 401:
    print("Missing API key header")
elif resp.status_code == 403:
    print("Invalid API key - check your credentials")
elif resp.status_code == 200:
    data = resp.json()
    print(f"Success: {data['title']}")
Signup errors
Weak password
Status Code: 400 Bad Request
Cause: Password is shorter than 8 characters
Response:
{
"detail" : "Password must be at least 8 characters"
}
Solution:
# ❌ Weak password
curl -X POST https://api.kaggleingest.com/auth/signup \
-H "Content-Type: application/json" \
-d '{"email": "[email protected] ", "password": "weak"}'
# ✅ Strong password
curl -X POST https://api.kaggleingest.com/auth/signup \
-H "Content-Type: application/json" \
-d '{"email": "[email protected] ", "password": "SecurePass123!"}'
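A client can catch this case before calling the signup endpoint. The sketch below checks only the one rule documented here (at least 8 characters). The server may enforce other rules that this guide does not list, and `is_password_long_enough` is a hypothetical helper name.

```python
MIN_PASSWORD_LENGTH = 8  # taken from the documented error message


def is_password_long_enough(password: str) -> bool:
    """Client-side pre-check mirroring the documented length rule only."""
    return len(password) >= MIN_PASSWORD_LENGTH
```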
Email already registered
Status Code: 409 Conflict
Cause: Email already exists in the database
Response:
{
"detail" : "Email already registered"
}
Solutions:
Use a different email address
Reset your password if you forgot your API key
Contact support to recover your account
Invalid email format
Status Code: 422 Unprocessable Entity
Cause: Email doesn’t match RFC 5322 format
Response:
{
"detail" : [
{
"type" : "value_error" ,
"loc" : [ "body" , "email" ],
"msg" : "value is not a valid email address" ,
"input" : "not-an-email"
}
]
}
Solution:
# ❌ Invalid email
curl -X POST https://api.kaggleingest.com/auth/signup \
-H "Content-Type: application/json" \
-d '{"email": "not-an-email", "password": "SecurePass123!"}'
# ✅ Valid email
curl -X POST https://api.kaggleingest.com/auth/signup \
-H "Content-Type: application/json" \
-d '{"email": "[email protected] ", "password": "SecurePass123!"}'
Competition errors
Invalid competition slug
Status Code: 400 Bad Request
Cause: Slug is empty or contains only whitespace
Response:
{
"detail" : "Invalid competition slug"
}
Solution:
# ❌ Empty slug
curl https://api.kaggleingest.com/competitions/ \
-H "X-API-Key: ki_abc123xyz..."
# ❌ Whitespace-only slug
curl https://api.kaggleingest.com/competitions/%20%20 \
-H "X-API-Key: ki_abc123xyz..."
# ✅ Valid slug
curl https://api.kaggleingest.com/competitions/titanic \
-H "X-API-Key: ki_abc123xyz..."
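An empty or whitespace-only slug can be rejected on the client before any request is sent. The sketch below assumes that trimming surrounding whitespace and percent-encoding the slug is acceptable for your use case. The server's own normalization rules are not documented here, and `competition_url` is a hypothetical helper.

```python
from urllib.parse import quote

BASE_URL = "https://api.kaggleingest.com/competitions/"


def competition_url(slug: str) -> str:
    """Build the competition URL, rejecting empty or whitespace-only slugs."""
    cleaned = slug.strip()  # assumption: surrounding whitespace is never meaningful
    if not cleaned:
        raise ValueError("Competition slug must not be empty or whitespace")
    # Percent-encode so unusual characters cannot alter the URL path
    return BASE_URL + quote(cleaned, safe="")
```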
Competition not found
What happens: The API doesn’t return a 404. Instead, it attempts to fetch the competition from Kaggle in the background and stores the result.
If the competition doesn’t exist on Kaggle:
First request returns status: "processing"
Background task fails to fetch from Kaggle API
Status updates to failed with error message in database
Next request automatically retries
Response (on the next request after a failed fetch):
{
"slug" : "non-existent-competition" ,
"title" : "Retrying..." ,
"status" : "processing" ,
"message" : "Retrying competition data fetch. Check back in 30-60 seconds."
}
The system will keep retrying on each request. If a competition consistently fails, it likely doesn’t exist on Kaggle or requires special credentials.
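Since a missing competition never produces a 404, a polling client needs its own stopping rule. The sketch below counts how many polled responses match the documented "Retrying..." body and gives up past a threshold. The threshold of 3 and both function names are assumptions for illustration.

```python
def is_retry_placeholder(data: dict) -> bool:
    """True when the body matches the documented 'Retrying...' response."""
    return data.get("status") == "processing" and data.get("title") == "Retrying..."


def should_give_up(responses: list[dict], max_retry_placeholders: int = 3) -> bool:
    """Stop polling once enough responses were retry placeholders.

    The default threshold is arbitrary; tune it to your tolerance.
    """
    seen = sum(1 for body in responses if is_retry_placeholder(body))
    return seen >= max_retry_placeholders
```

Append each parsed response body to a list while polling and check `should_give_up` before the next request.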
Processing timeout
Scenario: You request a competition, get processing status, but it never completes.
Possible causes:
Kaggle API is down or slow
Competition has many large notebooks (>10MB each)
Rate limiting from Kaggle side
Network connectivity issues
Recommended handling:
import time
import requests

api_key = "ki_abc123xyz..."
slug = "titanic"
max_retries = 10
retry_delay = 10  # seconds

for attempt in range(max_retries):
    resp = requests.get(
        f"https://api.kaggleingest.com/competitions/{slug}",
        headers={"X-API-Key": api_key},
    )
    data = resp.json()
    if data["status"] == "completed":
        print("Success!")
        break
    elif data["status"] == "processing":
        print(f"Attempt {attempt + 1}/{max_retries}: Still processing...")
        time.sleep(retry_delay)
    else:
        print(f"Unexpected status: {data['status']}")
        break
else:
    # Runs only when the loop ends without a break (every attempt was still processing)
    print("Timed out waiting for competition data")
Rate limiting
Status Code: 429 Too Many Requests
Cause: Exceeded the rate limit (default: 20 requests per minute)
Response:
{
"detail" : "Rate limit exceeded. Retry after 60 seconds."
}
Handling:
import requests
import time

def api_request_with_retry(url, headers, max_retries=3):
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers)
        if resp.status_code == 429:
            retry_after = int(resp.headers.get("Retry-After", 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
            continue
        return resp
    raise Exception("Max retries exceeded")

# Usage
resp = api_request_with_retry(
    "https://api.kaggleingest.com/competitions/titanic",
    {"X-API-Key": "ki_abc123xyz..."},
)
Rate limits are per IP address (using X-Forwarded-For header). If you’re behind a proxy or load balancer, ensure the header is set correctly.
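Rather than reacting to 429 responses, a client can pace itself to stay under the limit. The sketch below is a simple sliding-window throttle that uses the documented default of 20 requests per minute. It is a client-side illustration only (`RequestThrottle` is a hypothetical class), and the limit for your deployment may differ.

```python
import time
from collections import deque


class RequestThrottle:
    """Client-side sliding-window limiter (a sketch, not part of the API)."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests inside the window

    def wait_for_slot(self) -> float:
        """Block until a request may be sent; return the seconds waited."""
        now = self._clock()
        # Drop timestamps that have left the window
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        waited = 0.0
        if len(self._sent) >= self.max_requests:
            # Wait until the oldest request falls out of the window
            waited = self.window_seconds - (now - self._sent[0])
            self._sleep(waited)
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
        return waited
```

Call `throttle.wait_for_slot()` immediately before each request. The `clock` and `sleep` parameters exist so the class can be tested without real delays.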
Database errors
Connection pool exhausted
Status Code: 500 Internal Server Error
Cause: Too many concurrent database connections
Response:
{
"detail" : "Database pool not available in background task: ..."
}
Solution: This is a server-side issue. Contact support or wait a few minutes for the pool to recover.
Background task failures
If a background task fails during processing, the competition status is updated to failed with an error message stored in the database.
Next request behavior:
{
"slug" : "problematic-competition" ,
"title" : "Retrying..." ,
"status" : "processing" ,
"message" : "Retrying competition data fetch. Check back in 30-60 seconds."
}
The system automatically retries, but if failures persist:
Check if the competition exists on Kaggle
Verify Kaggle API credentials are valid (server-side)
Report the issue with the competition slug
Kaggle API errors
KaggleIngest depends on the Kaggle API. If Kaggle is down or returns errors:
class MetadataError(KaggleIngestionError):
    """Error fetching metadata."""
Causes:
Competition doesn’t exist
Private competition (requires team membership)
Kaggle API is temporarily unavailable
Notebook download errors
class NotebookDownloadError(KaggleIngestionError):
    """Error downloading notebooks."""
Causes:
Notebook was deleted or made private
Rate limiting from Kaggle
Network timeout
Behavior: The system skips failed notebooks and continues with the rest. Check the stats field (if available) for failed_notebooks.
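Because the `stats` field may be absent, it is safer to read it defensively. The sketch below returns whatever `failed_notebooks` holds, or `None` when it is missing. This guide does not specify whether the value is a count or a list, so the helper (a hypothetical name) makes no assumption about its type.

```python
from typing import Any, Optional


def get_failed_notebooks(data: dict[str, Any]) -> Optional[Any]:
    """Return stats.failed_notebooks if present, else None."""
    stats = data.get("stats")
    if not isinstance(stats, dict):
        return None  # no stats field in this response
    return stats.get("failed_notebooks")
```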
URL parse errors
class URLParseError(KaggleIngestionError):
    """Error parsing Kaggle URL."""
Note: The API uses slugs directly, so URL parsing errors are rare. This exception is primarily for internal use.
Best practices for error handling
Check status codes first
Always check HTTP status codes before parsing JSON:
if resp.status_code != 200:
    print(f"Error {resp.status_code}: {resp.json()['detail']}")
Handle processing state gracefully
Implement polling with exponential backoff:
retries = 0
while data["status"] == "processing" and retries < 10:
    time.sleep(min(10 * (2 ** retries), 60))  # Cap at 60 seconds
    data = fetch_competition(slug)
    retries += 1
Retry transient errors
Implement retry logic for 429, 500, 502, 503, and 504 errors:
transient_errors = [429, 500, 502, 503, 504]
if resp.status_code in transient_errors:
    pass  # Retry with exponential backoff here
Log errors with context
Include the competition slug, API key prefix, and timestamp:
logger.error(f"Failed to fetch {slug}: {resp.status_code} - {resp.text}")
Fail fast on auth errors
Don’t retry 401/403 errors. Fix the API key instead:
if resp.status_code in [401, 403]:
    raise Exception("Missing or invalid API key. Check your credentials.")
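The practices above can be combined into one request wrapper. The sketch below fails fast on 401/403, retries the transient status codes with capped exponential backoff, and logs the slug with only a short API key prefix. The names `fetch_with_policy` and `AuthError`, the logger name, and the default delays are all assumptions for illustration, not part of the API.

```python
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("kaggleingest_client")  # hypothetical logger name

AUTH_STATUS_CODES = {401, 403}
TRANSIENT_STATUS_CODES = {429, 500, 502, 503, 504}


class AuthError(Exception):
    """Raised for 401/403 responses, which retrying cannot fix (hypothetical)."""


def fetch_with_policy(send: Callable[[], Any], slug: str, api_key: str,
                      max_attempts: int = 5, base_delay: float = 1.0,
                      max_delay: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call send() up to max_attempts times and return the last response."""
    key_prefix = api_key[:6] + "..."  # never log the full key
    resp = None
    for attempt in range(max_attempts):
        resp = send()
        status = resp.status_code
        if status in AUTH_STATUS_CODES:
            # Fail fast: a retry would send the same bad credentials
            raise AuthError(f"Auth failed for key {key_prefix}: HTTP {status}")
        if status not in TRANSIENT_STATUS_CODES:
            return resp  # success, or an error that retrying will not fix
        logger.warning("Transient error fetching %s (key %s): HTTP %s, attempt %d/%d",
                       slug, key_prefix, status, attempt + 1, max_attempts)
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay
            sleep(min(base_delay * (2 ** attempt), max_delay))
    return resp  # still a transient error after the final attempt
```

With `requests`, a call might look like `fetch_with_policy(lambda: requests.get(url, headers=headers), slug, api_key)`. Timestamps come from the logging configuration, for example `logging.basicConfig(format="%(asctime)s %(levelname)s %(message)s")`.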
Next steps
API Reference: View the complete API documentation
Authentication: Learn about API key authentication