The `toon_content` field contains the full competition context in a structured text format optimized for LLM consumption:
````text
# COMPETITION: Spaceship Titanic

## Overview
Title: Spaceship Titanic
Category: Featured
Prize: $25,000
Evaluation: Accuracy
Teams: 2500

## Description
Predict which passengers are transported to an alternate dimension...

## Dataset Schema
File: train.csv
  - PassengerId (string)
  - HomePlanet (string)
  - CryoSleep (boolean)
  - Cabin (string)
  - Destination (string)
  - Age (float)
  - VIP (boolean)
  - Transported (boolean)

## Top Notebooks

### Notebook 1: Comprehensive EDA + Modeling (Author: topkaggler)
Upvotes: 450

#### Markdown Insights:
- Feature engineering is crucial for this competition
- CryoSleep correlates strongly with Transported
- Group bookings (shared cabin prefix) have similar outcomes

#### Code Highlights:
```python
# Feature engineering
df['CabinDeck'] = df['Cabin'].str.split('/').str[0]
df['CabinNum'] = df['Cabin'].str.split('/').str[1].astype(float)
```
````
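Content like this can be assembled from structured metadata. A minimal sketch of a renderer for the schema section (the function name and input shape are assumptions for illustration, not part of KaggleIngest):

```python
def render_schema(filename: str, columns: dict) -> str:
    """Render a 'Dataset Schema' section in the toon_content style.

    Hypothetical helper: takes a filename and a column-name -> type mapping.
    """
    lines = ["## Dataset Schema", f"File: {filename}"]
    lines += [f"  - {name} ({dtype})" for name, dtype in columns.items()]
    return "\n".join(lines)

section = render_schema("train.csv", {"PassengerId": "string", "CryoSleep": "boolean"})
print(section)
```

The same pattern extends naturally to the overview and notebook sections: each is a heading plus a few key-value or bullet lines, which keeps the output compact and predictable for an LLM.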
…
## Status states in detail

### 1. Processing

Returned when:
- First request for a new competition
- Cached data is expired (>3 days old)
- Previous fetch failed and is being retried

**Response:**
```json
{
  "slug": "competition-name",
  "title": "Fetching..." | "Updating cache..." | "Retrying...",
  "status": "processing",
  "message": "Fetching competition data. Refresh in 30-60 seconds."
}
```
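A caller that receives a `processing` response is expected to check back after a short delay. A minimal client-side polling sketch (the `fetch` callable stands in for a real HTTP GET against the service, and the attempt count is an assumption; a real client would sleep 30-60 seconds between calls, as the message suggests):

```python
def poll(fetch, attempts=5):
    """Call fetch() until the status leaves 'processing' or attempts run out.

    `fetch` is a hypothetical zero-argument callable returning the parsed
    JSON response shown above.
    """
    for _ in range(attempts):
        resp = fetch()
        if resp["status"] != "processing":
            return resp
    return resp  # still processing after all attempts; caller decides what to do

# Simulated server: two 'processing' responses, then the completed record.
responses = iter([
    {"slug": "spaceship-titanic", "status": "processing"},
    {"slug": "spaceship-titanic", "status": "processing"},
    {"slug": "spaceship-titanic", "status": "completed"},
])
result = poll(lambda: next(responses))
print(result["status"])  # -> completed
```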
KaggleIngest uses per-slug locks to prevent duplicate background fetches:
```python
import asyncio
from typing import Dict

INGESTION_LOCKS: Dict[str, asyncio.Lock] = {}

# Create the lock on first use; indexing directly would raise KeyError for a new slug.
async with INGESTION_LOCKS.setdefault(slug, asyncio.Lock()):
    # Only one task per slug can execute at a time.
    # Double-check status inside the lock to avoid redundant work.
    ...
```
If multiple requests arrive simultaneously for the same competition:
1. The first request inserts a `processing` record and starts the background fetch
2. Subsequent requests see the `processing` status and return immediately
3. All requests are ultimately served by the result of that single background task
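The steps above can be sketched end to end. This is a simplified model, not the actual implementation: an in-memory dict stands in for the real datastore, and the fetch is awaited inline under the lock rather than dispatched as a background task, so the whole flow fits in one runnable snippet:

```python
import asyncio
from typing import Dict

INGESTION_LOCKS: Dict[str, asyncio.Lock] = {}
CACHE: Dict[str, dict] = {}   # hypothetical in-memory stand-in for the datastore
fetch_count = 0               # counts real fetches, to show deduplication

async def fetch_competition(slug: str) -> dict:
    """Hypothetical stand-in for the real Kaggle fetch."""
    global fetch_count
    fetch_count += 1
    await asyncio.sleep(0.05)  # simulate network latency
    return {"slug": slug, "status": "completed"}

async def ingest(slug: str) -> dict:
    record = CACHE.get(slug)
    if record is not None:
        # Completed -> serve from cache; processing -> return immediately.
        return record
    async with INGESTION_LOCKS.setdefault(slug, asyncio.Lock()):
        # Double-check inside the lock: another task may have finished first.
        if slug in CACHE:
            return CACHE[slug]
        CACHE[slug] = {"slug": slug, "status": "processing"}
        CACHE[slug] = await fetch_competition(slug)
        return CACHE[slug]

async def main() -> None:
    # Five simultaneous requests for the same competition slug.
    results = await asyncio.gather(*(ingest("spaceship-titanic") for _ in range(5)))
    print(fetch_count)  # -> 1 (only one real fetch ran)
    print([r["status"] for r in results])  # one 'completed', the rest 'processing'

asyncio.run(main())
```

The double-check inside the lock is what makes the pattern safe: a task that waited on the lock re-reads the cache before fetching, so it never repeats work a concurrent task already finished.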