Overview
The Coding Practice feature provides LeetCode and StrataScratch-style Data Science coding challenges. Practice SQL and Pandas problems with realistic datasets, get instant AI evaluation, and learn from model answers in both languages.Getting Started
Select Difficulty Level
Choose from Easy, Medium, or Hard problems. Start with Easy to build fundamentals before tackling harder challenges.
Pick Question Count
Generate 1-5 problems per practice session. For focused learning, stick to 2-3 problems at a time.
Read Problem Statement
Each problem includes:
- Clear business problem description
- Table schemas with column names and types
- Sample data (4-5 rows per table)
- Expected output columns
Choose Your Language
Solve using either SQL or Python (Pandas). You can switch languages and compare approaches.
Question Generation
Problems are dynamically generated by AI to ensure variety:What Makes Questions Realistic
Based on Real Interview Platforms:- Mirrors LeetCode Database problems
- Inspired by StrataScratch data science challenges
- Reflects actual technical screening questions from top tech companies
Concepts Tested
SQL Skills:- Window Functions: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
- Joins: INNER, LEFT, RIGHT, FULL OUTER, CROSS, SELF
- Aggregations: GROUP BY, HAVING, COUNT(), SUM(), AVG(), MAX(), MIN()
- CTEs: WITH clauses for complex multi-step queries
- Subqueries: Correlated and non-correlated
- String/Date Functions: SUBSTRING(), CONCAT(), DATE_DIFF(), etc.
- Data Manipulation: groupby(), merge(), apply(), transform()
- Filtering: loc[], iloc[], boolean indexing, query()
- Aggregation: agg(), pivot_table(), crosstab()
- Window Operations: rolling(), expanding(), rank()
- String Operations: str.contains(), str.split(), str.replace()
- DateTime: to_datetime(), dt accessor methods
Problems are designed to be solvable in both SQL and Pandas. This helps you understand how database operations translate to DataFrame manipulations.
Difficulty Levels
Easy Problems
Concepts: Basic SELECT, simple WHERE clauses, single aggregations Example: “Find all employees with salary > 50000. Return name and salary.” SQL: Simple SELECT with WHERE Pandas:df[df['salary'] > 50000][['name', 'salary']]
Medium Problems
Concepts: JOINs, GROUP BY with HAVING, window functions, CTEs Example: “Find departments with average salary > 75000. Return department name and average salary, sorted by avg salary descending.” SQL: JOIN + GROUP BY + HAVING + ORDER BY Pandas: merge() + groupby() + filter() + sort_values()Hard Problems
Concepts: Complex window functions, self-joins, multiple CTEs, advanced aggregations Example: “Find employees whose salary is in the top 10% within their department. For each, show their salary percentile rank.” SQL: Window functions with PERCENT_RANK(), partitioning, CTEs Pandas: groupby() + transform() + quantile() + complex filteringAI Evaluation System
Your code is evaluated by Mistral AI against comprehensive test cases:Evaluation Criteria
1. Correctness (60%)- Does the query return the correct results?
- Are edge cases handled (nulls, duplicates, empty results)?
- Is the output structure correct (column names, types)?
- Is the solution efficient?
- Are best practices followed (proper JOINs vs subqueries, avoiding SELECT *)?
- Is the code readable and well-structured?
- Null values handled appropriately?
- Duplicate records managed correctly (DISTINCT, RANK vs DENSE_RANK)?
- Empty table scenarios considered?
Test Cases
Each submission is tested against:Scoring
- Passed: All test cases passed (score 85-100)
- Partial: Some test cases passed (score 40-84)
- Failed: Most/all test cases failed (score 0-39)
Detailed Feedback
Every submission receives:1. Pass/Fail Status
Clear verdict: ”✅ Passed all test cases” or ”❌ Failed 2/3 test cases”2. Specific Feedback (2-3 sentences)
Not generic - tells you EXACTLY what’s wrong: Good Feedback Examples:- “You forgot to handle ties by using DENSE_RANK() instead of RANK(). This causes gaps in ranking when salaries are equal.”
- “Your Pandas solution doesn’t handle null department_ids. Use dropna() or fillna() before the merge.”
- “The query is correct but inefficient. Use a window function instead of a correlated subquery for better performance.”
- Generic: “Try harder”
- Vague: “Something is wrong with your query”
- Unhelpful: “Read the problem again”
3. Test Case Breakdown
See which specific cases passed/failed:4. Model Answers (Both Languages!)
Regardless of which language you used, you get optimal solutions in BOTH SQL and Pandas: SQL Model Answer:Studying both solutions helps you understand the conceptual mapping between SQL operations and Pandas methods. This is crucial for data engineering roles.
Example Workflow
Generate Problems
Request 3 medium difficulty problems. System generates:
- “Most Profitable Companies” (aggregation + ORDER BY)
- “Employee Salary Ranking” (window functions)
- “Department Average Above Company Average” (subqueries + JOIN)
Attempt Problem 1 (SQL)
Problem: Find 3 most profitable companiesYour SQL:Result: ✅ Passed (Score: 92/100)Feedback: “Correct solution! Minor note: consider using TOP(3) for SQL Server compatibility instead of LIMIT.”
Attempt Problem 2 (SQL)
Problem: Rank employees by salary within each departmentYour SQL (first attempt):Result: ❌ Failed 1/3 tests (Score: 58/100)Feedback: “You used RANK() which creates gaps for ties (1, 2, 2, 4). The problem expects DENSE_RANK() for consecutive ranking (1, 2, 2, 3).”
Practice Strategies
For SQL Mastery
Beginner Level (Weeks 1-2)- Focus on: SELECT, WHERE, ORDER BY, basic JOINs
- Do 5 easy problems daily
- Goal: 90%+ pass rate on easy problems
- Focus on: Window functions, GROUP BY + HAVING, CTEs
- Do 3 medium problems daily
- Compare your solution to model answer efficiency
- Goal: Understand RANK vs DENSE_RANK vs ROW_NUMBER
- Focus on: Complex window functions, self-joins, query optimization
- Do 2 hard problems daily
- Try solving same problem multiple ways
- Goal: Write production-quality, optimized queries
For Pandas Mastery
Beginner Level- Master: boolean indexing, loc/iloc, basic groupby()
- Translate easy SQL problems to Pandas
- Goal: Understand DataFrame as in-memory SQL table
- Master: merge operations, transform(), apply()
- Practice: multi-level groupby, pivot tables
- Goal: Solve without looking at SQL first
- Master: Custom aggregations, method chaining, performance optimization
- Challenge: Solve hard SQL problems in Pandas
- Goal: Know when Pandas is better than SQL (and vice versa)
Cross-Training Strategy
Workflow:- Solve in your primary language (SQL or Pandas)
- Get it passing all test cases
- Look at model answer in OTHER language
- Re-solve using the other language
- Compare execution logic
SELECT cols FROM table WHERE condition↔df[condition][cols]GROUP BY col HAVING aggregate > X↔df.groupby('col').filter(lambda x: x.agg > X)JOIN table2 ON table1.id = table2.id↔df1.merge(df2, on='id')RANK() OVER (PARTITION BY col ORDER BY val)↔df.groupby('col')['val'].rank()
Common Mistakes
SQL Mistakes
1. RANK vs DENSE_RANKPandas Mistakes
1. Chained Indexing (SettingWithCopyWarning)Technical Details
AI Model
- Engine: Mistral Large (via Mistral API)
- Evaluation: Simulated execution with test case generation
- Timeout: 30 seconds per evaluation
- Fallback: If API fails, provides basic syntax feedback
Code Execution
Important: The system does NOT execute your code directly. Instead:- AI analyzes your code syntax and logic
- Simulates execution against generated test cases
- Predicts output based on code understanding
- Compares predicted output to expected results
- Very novel syntax might confuse the AI
- Stick to standard SQL/Pandas idioms for best results
Problem Generation
- Prompt Engineering: Detailed prompts ensure realistic schemas and sample data
- Validation: Generated problems checked for valid JSON structure
- Fallback Problems: Pre-written problems used if generation fails
- Variety: Session seed ensures different problems each time
Performance
- Question generation: ~3-5 seconds for 3 problems
- Code evaluation: ~2-4 seconds per submission
- Model answer generation: Included in evaluation response