Data Science Fundamentals Bootcamp

Welcome to the Data Science Fundamentals Bootcamp! This comprehensive program takes you from programming basics through advanced machine learning and deep learning, covering all essential skills for a data science career.

Program Structure

The bootcamp is organized into 8 specialized modules (A1-A8), each building on previous knowledge to create a complete learning path. The program emphasizes practical, hands-on projects that mirror real-world data science workflows.
All modules include demos, exercises, and capstone projects to ensure you gain practical experience with each concept.

Module Breakdown

Module A1: Professional Profile and Methodology

Focus: Introduction to the data science profession and bootcamp methodology
  • Understanding the data science professional profile
  • Career paths and opportunities in data science
  • Bootcamp structure and learning methodology
  • Digital sobriety and responsible technology use
  • Setting up your professional development plan
Key Deliverables: Professional profile analysis, methodology understanding

Module A2: Python Fundamentals

Focus: Core Python programming for data science
  • Python installation and environment setup (venv/conda)
  • Python syntax, data types, and control structures
  • Functions, modules, and code organization
  • Object-oriented programming (OOP) concepts
  • File handling and JSON data persistence
  • Best practices: docstrings, naming conventions, code structure
Project: Contact Management System - A command-line application using OOP, data structures, and JSON persistence for managing client contacts with full CRUD operations and unit testing.
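The project's core pattern (class-based records with JSON persistence and CRUD operations) can be sketched as follows. The `Contact` and `ContactBook` names and the `contacts.json` file are illustrative, not the project's actual API:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Contact:
    name: str
    email: str
    phone: str

class ContactBook:
    """In-memory contact store with JSON persistence (CRUD subset)."""

    def __init__(self):
        self.contacts = {}

    def create(self, contact):
        self.contacts[contact.email] = contact

    def read(self, email):
        return self.contacts.get(email)

    def delete(self, email):
        return self.contacts.pop(email, None)

    def save(self, path):
        # Serialize every contact to a list of plain dicts.
        with open(path, "w", encoding="utf-8") as f:
            json.dump([asdict(c) for c in self.contacts.values()], f)

    def load(self, path):
        with open(path, encoding="utf-8") as f:
            self.contacts = {d["email"]: Contact(**d) for d in json.load(f)}

book = ContactBook()
book.create(Contact("Ana", "ana@example.com", "555-0101"))
book.save("contacts.json")

# Reload from disk to verify the round trip.
restored = ContactBook()
restored.load("contacts.json")
print(restored.read("ana@example.com"))
```

Keying the store by email keeps lookups O(1) and makes update/delete trivial; the full project layers unit tests and a command-line interface on top of this kind of core.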

Module A3: Data Preparation with NumPy and Pandas

Focus: Data manipulation and preparation fundamentals
  • NumPy arrays and vectorized operations
  • Pandas DataFrames and Series
  • Data loading from multiple sources (CSV, Excel, web scraping)
  • Data cleaning: handling missing values, duplicates, outliers
  • Data transformation and feature engineering
  • Data consolidation and export
Project: E-commerce Data Preparation Pipeline - Complete data preparation workflow consolidating data from multiple sources, cleaning, transforming, and preparing datasets for analysis with optional Streamlit dashboard.

Module A4: Exploratory Data Analysis (EDA)

Focus: Statistical analysis and data visualization for business decisions
  • Descriptive statistics and data profiling
  • Data visualization with Matplotlib and Seaborn
  • Distribution analysis and pattern detection
  • Correlation analysis and feature relationships
  • Interactive dashboards with Streamlit
  • Communicating insights through visualizations
Project: ComercioYA Sales Analysis - Complete EDA on historical e-commerce sales data to support business decisions, including statistical analysis, visualizations, and an interactive Streamlit dashboard.

Module A5: Statistical Inference

Focus: Probability theory and hypothesis testing for data-driven decisions
  • Probability foundations and distributions
  • Random variables and probability calculations
  • Normal, binomial, and Poisson distributions
  • Central Limit Theorem and sampling distributions
  • Confidence intervals for population parameters
  • Hypothesis testing (t-tests, proportion tests)
  • Type I and Type II errors
  • P-values and statistical significance
Project: University Student Health Habits Study - Statistical inference project analyzing relationships between sleep, nutrition, physical activity, and academic performance using simulated data, hypothesis testing, and confidence intervals.

Module A6: Supervised Machine Learning - Regression

Focus: Building predictive models for continuous outcomes
  • Linear regression fundamentals
  • Polynomial regression and feature engineering
  • Regularization techniques (Ridge, Lasso, ElasticNet)
  • Model evaluation metrics (MAE, MSE, RMSE, R²)
  • Cross-validation and hyperparameter tuning with GridSearchCV
  • Ensemble methods: Gradient Boosting
  • Model selection and comparison
  • Production deployment considerations
Project: E-commerce Sales Prediction - Regression model predicting total sales per order using customer, product, and logistics data to support personalized campaigns, inventory management, and revenue forecasting.

Module A7: Unsupervised Machine Learning - Clustering

Focus: Pattern discovery and customer segmentation
  • Dimensionality reduction: PCA (Principal Component Analysis)
  • Non-linear dimensionality reduction: t-SNE
  • K-Means clustering and the elbow method
  • DBSCAN for density-based clustering
  • Hierarchical clustering and dendrograms
  • Silhouette score and cluster evaluation
  • Business interpretation of clusters
  • Outlier detection and handling
Project: Retail Customer Segmentation - Intelligent customer segmentation system using unsupervised learning to identify behavioral patterns, enabling personalized marketing campaigns and customer retention strategies.

Module A8: Deep Learning with Keras and PyTorch

Focus: Neural networks and deep learning frameworks
  • Introduction to neural networks and deep learning
  • Building models with Keras (TensorFlow backend)
  • Building models with PyTorch
  • Convolutional Neural Networks (CNNs) for image data
  • Recurrent Neural Networks (RNNs) for sequential data
  • Model training, validation, and optimization
  • Transfer learning and pre-trained models
  • Comparing Keras and PyTorch workflows
Project: Deep Learning implementation using both Keras and PyTorch frameworks, comparing approaches and architectures for real-world data problems.
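Before reaching for either framework, it helps to see the loop they both automate. This is a framework-agnostic NumPy sketch of a tiny two-layer network learning XOR: forward pass, loss, backward pass, gradient step. Keras and PyTorch replace the hand-written gradients with autograd:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the classic problem a single linear layer cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer
lr = 0.5

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

def bce(p, y):
    p = np.clip(p, 1e-9, 1 - 1e-9)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

_, p0 = forward(X)
initial_loss = bce(p0, y)

for _ in range(5000):
    h, p = forward(X)
    dp = (p - y) / len(X)            # ∂loss/∂logit for sigmoid + cross-entropy
    dh = (dp @ W2.T) * (1 - h**2)    # backprop through tanh
    W2 -= lr * (h.T @ dp); b2 -= lr * dp.sum(0)
    W1 -= lr * (X.T @ dh); b1 -= lr * dh.sum(0)

_, p = forward(X)
final_loss = bce(p, y)
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

In Keras this whole loop collapses to `model.compile(...)` plus `model.fit(...)`; in PyTorch you still write the loop but call `loss.backward()` instead of deriving gradients by hand. That difference in explicitness is the main trade-off the module's framework comparison explores.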

Learning Approach

Hands-On Practice

Every module includes:
  • Live coding sessions with instructors
  • Demos showing real implementations
  • Exercises for immediate practice
  • Projects simulating real-world scenarios

Project-Based Learning

Each module culminates in a comprehensive project that:
  • Applies all concepts learned in the module
  • Uses real or realistic datasets
  • Follows industry best practices
  • Includes documentation and code quality standards
  • Can be included in your professional portfolio

Progressive Complexity

The curriculum is designed to build knowledge progressively:
  1. Foundation (A1-A2): Professional context and programming basics
  2. Data Skills (A3-A4): Data manipulation and exploration
  3. Statistical Thinking (A5): Inference and hypothesis testing
  4. Machine Learning (A6-A7): Supervised and unsupervised learning
  5. Advanced AI (A8): Deep learning with modern frameworks

Technical Stack

Throughout the bootcamp, you’ll work with:
  • Python 3.8+ - Core programming language
  • Jupyter Notebooks - Interactive development environment
  • NumPy - Numerical computing
  • Pandas - Data manipulation and analysis
  • Matplotlib & Seaborn - Data visualization
  • Scikit-learn - Machine learning algorithms
  • TensorFlow/Keras - Deep learning framework
  • PyTorch - Deep learning framework
  • Streamlit - Interactive dashboards

Prerequisites

This is an intensive bootcamp that requires dedication and consistent practice. Students should be prepared to:
  • Dedicate time daily to coding and exercises
  • Complete all module projects
  • Participate in live coding sessions
  • Ask questions and engage with the material
Recommended Background:
  • Basic computer literacy
  • Comfort with mathematical concepts (algebra, basic statistics)
  • Willingness to learn through hands-on practice
  • English reading comprehension (for technical documentation)
No prior programming experience is required; we start from the fundamentals!

Next Steps

Environment Setup

Set up your development environment with Python, virtual environments, and required libraries

Module A1

Start your journey with professional profile and bootcamp methodology
