Data Science Fundamentals Bootcamp

Welcome to the Data Science Fundamentals Bootcamp! This comprehensive program takes you from programming basics through advanced machine learning and deep learning, covering all essential skills for a data science career.

Program Structure

The bootcamp is organized into 8 specialized modules (A1-A8), each building on previous knowledge to create a complete learning path. The program emphasizes practical, hands-on projects that mirror real-world data science workflows.
All modules include demos, exercises, and capstone projects to ensure you gain practical experience with each concept.

Module Breakdown

Module A1: Professional Profile and Methodology

Focus: Introduction to the data science profession and bootcamp methodology
  • Understanding the data science professional profile
  • Career paths and opportunities in data science
  • Bootcamp structure and learning methodology
  • Digital sobriety and responsible technology use
  • Setting up your professional development plan
Key Deliverables: Professional profile analysis, methodology understanding

Module A2: Python Fundamentals

Focus: Core Python programming for data science
  • Python installation and environment setup (venv/conda)
  • Python syntax, data types, and control structures
  • Functions, modules, and code organization
  • Object-oriented programming (OOP) concepts
  • File handling and JSON data persistence
  • Best practices: docstrings, naming conventions, code structure
Project: Contact Management System - A command-line application using OOP, data structures, and JSON persistence for managing client contacts with full CRUD operations and unit testing.
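The project's core pattern (class-based records with JSON persistence and CRUD operations) can be sketched as follows. The `Contact` and `ContactBook` names and the `contacts.json` file are illustrative, not the project's actual API:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Contact:
    name: str
    email: str
    phone: str

class ContactBook:
    """In-memory contact store with JSON persistence (CRUD subset)."""

    def __init__(self):
        self.contacts = {}

    def create(self, contact):
        self.contacts[contact.email] = contact

    def read(self, email):
        return self.contacts.get(email)

    def delete(self, email):
        return self.contacts.pop(email, None)

    def save(self, path):
        # Serialize every contact to a list of plain dicts.
        with open(path, "w", encoding="utf-8") as f:
            json.dump([asdict(c) for c in self.contacts.values()], f)

    def load(self, path):
        with open(path, encoding="utf-8") as f:
            self.contacts = {d["email"]: Contact(**d) for d in json.load(f)}

book = ContactBook()
book.create(Contact("Ana", "ana@example.com", "555-0101"))
book.save("contacts.json")

# Reload from disk to verify the round trip.
restored = ContactBook()
restored.load("contacts.json")
print(restored.read("ana@example.com"))
```

Keying the store by email keeps lookups O(1) and makes update/delete trivial; the full project layers unit tests and a command-line interface on top of this kind of core.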

Module A3: Data Preparation with NumPy and Pandas

Focus: Data manipulation and preparation fundamentals
  • NumPy arrays and vectorized operations
  • Pandas DataFrames and Series
  • Data loading from multiple sources (CSV, Excel, web scraping)
  • Data cleaning: handling missing values, duplicates, outliers
  • Data transformation and feature engineering
  • Data consolidation and export
Project: E-commerce Data Preparation Pipeline - Complete data preparation workflow consolidating data from multiple sources, cleaning, transforming, and preparing datasets for analysis with optional Streamlit dashboard.

Module A4: Exploratory Data Analysis (EDA)

Focus: Statistical analysis and data visualization for business decisions
  • Descriptive statistics and data profiling
  • Data visualization with Matplotlib and Seaborn
  • Distribution analysis and pattern detection
  • Correlation analysis and feature relationships
  • Interactive dashboards with Streamlit
  • Communicating insights through visualizations
Project: ComercioYA Sales Analysis - Complete EDA on historical e-commerce sales data to support business decisions, including statistical analysis, visualizations, and an interactive Streamlit dashboard.

Module A5: Statistical Inference

Focus: Probability theory and hypothesis testing for data-driven decisions
  • Probability foundations and distributions
  • Random variables and probability calculations
  • Normal, binomial, and Poisson distributions
  • Central Limit Theorem and sampling distributions
  • Confidence intervals for population parameters
  • Hypothesis testing (t-tests, proportion tests)
  • Type I and Type II errors
  • P-values and statistical significance
Project: University Student Health Habits Study - Statistical inference project analyzing relationships between sleep, nutrition, physical activity, and academic performance using simulated data, hypothesis testing, and confidence intervals.

Module A6: Supervised Machine Learning - Regression

Focus: Building predictive models for continuous outcomes
  • Linear regression fundamentals
  • Polynomial regression and feature engineering
  • Regularization techniques (Ridge, Lasso, ElasticNet)
  • Model evaluation metrics (MAE, MSE, RMSE, R²)
  • Cross-validation and hyperparameter tuning with GridSearchCV
  • Ensemble methods: Gradient Boosting
  • Model selection and comparison
  • Production deployment considerations
Project: E-commerce Sales Prediction - Regression model predicting total sales per order using customer, product, and logistics data to support personalized campaigns, inventory management, and revenue forecasting.

Module A7: Unsupervised Machine Learning - Clustering

Focus: Pattern discovery and customer segmentation
  • Dimensionality reduction: PCA (Principal Component Analysis)
  • Non-linear dimensionality reduction: t-SNE
  • K-Means clustering and the elbow method
  • DBSCAN for density-based clustering
  • Hierarchical clustering and dendrograms
  • Silhouette score and cluster evaluation
  • Business interpretation of clusters
  • Outlier detection and handling
Project: Retail Customer Segmentation - Intelligent customer segmentation system using unsupervised learning to identify behavioral patterns, enabling personalized marketing campaigns and customer retention strategies.

Module A8: Deep Learning with Keras and PyTorch

Focus: Neural networks and deep learning frameworks
  • Introduction to neural networks and deep learning
  • Building models with Keras (TensorFlow backend)
  • Building models with PyTorch
  • Convolutional Neural Networks (CNNs) for image data
  • Recurrent Neural Networks (RNNs) for sequential data
  • Model training, validation, and optimization
  • Transfer learning and pre-trained models
  • Comparing Keras and PyTorch workflows
Project: Deep Learning implementation using both Keras and PyTorch frameworks, comparing approaches and architectures for real-world data problems.
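Before reaching for either framework, it helps to see the loop they both automate. This is a framework-agnostic NumPy sketch of a tiny two-layer network learning XOR: forward pass, loss, backward pass, gradient step. Keras and PyTorch replace the hand-written gradients with autograd:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the classic problem a single linear layer cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer
lr = 0.5

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

def bce(p, y):
    p = np.clip(p, 1e-9, 1 - 1e-9)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

_, p0 = forward(X)
initial_loss = bce(p0, y)

for _ in range(5000):
    h, p = forward(X)
    dp = (p - y) / len(X)            # ∂loss/∂logit for sigmoid + cross-entropy
    dh = (dp @ W2.T) * (1 - h**2)    # backprop through tanh
    W2 -= lr * (h.T @ dp); b2 -= lr * dp.sum(0)
    W1 -= lr * (X.T @ dh); b1 -= lr * dh.sum(0)

_, p = forward(X)
final_loss = bce(p, y)
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

In Keras this whole loop collapses to `model.compile(...)` plus `model.fit(...)`; in PyTorch you still write the loop but call `loss.backward()` instead of deriving gradients by hand. That difference in explicitness is the main trade-off the module's framework comparison explores.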

Learning Approach

Hands-On Practice

Every module includes:
  • Live coding sessions with instructors
  • Demos showing real implementations
  • Exercises for immediate practice
  • Projects simulating real-world scenarios

Project-Based Learning

Each module culminates in a comprehensive project that:
  • Applies all concepts learned in the module
  • Uses real or realistic datasets
  • Follows industry best practices
  • Includes documentation and code quality standards
  • Can be included in your professional portfolio

Progressive Complexity

The curriculum is designed to build knowledge progressively:
  1. Foundation (A1-A2): Professional context and programming basics
  2. Data Skills (A3-A4): Data manipulation and exploration
  3. Statistical Thinking (A5): Inference and hypothesis testing
  4. Machine Learning (A6-A7): Supervised and unsupervised learning
  5. Advanced AI (A8): Deep learning with modern frameworks

Technical Stack

Throughout the bootcamp, you’ll work with:
  • Python 3.8+ - Core programming language
  • Jupyter Notebooks - Interactive development environment
  • NumPy - Numerical computing
  • Pandas - Data manipulation and analysis
  • Matplotlib & Seaborn - Data visualization
  • Scikit-learn - Machine learning algorithms
  • TensorFlow/Keras - Deep learning framework
  • PyTorch - Deep learning framework
  • Streamlit - Interactive dashboards

Prerequisites

This is an intensive bootcamp that requires dedication and consistent practice. Students should be prepared to:
  • Dedicate time daily to coding and exercises
  • Complete all module projects
  • Participate in live coding sessions
  • Ask questions and engage with the material
Recommended Background:
  • Basic computer literacy
  • Comfort with mathematical concepts (algebra, basic statistics)
  • Willingness to learn through hands-on practice
  • English reading comprehension (for technical documentation)
No prior programming experience is required; we start from the fundamentals!

Next Steps

Environment Setup

Set up your development environment with Python, virtual environments, and required libraries

Module A1

Start your journey with professional profile and bootcamp methodology
