Data Science Fundamentals Bootcamp
Welcome to the Data Science Fundamentals Bootcamp! This comprehensive program takes you from programming basics through advanced machine learning and deep learning, covering all essential skills for a data science career.
Program Structure
The bootcamp is organized into 8 specialized modules (A1-A8), each building on previous knowledge to create a complete learning path. The program emphasizes practical, hands-on projects that mirror real-world data science workflows.
All modules include demos, exercises, and capstone projects to ensure you gain practical experience with each concept.
Module Breakdown
Module A1: Professional Profile and Methodology
Focus: Introduction to the data science profession and bootcamp methodology
- Understanding the data science professional profile
- Career paths and opportunities in data science
- Bootcamp structure and learning methodology
- Digital sobriety and responsible technology use
- Setting up your professional development plan
Module A2: Python Fundamentals
Focus: Core Python programming for data science
- Python installation and environment setup (venv/conda)
- Python syntax, data types, and control structures
- Functions, modules, and code organization
- Object-oriented programming (OOP) concepts
- File handling and JSON data persistence
- Best practices: docstrings, naming conventions, code structure
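As a rough illustration of how several of these topics fit together (a class, docstrings, and JSON persistence), here is a minimal sketch using only the standard library. The `Student` class, its fields, and the helper functions are hypothetical names chosen for this example; they are not taken from the bootcamp materials.

```python
import json
import tempfile
from pathlib import Path


class Student:
    """A bootcamp participant (hypothetical example class)."""

    def __init__(self, name, completed_modules=None):
        self.name = name
        self.completed_modules = list(completed_modules or [])

    def complete(self, module):
        """Record a finished module, ignoring duplicates."""
        if module not in self.completed_modules:
            self.completed_modules.append(module)

    def to_dict(self):
        """Return a JSON-serializable representation."""
        return {
            "name": self.name,
            "completed_modules": list(self.completed_modules),
        }

    @classmethod
    def from_dict(cls, data):
        """Rebuild a Student from the dictionary produced by to_dict()."""
        return cls(data["name"], data["completed_modules"])


def save_student(student, path):
    """Write the student to a JSON file."""
    text = json.dumps(student.to_dict(), indent=2)
    Path(path).write_text(text, encoding="utf-8")


def load_student(path):
    """Read a student back from a JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return Student.from_dict(data)


# Round trip through a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    original = Student("Ana", ["A1"])
    original.complete("A2")
    save_student(original, Path(folder) / "student.json")
    restored = load_student(Path(folder) / "student.json")
    print(restored.name, restored.completed_modules)
```

Keeping serialization in `to_dict` and `from_dict` separates the file format from the class itself, which is one common way to organize this kind of code.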
Module A3: Data Preparation with NumPy and Pandas
Focus: Data manipulation and preparation fundamentals
- NumPy arrays and vectorized operations
- Pandas DataFrames and Series
- Data loading from multiple sources (CSV, Excel, web scraping)
- Data cleaning: handling missing values, duplicates, outliers
- Data transformation and feature engineering
- Data consolidation and export
Module A4: Exploratory Data Analysis (EDA)
Focus: Statistical analysis and data visualization for business decisions
- Descriptive statistics and data profiling
- Data visualization with Matplotlib and Seaborn
- Distribution analysis and pattern detection
- Correlation analysis and feature relationships
- Interactive dashboards with Streamlit
- Communicating insights through visualizations
Module A5: Statistical Inference
Focus: Probability theory and hypothesis testing for data-driven decisions
- Probability foundations and distributions
- Random variables and probability calculations
- Normal, binomial, and Poisson distributions
- Central Limit Theorem and sampling distributions
- Confidence intervals for population parameters
- Hypothesis testing (t-tests, proportion tests)
- Type I and Type II errors
- P-values and statistical significance
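To give a flavor of the confidence-interval topic listed above, the following sketch computes an approximate interval for a population mean with Python's standard `statistics` module. It assumes the normal (z) approximation, which is only reasonable for larger samples; the function name and the sample values are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Uses the normal (z) approximation. For small samples, a
    t-based interval would be somewhat wider.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    center = mean(sample)
    # Standard error of the mean, from the sample standard deviation.
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin


# Made-up measurements, for illustration only.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Raising the `confidence` argument widens the interval, which reflects the trade-off between certainty and precision.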
Module A6: Supervised Machine Learning - Regression
Focus: Building predictive models for continuous outcomes
- Linear regression fundamentals
- Polynomial regression and feature engineering
- Regularization techniques (Ridge, Lasso, ElasticNet)
- Model evaluation metrics (MAE, MSE, RMSE, R²)
- Cross-validation and hyperparameter tuning with GridSearchCV
- Ensemble methods: Gradient Boosting
- Model selection and comparison
- Production deployment considerations
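The evaluation metrics named in this module (MAE, MSE, RMSE, R²) can be written out in a few lines of plain Python, which may help make their definitions concrete. This is only a sketch with invented numbers; in practice scikit-learn's `sklearn.metrics` module provides equivalent functions. The `regression_metrics` helper is a hypothetical name.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² for paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = sqrt(mse)                        # same units as the target
    mean_true = sum(y_true) / n
    total_ss = sum((t - mean_true) ** 2 for t in y_true)
    residual_ss = sum(e * e for e in errors)
    # R² is undefined when every target value is identical.
    r2 = 1 - residual_ss / total_ss if total_ss else float("nan")
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Invented targets and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MAE and RMSE are expressed in the units of the target, while R² is unitless and compares the model against always predicting the mean.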
Module A7: Unsupervised Machine Learning - Clustering
Focus: Pattern discovery and customer segmentation
- Dimensionality reduction: PCA (Principal Component Analysis)
- Non-linear dimensionality reduction: t-SNE
- K-Means clustering and the elbow method
- DBSCAN for density-based clustering
- Hierarchical clustering and dendrograms
- Silhouette score and cluster evaluation
- Business interpretation of clusters
- Outlier detection and handling
Module A8: Deep Learning with Keras and PyTorch
Focus: Neural networks and deep learning frameworks
- Introduction to neural networks and deep learning
- Building models with Keras (TensorFlow backend)
- Building models with PyTorch
- Convolutional Neural Networks (CNNs) for image data
- Recurrent Neural Networks (RNNs) for sequential data
- Model training, validation, and optimization
- Transfer learning and pre-trained models
- Comparing Keras and PyTorch workflows
Learning Approach
Hands-On Practice
Every module includes:
- Live coding sessions with instructors
- Demos showing real implementations
- Exercises for immediate practice
- Projects simulating real-world scenarios
Project-Based Learning
Each module culminates in a comprehensive project that:
- Applies all concepts learned in the module
- Uses real or realistic datasets
- Follows industry best practices
- Includes documentation and code quality standards
- Can be included in your professional portfolio
Progressive Complexity
The curriculum is designed to build knowledge progressively:
- Foundation (A1-A2): Professional context and programming basics
- Data Skills (A3-A4): Data manipulation and exploration
- Statistical Thinking (A5): Inference and hypothesis testing
- Machine Learning (A6-A7): Supervised and unsupervised learning
- Advanced AI (A8): Deep learning with modern frameworks
Technical Stack
Throughout the bootcamp, you’ll work with:
- Python 3.8+ - Core programming language
- Jupyter Notebooks - Interactive development environment
- NumPy - Numerical computing
- Pandas - Data manipulation and analysis
- Matplotlib & Seaborn - Data visualization
- Scikit-learn - Machine learning algorithms
- TensorFlow/Keras - Deep learning framework
- PyTorch - Deep learning framework
- Streamlit - Interactive dashboards
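A quick way to see which parts of this stack are already available on your machine is a small check script like the sketch below. It relies only on the standard library; the `PACKAGES` mapping and the `check_stack` helper are hypothetical names for this example, and the import names listed are the usual ones for each library (for instance, Scikit-learn is imported as `sklearn`).

```python
import importlib.util
import sys

# Display name -> import name for each library in the stack.
PACKAGES = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
    "Scikit-learn": "sklearn",
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
    "Streamlit": "streamlit",
}


def check_stack(packages=PACKAGES):
    """Return {display name: True/False} without importing anything."""
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in packages.items()
    }


# The stack above lists Python 3.8 as the minimum version.
python_ok = sys.version_info >= (3, 8)
print(f"Python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}")

for label, installed in check_stack().items():
    print(f"{label}: {'installed' if installed else 'missing'}")
```

Because `find_spec` only locates a package rather than importing it, the script runs quickly even when heavy libraries such as TensorFlow are installed.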
Prerequisites
Recommended Background:
- Basic computer literacy
- Comfort with mathematical concepts (algebra, basic statistics)
- Willingness to learn through hands-on practice
- English reading comprehension (for technical documentation)
Next Steps
Environment Setup
Set up your development environment with Python, virtual environments, and required libraries
Module A1
Start your journey with professional profile and bootcamp methodology