DSBA Refresher HTML Index
Date: December 21, 2025
Author: P Baburaj Ambalam
Version: 2.0
Comprehensive Q&A refreshers with examples and quizzes
Suggested Learning Path:
Foundation: Parametric vs Non-Parametric Models →
Tree-Based: Decision Trees → Random Forest → Boosting →
Engineering: Feature Engineering → ML Pipeline →
Clustering: K-means → Hierarchical/PCA
Tree-Based Models
- Decision Trees (Beginner): Non-parametric models using recursive binary splits; interpretable but prone to overfitting
- Decision Tree Data Analysis Guide (Hands-On Tutorial): Step-by-step practical guide covering data loading, cleaning, feature importance, pruning, hyperparameter tuning, and cross-validation
- Bagging and Random Forests (Intermediate): Bootstrap aggregating (bagging) reduces variance by averaging models trained on resampled data; Random Forests add per-split feature subsampling to decorrelate the trees
- Boosting (Intermediate): Weak learners are trained sequentially, each correcting the errors of its predecessors; powerful but sensitive to noise and hyperparameters
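To make the Decision Trees entry above concrete, here is a minimal pure-Python sketch of recursive binary splitting using Gini impurity. It is an illustration under stated assumptions, not the implementation used on the linked pages: the helper names (gini, best_split, build_tree, predict, accuracy) and the six-point toy dataset are hypothetical, and candidate thresholds are assumed to be midpoints between consecutive distinct feature values.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) or None if no split exists."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Assumed candidate thresholds: midpoints between distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

def build_tree(rows, labels, max_depth=None, depth=0):
    """Recursively split until nodes are pure or max_depth is reached."""
    majority = Counter(labels).most_common(1)[0][0]
    if gini(labels) == 0.0 or (max_depth is not None and depth >= max_depth):
        return majority  # leaf: predict the majority class
    split = best_split(rows, labels)
    if split is None:
        return majority
    j, t, _ = split
    left = [i for i, r in enumerate(rows) if r[j] <= t]
    right = [i for i, r in enumerate(rows) if r[j] > t]
    return (
        j,
        t,
        build_tree([rows[i] for i in left], [labels[i] for i in left], max_depth, depth + 1),
        build_tree([rows[i] for i in right], [labels[i] for i in right], max_depth, depth + 1),
    )

def predict(tree, row):
    """Walk internal nodes (tuples) down to a leaf label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

# Hypothetical toy data: one feature, with an out-of-pattern "a" at x = 4.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = ["a", "a", "b", "a", "b", "b"]

full = build_tree(X, y)                # grown until every leaf is pure
stump = build_tree(X, y, max_depth=1)  # a single split

print("full tree training accuracy:", accuracy(full, X, y))
print("depth-1 tree training accuracy:", accuracy(stump, X, y))
```

On this toy data the unrestricted tree keeps splitting until it reproduces every training label, including the out-of-pattern point, while the depth-limited tree does not. That memorization is the overfitting tendency the entry mentions, and it is why depth limits and pruning appear in the hands-on guide.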
🔧 ML Engineering & Pipelines
🔍 Clustering & Dimensionality Reduction
About this collection: The collection starts with foundational concepts (parametric vs non-parametric models) and then covers specific techniques. Each page includes technique descriptions at four levels (10-year-old, beginner, intermediate, expert), Q&A pairs, Python examples, a 15-question quiz with answers, practical checklists, common implementation errors, and authoritative references.