Machine Learning Course Projects
Practice repository covering supervised learning, classification, regression, K-Means, PCA, TensorFlow/Keras neural networks, preprocessing, train/test splitting, and evaluation metrics.
Hands-on machine learning projects built while working through a structured ML curriculum. The focus is on understanding why each technique works, not just calling .fit().
Topics covered
Supervised learning
- Linear regression — closed-form solution vs gradient descent, cost function minimization
- Logistic regression — sigmoid function, binary cross-entropy loss, decision boundary
- Classification — multi-class with softmax, one-vs-rest
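The closed-form and gradient-descent routes to linear regression can be compared directly. The following is a minimal sketch on synthetic data; the data, learning rate, and iteration count are illustrative assumptions, not values taken from the projects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative): y = 3 + 2*x + small Gaussian noise
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 + 2.0 * X[:, 0] + rng.normal(0, 0.1, size=200)

# Prepend a column of ones so the intercept is learned as a weight
Xb = np.column_stack([np.ones(len(X)), X])

# Closed-form least squares (lstsq is more stable than inverting X^T X)
w_closed, *_ = np.linalg.lstsq(Xb, y, rcond=None)


def gradient_descent(Xb, y, lr=0.1, n_iters=2000):
    """Minimise the mean squared error cost with batch gradient descent."""
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        residual = Xb @ w - y
        grad = (2.0 / len(y)) * (Xb.T @ residual)  # gradient of the MSE
        w -= lr * grad                             # step against the gradient
    return w


w_gd = gradient_descent(Xb, y)
print("closed form:     ", w_closed)
print("gradient descent:", w_gd)
```

Both approaches should land on nearly the same weights here. On larger or ill-conditioned problems the learning rate usually needs tuning, which is where the closed form and the iterative method start to behave differently.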
Tree-based methods
- Decision trees: information gain, Gini impurity, overfitting and pruning
- Random forests: bagging, feature importance, hyperparameter tuning
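The split criteria listed above are short enough to compute by hand. This sketch assumes integer class labels in NumPy arrays; the helper names are hypothetical and not part of any library.

```python
import numpy as np


def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left, right = parent[:4], parent[4:]  # a perfect split: each child is pure

print(gini(parent))                           # 0.5
print(information_gain(parent, left, right))  # 1.0
```

A tree grown until every leaf is pure drives these impurities to zero on the training set, which is exactly the overfitting that pruning and depth limits are meant to control.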
Unsupervised learning
- K-Means clustering — elbow method for k selection, convergence, cluster quality metrics
- PCA (Principal Component Analysis) — eigendecomposition, explained variance ratio, dimensionality reduction for visualization and feature compression
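PCA through eigendecomposition of the covariance matrix fits in a few lines of NumPy. This is a sketch under simple assumptions (a dense data matrix with samples in rows, synthetic data); the function name `pca` is hypothetical.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components.

    Returns the projected data and the explained variance ratio
    of the kept components.
    """
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)

    # eigh is meant for symmetric matrices; eigenvalues come back ascending
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    explained_variance_ratio = eigvals / eigvals.sum()
    components = eigvecs[:, :n_components]
    return X_centered @ components, explained_variance_ratio[:n_components]


rng = np.random.default_rng(0)
# Illustrative data: the third feature is almost a copy of the first
a = rng.normal(size=300)
b = rng.normal(size=300)
X = np.column_stack([a, b, a + 0.05 * rng.normal(size=300)])

Z, ratio = pca(X, n_components=2)
print(Z.shape)
print(ratio, ratio.sum())
```

Because one feature is nearly redundant, two components should capture almost all of the variance, which is the situation where PCA compresses features with little loss.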
Neural networks (TensorFlow/Keras)
- Sequential API: Dense layers, activation functions (ReLU, sigmoid, softmax)
- Training: model.compile, model.fit, callbacks (EarlyStopping, ModelCheckpoint)
- Preventing overfitting: Dropout, L2 regularisation, batch normalisation
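The pieces listed above fit together roughly as follows. This is a minimal sketch that assumes TensorFlow 2.x is installed and uses synthetic data; the layer sizes, dropout rate, L2 strength, patience, and checkpoint filename are illustrative choices, not settings from the projects.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

keras = tf.keras

# Synthetic binary classification data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)).astype("float32")
y = (X[:, 0] + X[:, 1] > 0).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(
        32,
        activation="relu",
        kernel_regularizer=keras.regularizers.l2(1e-4),  # L2 penalty on weights
    ),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.3),                     # active only during training
    keras.layers.Dense(1, activation="sigmoid"),   # binary output probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hypothetical checkpoint location in a temporary directory
checkpoint_path = os.path.join(tempfile.mkdtemp(), "best_model.keras")
callbacks = [
    # Stop once validation loss has not improved for 3 epochs
    keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True
    ),
    # Keep only the model with the best validation loss seen so far
    keras.callbacks.ModelCheckpoint(
        checkpoint_path, monitor="val_loss", save_best_only=True
    ),
]

history = model.fit(
    X, y,
    validation_split=0.2,
    epochs=20,
    batch_size=32,
    callbacks=callbacks,
    verbose=0,
)
print("epochs run:", len(history.history["val_loss"]))
```

Monitoring `val_loss` rather than training loss is the design choice that matters here: both callbacks react to generalisation, not to how well the network memorises the training set.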
Data pipeline
- Pandas for loading, cleaning, and exploring tabular data
- NumPy for vectorised operations and matrix math
- Train/validation/test splitting, StandardScaler, one-hot encoding with pd.get_dummies
- Evaluation: accuracy, precision, recall, F1, ROC-AUC, confusion matrix
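An end-to-end version of this pipeline might look like the sketch below. It assumes pandas and scikit-learn are installed; the toy dataset, its column names, the 60/20/20 split, and the logistic regression model are all hypothetical stand-ins for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy dataset; column names are illustrative
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "income": rng.normal(50_000, 15_000, size=n),
    "city": rng.choice(["A", "B", "C"], size=n),
})
score = df["income"] + 500 * df["age"] + rng.normal(0, 5_000, size=n)
df["label"] = (score > 72_000).astype(int)

# One-hot encode the categorical column, then make every column float
X = pd.get_dummies(df.drop(columns="label"), columns=["city"]).astype(float)
y = df["label"]

# 60% train, 20% validation, 20% test, stratified on the label
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0)

# Fit the scaler on the training split only to avoid leaking test statistics
num_cols = ["age", "income"]
scaler = StandardScaler().fit(X_train[num_cols])


def scale(frame):
    """Return a copy with the numeric columns standardised."""
    out = frame.copy()
    out[num_cols] = scaler.transform(frame[num_cols])
    return out


X_train_s, X_val_s, X_test_s = scale(X_train), scale(X_val), scale(X_test)

clf = LogisticRegression().fit(X_train_s, y_train)
print("validation accuracy:", accuracy_score(y_val, clf.predict(X_val_s)))

y_pred = clf.predict(X_test_s)
y_prob = clf.predict_proba(X_test_s)[:, 1]  # ROC-AUC needs scores, not labels
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
cm = confusion_matrix(y_test, y_pred)
print(metrics)
print(cm)
```

The validation split is the one to use for model selection and tuning; the test split is touched once, at the end, so the reported metrics stay an honest estimate.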
Key insight
Machine learning is applied linear algebra and statistics. Understanding the math (gradient descent as stepping down the slope of the loss surface, PCA as variance maximisation) makes it possible to debug models that aren’t converging, rather than just trying random hyperparameters.