Machine Learning – From Data to Deployment Course
By the end of this course, learners will be able to:
Understand machine learning fundamentals.
Perform data preprocessing, cleaning, and transformation.
Apply feature selection & engineering to optimize models.
Train and evaluate supervised, unsupervised, and semi-supervised models.
Perform exploratory data analysis (EDA) and interpret results.
Optimize ML models for accuracy, precision, recall, and efficiency.
Build end-to-end ML pipelines for real-world problem-solving.
Syllabus Outline
Module 1: Introduction to Machine Learning
What is Machine Learning?
Types of ML: Supervised, Unsupervised, Semi-Supervised, Reinforcement Learning.
Module 2: Problem Statement & Data Collection
Defining a problem in ML terms.
Sources of data (APIs, CSVs, Databases, Web Scraping).
Introduction to datasets: Kaggle.
Module 3: Data Preprocessing & Cleaning
Handling missing values.
Dealing with duplicates and inconsistencies.
Encoding categorical variables (Label Encoding, One-Hot Encoding).
Normalization vs Standardization.
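As an illustration of the "Normalization vs Standardization" topic, the sketch below contrasts the two rescalings using only the Python standard library. The function names and the `ages` column are hypothetical, chosen for this example; in practice a library such as scikit-learn would typically be used instead.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to mean 0 and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has zero deviation; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column used only for illustration.
ages = [18, 22, 30, 45, 60]
normalized = min_max_normalize(ages)
standardized = standardize(ages)
```

Normalization bounds the values to a fixed range, while standardization centers them and expresses each value in units of standard deviation, so the result is not confined to a fixed interval.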
Module 4: Exploratory Data Analysis (EDA)
Statistical summary of data.
Data visualization (Matplotlib, Seaborn).
Detecting outliers and anomalies.
Correlation and relationships between features.
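One common way to approach the "Detecting outliers and anomalies" topic is the interquartile range (IQR) rule. The sketch below is a minimal standard-library version; the `iqr_outliers` helper, the `prices` data, and the conventional multiplier `k=1.5` are assumptions made for this example, not a method prescribed by the course.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical data with one obviously extreme value.
prices = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(prices)
```

Because the rule is based on quartiles rather than the mean, a single extreme value does not distort the thresholds used to flag it.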
Module 5: Feature Engineering & Feature Selection
Creating new features from raw data.
Feature scaling and transformation.
Dimensionality reduction (PCA, LDA).
Feature importance and selection techniques.
Module 6: Model Training – Supervised Learning
Regression: Linear Regression, Polynomial Regression.
Classification: Logistic Regression, KNN, Decision Trees, Random Forest, SVM.
Model training and hyperparameters.
Module 7: Model Training – Unsupervised Learning
Clustering: K-Means, Hierarchical Clustering.
Association Rules (Apriori, Market Basket Analysis).
Module 8: Semi-Supervised Learning
Concept and applications.
Self-training and label propagation methods.
Case studies in real-world applications (fraud detection, text classification).
Module 9: Model Evaluation & Optimization
Train/test split, Cross-validation.
Metrics: Accuracy, Precision, Recall, F1-Score, ROC-AUC.
Overfitting vs Underfitting.
Hyperparameter tuning (Grid Search, Random Search, Bayesian Optimization).
Model explainability (SHAP, LIME).
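To make the accuracy, precision, recall, and F1-score metrics listed in this module concrete, the sketch below computes them from confusion-matrix counts for a binary classifier. The `classification_metrics` helper and the label lists are hypothetical and exist only for this example; ROC-AUC is omitted because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground-truth labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Precision asks how many predicted positives were correct, recall asks how many actual positives were found, and F1 is their harmonic mean, which is why it is often preferred over accuracy on imbalanced data.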
Module 10: End-to-End ML Project
Capstone Project: Build an ML pipeline from scratch.
Define problem statement.
Collect data from a source.
Perform cleaning & preprocessing.
Apply EDA and feature engineering.
Train and evaluate multiple models.
Optimize best-performing model.
Present findings and deploy a simple Flask app.
