Machine Learning using Python

Machine Learning – From Data to Deployment Course

By the end of this course, learners will be able to: Understand machine learning fundamentals. Perform data preprocessing, cleaning, and transformation. Apply feature selection & engineering to optimize models. Train and evaluate supervised, unsupervised, and semi-supervised models. Perform exploratory data analysis (EDA) and interpret results. Optimize ML models for accuracy, precision, recall, and efficiency. Build end-to-end ML pipelines for real-world problem-solving.

Syllabus Outline

Module 1:

Introduction to Machine Learning

What is Machine Learning?

Types of ML: Supervised, Unsupervised, Semi-Supervised, Reinforcement Learning.

Module 2:

Problem Statement & Data Collection

Defining a problem in ML terms. Sources of data (APIs, CSVs, Databases, Web Scraping). Introduction to datasets: Kaggle.

Module 3:

Data Preprocessing & Cleaning

Handling missing values. Dealing with duplicates and inconsistencies. Encoding categorical variables (Label Encoding, One-Hot Encoding). Normalization vs Standardization.
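Example (illustrative sketch): the difference between normalization and standardization can be shown with plain Python. This assumes min-max scaling as the normalization method and z-scoring with the population standard deviation as the standardization method. The `ages` column and both function names are hypothetical and not part of the course material.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


ages = [18, 22, 30, 45, 60]  # hypothetical feature column
normalized_ages = min_max_normalize(ages)
standardized_ages = standardize(ages)
```

Normalization bounds the feature between 0 and 1, while standardization centers it at 0 with unit spread and leaves it unbounded.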

Module 4:

Exploratory Data Analysis (EDA)

Statistical summary of data. Data visualization (Matplotlib, Seaborn). Detecting outliers and anomalies. Correlation and relationships between features.
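Example (illustrative sketch): one common way to detect outliers is the interquartile range (IQR) rule. This assumes the conventional 1.5 x IQR threshold and the default quartile method of Python's `statistics.quantiles`. The `prices` data and the function name are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


prices = [10, 12, 11, 13, 12, 11, 95]  # hypothetical column with one extreme value
outliers = iqr_outliers(prices)
```

Whether a flagged value is an error or a genuine rare observation is a judgment that depends on the dataset.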

Module 5:

Feature Engineering & Feature Selection

Creating new features from raw data. Feature scaling and transformation. Dimensionality reduction (PCA, LDA). Feature importance and selection techniques.

Module 6:

Model Training – Supervised Learning

Regression: Linear Regression, Polynomial Regression. Classification: Logistic Regression, KNN, Decision Trees, Random Forest, SVM. Model training and hyperparameters.
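Example (illustrative sketch): simple linear regression with one feature can be fitted in closed form by ordinary least squares. In practice a library such as scikit-learn would normally be used; the plain-Python version below only shows the underlying arithmetic. The `hours` and `scores` data and the function name are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


hours = [1, 2, 3, 4, 5]          # hypothetical study hours
scores = [52, 54, 56, 58, 60]    # hypothetical exam scores (exactly 2 * hours + 50)
slope, intercept = fit_simple_linear_regression(hours, scores)
predicted = slope * 6 + intercept  # prediction for 6 hours
```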

Module 7:

Model Training – Unsupervised Learning

Clustering: K-Means, Hierarchical Clustering. Association Rules (Apriori, Market Basket Analysis).
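Example (illustrative sketch): K-Means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The version below assumes squared Euclidean distance and a naive initialization from the first k points; real implementations typically use smarter initialization such as k-means++. The `points` data and function names are hypothetical.

```python
def nearest_centroid(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )


def kmeans(points, k, max_iterations=100):
    """Return (centroids, labels) after running K-Means until convergence."""
    centroids = [tuple(p) for p in points[:k]]  # naive init: first k points
    for _ in range(max_iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]  # hypothetical data
centroids, labels = kmeans(points, k=2)
```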

Module 8:

Semi-Supervised Learning

Concept and applications. Self-training and label propagation methods. Case studies in real-world applications (fraud detection, text classification).

Module 9:

Model Evaluation & Optimization

Train/test split, Cross-validation. Metrics: Accuracy, Precision, Recall, F1-Score, ROC-AUC. Overfitting vs Underfitting. Hyperparameter tuning (Grid Search, Random Search, Bayesian Optimization). Model explainability (SHAP, LIME).
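Example (illustrative sketch): accuracy, precision, recall, and F1-score for a binary classifier can all be computed from the four confusion-matrix counts. The sketch below assumes the label 1 is the positive class and returns 0.0 where a ratio is undefined. The `y_true` and `y_pred` lists and the function name are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions
metrics = classification_metrics(y_true, y_pred)
```

Precision measures how many predicted positives were correct, and recall measures how many actual positives were found. F1 is their harmonic mean.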

Module 10:

End-to-End ML Project

Capstone Project: Build an ML pipeline from scratch.

Define the problem statement. Collect data from a source. Perform cleaning and preprocessing. Apply EDA and feature engineering. Train and evaluate multiple models. Optimize the best-performing model. Present findings and deploy a simple Flask app.