Taiju Sanagi: Experiments
Note

Softmax Regression

April 29, 2025

This note introduces the Softmax Regression algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is Softmax Regression? Softmax Regression (also called...

Read Note
Note

Entropy and Information Gain

April 28, 2025

Overview When building a Decision Tree, we aim to split data into increasingly pure groups. But how can we measure "purity" or "impurity" mathematically? One powerful way is using entropy, a concept from information theory. Entropy measures how mixed or uncertain a group is. When we split data, we...
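The entropy measure this preview describes can be sketched in a few lines. This is an illustrative helper (names mine, not code from the note), computing entropy in bits over the class proportions of a group:

```python
import numpy as np

def entropy(labels):
    """Entropy in bits: -sum(p * log2(p)) over class proportions p."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

# A perfectly pure group scores 0 bits of entropy;
# a maximally mixed 50/50 group scores 1 bit.
pure = entropy(["yes", "yes", "yes", "yes"])   # 0 bits
mixed = entropy(["yes", "yes", "no", "no"])    # 1 bit
```

A split's information gain is then the parent group's entropy minus the weighted average entropy of the child groups.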

Read Note
Note

Regression Metrics

April 28, 2025

Quick Summary SSE: Sensitive to large errors, not in the same unit as the target (squared units). MSE: Sensitive to large errors, not in the same unit as the target (squared units). MAE: Same unit as the target, treats all errors equally. RMSE: Same unit as the target, sensitive to large errors. Key Points SSE (Sum of Squared Errors) $$ \text{SSE} =...
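All four metrics in this summary are one-liners in NumPy. A quick sketch on made-up numbers (the arrays are illustrative, not from the note):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 8.5])

errors = y_pred - y_true
sse = float(np.sum(errors ** 2))       # sum of squared errors (squared units)
mse = float(np.mean(errors ** 2))      # mean of squared errors (squared units)
mae = float(np.mean(np.abs(errors)))   # mean absolute error (same unit as target)
rmse = float(np.sqrt(mse))             # root of MSE (same unit as target)
```

Note how a single large error dominates SSE/MSE/RMSE (it enters squared) but contributes only linearly to MAE.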

Read Note
Note

Linear Regression Direct Solution

April 28, 2025

Step 1: Define the Cost Function We want to minimize the error between predictions and true values. $$ J(w) = \frac{1}{2} \| Xw - t \|^2 $$ ✅ Meaning: $Xw$: predicted values $t$: true target values $Xw - t$: error vector $\| \cdot \|^2$: sum of squared errors $\frac{1}{2}$: for convenient derivative...
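Setting the gradient of this cost to zero yields the standard normal equation $w = (X^\top X)^{-1} X^\top t$. A minimal sketch on a made-up dataset (not the note's own example), using `np.linalg.lstsq`, which solves the same least-squares problem more stably than forming the inverse explicitly:

```python
import numpy as np

# Toy data generated from t = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
t = 2 * x + 1

# Design matrix with a bias column: each row is [1, x_i]
X = np.column_stack([np.ones_like(x), x])

# Closed-form least-squares solution of min_w (1/2)||Xw - t||^2
w, *_ = np.linalg.lstsq(X, t, rcond=None)
# w recovers approximately [1.0, 2.0]: intercept 1, slope 2
```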

Read Note
Note

Unsupervised Learning

April 22, 2025

🧩 Introduction to Unsupervised Learning Unsupervised learning is a type of machine learning where the model learns from data without any labels. There are no answers provided — just raw input data. The goal is to discover patterns, structures, or groupings hidden inside the data. 1. How Is It...

Read Note
Note

Supervised Learning

April 22, 2025

🧠 Introduction to Supervised Learning Supervised learning is one of the main branches of machine learning. It refers to training a model using data where the correct answers (called labels) are already known. 1. What Does “Supervised” Mean? It’s called supervised because the model learns from...

Read Note
Note

Bias Variance Tradeoff

April 22, 2025

🎯 Bias–Variance Tradeoff In machine learning, we want our model to learn useful patterns — not just memorize the data or oversimplify it. The bias–variance tradeoff helps us understand the balance between underfitting and overfitting. 1. What Is Bias? Bias is the error caused by using a model...

Read Note
Note

Decision Boundary

April 22, 2025

🔀 Decision Boundary In classification problems, a decision boundary is the surface (a line in 2D) that separates different predicted classes. It’s where the model is undecided — where the prediction flips from one class to another. 1. Why Do We Need It? In regression, we predict a...

Read Note
Note

Anomaly Detection

April 22, 2025

🚨 Introduction to Anomaly Detection Anomaly detection is about finding things that don’t belong. These could be: A fraudulent credit card transaction A faulty machine sensor reading Unusual customer behavior We want to identify rare, unusual patterns that are different from the normal data —...

Read Note
Note

Manifold Learning

April 22, 2025

🌐 Understanding Manifold Learning PCA is powerful — but it assumes that the important structure in the data lies along straight (linear) directions. But what if the data lies on a curved surface inside a high-dimensional space? This is where manifold learning comes in. 1. Why PCA Isn't Always...

Read Note
Note

Principal Component Analysis (PCA)

April 22, 2025

This note introduces the Principal Component Analysis (PCA) technique using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and intuitive to build. What is PCA? Principal Component Analysis...

Read Note
Note

Hierarchical Clustering

April 21, 2025

This note introduces the Hierarchical Clustering algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates an intuitive from-scratch-like visualization to show that the core idea is simple and easy to understand. What is Hierarchical...

Read Note
Note

AdaBoost

April 21, 2025

This note introduces the AdaBoost algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is intuitive and builds naturally on weak learners like Decision Stumps. What is AdaBoost? AdaBoost...

Read Note
Note

Random Forest

April 21, 2025

This note introduces the Random Forest algorithm using scikit‑learn, explains the step‑by‑step logic behind how it works, and then demonstrates a from‑scratch implementation to show that the core idea is simple and builds naturally on Decision Trees. What is a Random Forest? A Random Forest is an...

Read Note
Note

Gaussian Naive Bayes

April 21, 2025

This note introduces the Gaussian Naive Bayes algorithm using scikit‑learn, explains the step‑by‑step logic behind how it works, and then demonstrates a from‑scratch implementation to show that the core idea is simple and easy to build. What is Gaussian Naive Bayes? Gaussian Naive Bayes is a...

Read Note
Note

Multinomial Naive Bayes

April 21, 2025

This note introduces the Multinomial Naive Bayes algorithm using scikit‑learn, explains the step‑by‑step logic behind how it works, and then demonstrates a from‑scratch implementation to show that the core idea is simple and easy to build. What is Multinomial Naive Bayes? Multinomial Naive Bayes is...

Read Note
Note

Logistic Regression

April 21, 2025

This note introduces the Logistic Regression algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is Logistic Regression? Logistic Regression is a method...

Read Note
Note

Lasso Regression

April 21, 2025

This note introduces the Lasso Regression algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is Lasso Regression? Lasso Regression is like Linear...

Read Note
Note

Ridge Regression

April 20, 2025

This note introduces the Ridge Regression algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is Ridge Regression? Ridge Regression is like Linear...

Read Note
Note

Polynomial Regression

April 20, 2025

This note introduces the Polynomial Regression algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is Polynomial Regression? Polynomial Regression is like...

Read Note
Note

Perceptron

April 20, 2025

This note introduces the Perceptron algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is the Perceptron? The Perceptron is one of the earliest algorithms...

Read Note
Note

DBSCAN

April 19, 2025

This note introduces the DBSCAN clustering algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and powerful. What is DBSCAN? DBSCAN (Density-Based Spatial Clustering of...

Read Note
Note

K-Means Clustering

April 19, 2025

This note introduces the K-Means Clustering algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is K-Means? K-Means Clustering is like grouping similar...

Read Note
Note

Decision Tree

April 19, 2025

This note introduces the Decision Tree algorithm using scikit‑learn, explains the step‑by‑step logic behind how it works, and then demonstrates a from‑scratch implementation to show that the core idea is simple and easy to build. What is a Decision Tree? A Decision Tree is a flow‑chart full of...

Read Note
Note

Bernoulli Naive Bayes

April 19, 2025

This note introduces the Bernoulli Naive Bayes algorithm using scikit‑learn, explains the step‑by‑step logic behind how it works, and then demonstrates a from‑scratch implementation to show that the core idea is simple and easy to build. What is Bernoulli Naive Bayes? Bernoulli Naive Bayes is like...

Read Note
Note

Linear Regression

April 18, 2025

This note introduces the Linear Regression algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is Linear Regression? Linear Regression is like drawing the...

Read Note
Demo

K-Nearest Neighbors (KNN) Viz

April 17, 2025
View Demo
Note

K-Nearest Neighbors (KNN)

April 16, 2025

This note introduces the KNN algorithm using scikit-learn, explains the step-by-step logic behind how it works, and then demonstrates a from-scratch implementation to show that the core idea is simple and easy to build. What is KNN? K-Nearest Neighbors (KNN) is like asking your neighbors for advice...

Read Note