Machine Learning

Most machine learning work fails for reasons that have nothing to do with the model. It fails in the features, in the validation, and in the gap between a notebook that scores well and a system that holds up on next week's data. These pieces are the working knowledge behind models that actually run.

The pipeline is the product

A model is one component in a longer chain. I map that chain in The 8-Layer Data Science Pipeline and Machine Learning Taxonomy and Building Blocks: problem framing, data, features, model, evaluation, and the decision the output is meant to inform. Most teams over-invest in the model layer and under-invest in the two layers on either side of it.

Where models actually break

If I had to name the one skill that separates good models from bad ones, it is feature engineering, which is why I wrote Feature Engineering: The Skill That Separates Good Models from Bad Ones. A close second is honest validation. Overfitting Is Not a Model Problem, It's a Thinking Problem argues that overfitting is a discipline failure before it is a technical one, and Losses and Metrics in Machine Learning is about choosing the objective that matches the decision, not the one that is convenient to optimize.
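
A minimal sketch of what dishonest validation hides, assuming a toy dataset with purely random labels and a hypothetical 1-nearest-neighbour memorizer (neither comes from the articles themselves). Training accuracy is perfect by construction, while held-out accuracy stays near chance because there is no signal to learn.

```python
import random


def nearest_label(x, train):
    """Return the label of the training point closest to x (1-nearest-neighbour)."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


def accuracy(points, train):
    """Fraction of (x, y) points whose label matches the memorizer's prediction."""
    hits = sum(nearest_label(x, train) == y for x, y in points)
    return hits / len(points)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: one random feature, labels are coin flips (no signal at all).
data = [(rng.random(), rng.randint(0, 1)) for _ in range(400)]
train, held_out = data[:200], data[200:]

train_acc = accuracy(train, train)        # scored on data the model has already seen
held_out_acc = accuracy(held_out, train)  # scored on data it has never seen

print(f"train accuracy:    {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

The only number worth reporting here is the held-out one. A score computed on data the model has already seen says nothing about next week's data.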

Foundations worth having cold

Good intuition for the core methods pays off everywhere. I have written first-principles explanations of linear models, decision trees and ensembles, support vector machines, clustering, dimensionality reduction, and the neural network training playbook. The common thread is geometry and assumptions first, library calls second.

Thinking in probability, and explaining the result

Two habits separate practitioners from people who run scripts. The first is treating probability as an operating system for decisions rather than a formula applied at the end. The second is being able to say why a model did what it did, the subject of Explainable AI in Practice. For specialized problems I have also written on anomaly detection, count data and probabilistic forecasting, scaling across data and compute, and deeper dives like differentiation in TensorFlow, deep learning for image tasks, and useful concepts such as calibration and RANSAC.
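
To make the first habit concrete, here is a small sketch of a single Bayesian belief update. The numbers (a 1% base rate, a detector with 90% sensitivity and a 5% false-positive rate) and the function name `posterior` are illustrative assumptions, not figures from the articles.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(event | positive signal) via Bayes' rule."""
    true_pos = sensitivity * prior                    # P(signal and event)
    false_pos = false_positive_rate * (1.0 - prior)   # P(signal and no event)
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers chosen only for illustration.
p = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(f"P(event | alert) = {p:.3f}")
```

Under these assumptions the posterior is roughly 15%, because the event is rare even though the detector is fairly accurate. Any decision threshold built on the alert should start from that number, not from the 90% sensitivity.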

This is the work behind Production Machine Learning & Data Infrastructure, and you can see it running in production in the supply-chain forecasting case study.

All articles in this topic

machine-learning 10 min read 22 Feb 2024

Anomaly Detection: A Practical Framework

Statistical and machine learning approaches to anomaly detection - Isolation Forest, DBSCAN, autoencoders, time-series methods - and how to choose between them based on your data structure and constraints.

Read →
machine-learning 9 min read 15 Feb 2024

Explainable AI in Practice

When model explanations actually matter and when they don't - a practitioner's guide to SHAP, LIME, attention visualization, and the hard questions about trust and accountability in machine learning systems.

Read →
machine-learning 12 min read 5 Feb 2024

Losses and Metrics in Machine Learning

A practical reference covering every major loss function and evaluation metric - what each one measures, when to use it, and what it gets wrong.

Read →
machine-learning 12 min read 1 Feb 2024

Neural Network Training Playbook

A practitioner's guide to training neural networks - from initialization and optimization to regularization, debugging, and the decisions that actually determine whether your model converges.

Read →
machine-learning 10 min read 1 Feb 2024

Overfitting Is Not a Model Problem, It's a Thinking Problem

The bias-variance tradeoff reframed as a failure of reasoning, not tuning. Why overfitting in quantitative finance is uniquely dangerous, and how to detect and prevent it systematically.

Read →
machine-learning 11 min read 28 Jan 2024

Count Data Models and Probabilistic Forecasting

When your target variable is a non-negative integer, standard regression breaks down. A practical guide to Poisson, negative binomial, and zero-inflated models - and when each one applies.

Read →
machine-learning 10 min read 25 Jan 2024

Probability as an Operating System for Better Decisions

Bayesian reasoning, belief updating, and calibrated uncertainty - how probabilistic thinking changes the way you interpret evidence and make decisions under uncertainty.

Read →
machine-learning 14 min read 22 Jan 2024

Decision Trees and Ensembles: Intuition First

How decision trees work, why they overfit, and how ensemble methods - bagging, boosting, and stacking - transform weak learners into the models that dominate tabular machine learning competitions.

Read →
machine-learning 12 min read 20 Jan 2024

Machine Learning Taxonomy and Building Blocks

A reference-first guide to the full landscape of machine learning - problem types, algorithm families, and the four universal components that every machine learning system shares.

Read →
machine-learning 14 min read 15 Jan 2024

Feature Engineering: The Skill That Separates Good Models from Bad Ones

A practitioner's guide to feature engineering - the craft of transforming raw data into model-ready representations that capture what actually matters for the prediction task.

Read →
machine-learning 10 min read 1 Jan 2024

The 8-Layer Data Science Pipeline

A practitioner's map of the complete data science workflow - from problem framing and data collection to deployment and monitoring - with what actually goes wrong at each stage.

Read →
machine-learning 12 min read 1 Feb 2023

Deep Learning for Image Tasks: Detection vs. Segmentation

A practical map of the deep learning landscape for image understanding - object detection vs. semantic segmentation, the key architectures for each, and which metrics to use.

Read →
machine-learning 8 min read 20 Jan 2023

Differentiation in TensorFlow: GradientTape and Custom Training Loops

How TensorFlow's automatic differentiation works under the hood, when to use GradientTape over Keras fit(), and how to build custom training loops for research and production models.

Read →
machine-learning 9 min read 15 Jan 2023

Scaling Machine Learning: Data, Compute, and Systems

How machine learning systems scale across three dimensions - data volume, model size, and inference throughput - and the engineering tradeoffs at each level.

Read →
machine-learning 7 min read 11 Jan 2023

Useful Machine Learning Concepts: Calibration, RANSAC, and the Loss Minimization Framework

Three underrated concepts that separate production-ready machine learning from research prototypes - probability calibration, robust model fitting with RANSAC, and understanding all machine learning algorithms as variations on a single loss minimization framework.

Read →
machine-learning 7 min read 8 Jan 2023

Dimensionality Reduction: PCA, t-SNE, and UMAP

A practical guide to the three main dimensionality reduction techniques - when to use each, what they preserve, and how to avoid the common mistake of using t-SNE embeddings as features.

Read →
machine-learning 10 min read 3 Jan 2023

Clustering: Algorithms, Tradeoffs, and When to Use Each

A technical reference for the three main clustering families - density-based (DBSCAN), centroid-based (K-Means), and hierarchical - covering their mathematical foundations, hyperparameter selection, and failure modes.

Read →
machine-learning 11 min read 31 Dec 2022

Support Vector Machines: Geometry, Kernels, and Practical Tradeoffs

SVMs from first principles - the margin maximization objective, soft margins, the kernel trick, and the practical cases where SVMs outperform and where they don't.

Read →
machine-learning 6 min read 2 Jan 2022

Linear Models: Regression, Loss Functions, and the Gaussian Assumption

The mathematical foundation of linear regression and logistic regression - what they optimize, what assumptions they make, and why understanding these fundamentals matters for every model built on top of them.

Read →

Related case studies

ml 2022–2023

7 Production Forecasting Models Driving Replenishment & Markdown Decisions for Blue Yonder's Enterprise Retailers (5TB+)

Logistics & Supply Chain (SaaS)

7 production forecasting models deployed on a GCP/TFX/Apache Beam/Dataflow stack processing 5TB+ of live supply-chain data, served to enterprise retail clients through Blue Yonder's SaaS platform by a 15-member team.

supply-chain forecasting production-ml
Read case study →
research Nov–Dec 2019

Temporal Attention for Link Prediction on Dynamic Graphs - 86% AUC at NUS

Academic Research

A from-scratch PyTorch temporal attention model reached 86% AUC on College Messages and outperformed node2vec, TMF, CTDNE, and BANE under one shared, leak-free evaluation protocol.

graph-ml research temporal-graphs
Read case study →
ml Applied Project

Predicting Missing Friendship Links from Social Graph Structure

Personal Research Project

AUC-ROC evaluation showing that pair-level structural features (Adamic-Adar, common neighbors, shortest path) decisively outperform individual node-level features for link prediction, with Adamic-Adar the single strongest predictor.

graph-ml link-prediction social-networks
Read case study →
ml Applied Project

Detecting Semantically Duplicate Questions Despite Different Wording (Quora Question Pairs)

NLP Applied Project

An end-to-end pipeline that combines hand-crafted semantic features (lexical overlap, TF-IDF cosine, fuzzy matching, length) with Word2Vec embedding distances and a weight-shared siamese LSTM, ensembled together - with an ablation that isolates where each family carries signal.

nlp semantic-similarity deep-learning
Read case study →
ml Applied Project

Classifying Cancer Mutations from Clinical Text (MSKCC Challenge)

Biomedical Research (Applied Project)

A working multi-class pipeline over clinical text and gene/mutation features: TF-IDF + linear/tree/Naive Bayes models, stratified 64/16/20 splits, class-weighted training, and log-loss evaluation with documented model comparison.

nlp classification biomedical
Read case study →
ml Personal Project

Why TCIA Cancer Imaging Won't Carry a Clinical Screening Tool: A Feasibility Study

Personal Research Project

Early-stage EDA on TCIA revealed the binding constraints were in the data, not the model - annotation inconsistency across contributing institutions and distribution shift between scanners and patient populations - leading to a documented, evidence-based decision to halt the project rather than build a model whose benchmark AUC would overstate clinical viability.

biomedical medical-imaging feasibility-study
Read case study →
ml Applied Project

Predicting Hydroponic Crop Yield from Sensor Data - and Turning It Into Planting Decisions

Self-Directed Applied Project

An end-to-end pipeline from raw hourly sensor logs to a weekly planting-recommendation matrix, with feature-importance analysis isolating nitrogen concentration and cumulative light exposure as the two highest-impact controllable yield drivers.

forecasting ml agriculture
Read case study →
