Writing

Quantitative finance, machine learning systems, data engineering, and industry analysis. Written from production experience — not from tutorials.

industry 8 min read 15 Mar 2024

How Banks Work: A Data Scientist's Map

A data scientist's guide to banking business models, data architecture, and regulatory constraints — built from time spent automating regulatory data pipelines across 70+ source systems at ANZ Bank.

Read →
quant 12 min read 15 Mar 2024

Volatility Surfaces and What They Tell You About the Market

A practitioner's guide to implied volatility surfaces — how to construct them, what their shape encodes about market consensus, and how to extract trading edges from the information they contain.

Read →
data-engineering 8 min read 1 Mar 2024

The ML Project Lifecycle: What Actually Happens vs. What People Think

The real timeline of an ML project — how problem framing, data work, and deployment consistently dominate modeling time, and what this means for how to structure ML teams and projects.

Read →
quant 18 min read 1 Mar 2024

The Quantitative Trading Playbook

A practitioner's guide to building systematic trading strategies that hold up out-of-sample — from signal research and backtesting discipline to execution modeling and live deployment.

Read →
ml 10 min read 22 Feb 2024

Anomaly Detection: A Practical Framework

Statistical and ML approaches to anomaly detection — Isolation Forest, DBSCAN, autoencoders, time-series methods — and how to choose between them based on your data structure and constraints.

Read →
industry 9 min read 20 Feb 2024

Product Analytics: The Pitfalls No One Warns You About

Survivorship bias in A/B tests, Goodhart's Law in metrics, novelty effects, and the causal inference problems that make product analytics harder than it looks.

Read →
data-engineering 10 min read 18 Feb 2024

SQL for Data Scientists: The Patterns That Actually Matter

Window functions, CTEs, time-series queries, and optimization techniques — SQL patterns that data scientists use daily but often learn inefficiently from tutorial sites that stop at basic SELECT.

Read →
data-engineering 8 min read 15 Feb 2024

BigQuery vs TensorFlow Transform: Choosing the Right Feature Pipeline

When to compute features in BigQuery versus TensorFlow Transform: the tradeoffs between SQL-native simplicity and training-serving skew prevention, based on real experience at Blue Yonder.

Read →
ml 9 min read 15 Feb 2024

Explainable AI in Practice

When model explanations actually matter and when they don't — a practitioner's guide to SHAP, LIME, attention visualization, and the hard questions about trust and accountability in machine learning systems.

Read →
quant 10 min read 15 Feb 2024

Options Greeks as Risk Dials

A working practitioner's guide to delta, gamma, theta, vega, and rho — not as formulas to memorize but as risk instruments to manage in a live options book.

Read →
data-engineering 11 min read 12 Feb 2024

Python Performance: Writing Code That Scales

Practical techniques for making Python code fast — NumPy vectorization, Numba JIT compilation, multiprocessing, profiling tools, and common patterns that silently kill performance.

Read →
quant 10 min read 8 Feb 2024

The Trader Mindset: Discipline, Systems, and Market Dynamics

What separates systematic traders from discretionary ones — market regime classification, position sizing, psychological discipline, and why most people fail at trading even when they're smart.

Read →
ml 12 min read 5 Feb 2024

Losses and Metrics in Machine Learning

A practical reference covering every major loss function and evaluation metric — what each one measures, when to use it, and what it gets wrong.

Read →
ml 12 min read 1 Feb 2024

Neural Network Training Playbook

A practitioner's guide to training neural networks — from initialization and optimization to regularization, debugging, and the decisions that actually determine whether your model converges.

Read →
ml 10 min read 1 Feb 2024

Overfitting Is Not a Model Problem, It's a Thinking Problem

The bias-variance tradeoff reframed as a failure of reasoning, not tuning. Why overfitting in quantitative finance is uniquely dangerous, and how to detect and prevent it systematically.

Read →
ml 11 min read 28 Jan 2024

Count Data Models and Probabilistic Forecasting

When your target variable is a non-negative integer, standard regression breaks down. A practical guide to Poisson, negative binomial, and zero-inflated models — and when each one applies.

Read →
ml 10 min read 25 Jan 2024

Probability as an Operating System for Better Decisions

Bayesian reasoning, belief updating, and calibrated uncertainty — how probabilistic thinking changes the way you interpret evidence and make decisions under uncertainty.

Read →
ml 14 min read 22 Jan 2024

Decision Trees and Ensembles: Intuition First

How decision trees work, why they overfit, and how ensemble methods — bagging, boosting, and stacking — transform weak learners into the models that dominate tabular ML competitions.

Read →
quant 13 min read 20 Jan 2024

Building a Backtesting Framework That Doesn't Lie to You

The common mistakes that make backtests look better than reality — and the engineering disciplines that close the gap between simulated and live performance.

Read →
industry 11 min read 20 Jan 2024

Data Science in Supply Chain: What the Models Actually Do

A practitioner's overview of how ML is applied in supply chain — from demand forecasting and inventory optimization to markdown pricing and fulfillment capacity, with what the models can and can't solve.

Read →
ml 12 min read 20 Jan 2024

ML Taxonomy and Building Blocks

A reference-first guide to the full landscape of machine learning — problem types, algorithm families, and the four universal components that every ML system shares.

Read →
ml 14 min read 15 Jan 2024

Feature Engineering: The Skill That Separates Good Models from Bad Ones

A practitioner's guide to feature engineering — the craft of transforming raw data into model-ready representations that capture what actually matters for the prediction task.

Read →
data-engineering 10 min read 10 Jan 2024

The Data Engineering Stack: A Practitioner's Map

A structured map of the data engineering landscape — from OS fundamentals and SQL through distributed compute, streaming, cloud services, and orchestration. Built from real project experience across Blue Yonder, Mastertrust, and independent data systems work.

Read →
ml 10 min read 1 Jan 2024

The 8-Layer Data Science Pipeline

A practitioner's map of the complete data science workflow — from problem framing and data collection to deployment and monitoring — with what actually goes wrong at each stage.

Read →
data-engineering 9 min read 10 Feb 2023

Distributed Training in TensorFlow: MirroredStrategy vs. ParameterServerStrategy

A practical guide to TensorFlow's distribution strategies — how each works, when to use MirroredStrategy vs. ParameterServerStrategy, and the tradeoffs that determine which is faster.

Read →
ml 12 min read 1 Feb 2023

Deep Learning for Image Tasks: Detection vs. Segmentation

A practical map of the deep learning landscape for image understanding — object detection vs. semantic segmentation, the key architectures for each, and which metrics to use.

Read →
ml 8 min read 20 Jan 2023

Differentiation in TensorFlow: GradientTape and Custom Training Loops

How TensorFlow's automatic differentiation works under the hood, when to use GradientTape over Keras fit(), and how to build custom training loops for research and production models.

Read →
industry 6 min read 17 Jan 2023

Building Information Modeling and Machine Learning

How BIM creates a structured digital twin across a building's full lifecycle — and where ML applications in predictive maintenance, energy optimization, and construction quality control are emerging.

Read →
ml 9 min read 15 Jan 2023

Scaling Machine Learning: Data, Compute, and Systems

How machine learning systems scale across three dimensions — data volume, model size, and inference throughput — and the engineering tradeoffs at each level.

Read →
ml 7 min read 11 Jan 2023

Useful ML Concepts: Calibration, RANSAC, and the Loss Minimization Framework

Three underrated concepts that separate production-ready ML from research prototypes — probability calibration, robust model fitting with RANSAC, and understanding all ML algorithms as variations on a single loss minimization framework.

Read →
ml 7 min read 8 Jan 2023

Dimensionality Reduction: PCA, t-SNE, and UMAP

A practical guide to the three main dimensionality reduction techniques — when to use each, what they preserve, and how to avoid the common mistake of using t-SNE embeddings as features.

Read →
ml 10 min read 3 Jan 2023

Clustering: Algorithms, Tradeoffs, and When to Use Each

A technical reference for the three main clustering families — density-based (DBSCAN), centroid-based (K-Means), and hierarchical — covering their mathematical foundations, hyperparameter selection, and failure modes.

Read →
ml 11 min read 31 Dec 2022

Support Vector Machines: Geometry, Kernels, and Practical Tradeoffs

SVMs from first principles — the margin maximization objective, soft margins, the kernel trick, and the practical cases where SVMs outperform and where they don't.

Read →
productivity 4 min read 17 Dec 2022

Data Structures and Algorithms: Complexity and Resources

The core intuition behind space-time complexity analysis, with a guide to the best resources for building DSA fundamentals as a data scientist.

Read →
quant 8 min read 1 Jun 2022

Asset Classes and Algorithmic Trading Paradigms

A structured survey of how major asset classes behave and what drives their systematic trading opportunities — from cash equities and fixed income to derivatives, commodities, and forex.

Read →
data-engineering 14 min read 22 Apr 2022

Building Production ML Pipelines with TFX

A ground-up walkthrough of TensorFlow Extended — orchestrators, metadata, standard components (ExampleGen through Pusher), and building custom components. Written from hands-on work building ML pipelines in 2022.

Read →
industry 10 min read 4 Jan 2022

India Export-Import: Regulatory Framework and Trade Mechanics

A practical overview of India's export-import regulatory framework — IEC, FEMA, Incoterms 2020, trade documentation, government incentive schemes, and payment instruments.

Read →
productivity 4 min read 2 Jan 2022

JupyterLab: Remote Access, Extensions, and Productivity Setup

Practical JupyterLab configuration for data science work — remote access over the network, useful extensions for visualization and productivity, and embedding media in notebooks.

Read →
ml 6 min read 2 Jan 2022

Linear Models: Regression, Loss Functions, and the Gaussian Assumption

The mathematical foundation of linear regression and logistic regression — what they optimize, what assumptions they make, and why understanding these fundamentals matters for every model built on top of them.

Read →

Let's collaborate!

Whether you need a quantitative researcher, a machine learning systems builder, or a technical advisor, I'm available for select consulting engagements.

Get in Touch →