Research Papers by Topic
Graph-Based Methods
- DeepWalk: Online Learning of Social Representations - the foundational paper for random walk-based graph embeddings
- HOPE: Asymmetric Transitivity Preserving Graph Embedding - preserves directed graph asymmetry that DeepWalk misses
- Feature Extraction for Graphs - practical overview of hand-crafted graph features
- An Overview of Feature Extraction (FE) in Graphs
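The random-walk idea behind DeepWalk can be sketched with the standard library alone: sample short, truncated walks from every node and treat each walk as a "sentence" for a skip-gram model. This is a minimal sketch, not code from the paper; the toy graph and the names `random_walk` and `build_corpus` are illustrative assumptions.

```python
import random

def random_walk(adj, start, length, rng):
    """Uniform truncated random walk containing at most `length` nodes."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk

def build_corpus(adj, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    nodes = list(adj)
    corpus = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            corpus.append(random_walk(adj, node, length, rng))
    return corpus

# Hypothetical toy undirected graph stored as an adjacency list.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
corpus = build_corpus(adj, walks_per_node=2, length=5)
```

The resulting `corpus` would then be fed to a word2vec-style model to learn node embeddings; that training step is out of scope here.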
Training Methodology
- A Disciplined Approach to Neural Network Hyper-Parameters - Leslie Smith’s follow-up to his cyclical learning rate (CLR) work; a systematic approach to setting learning rate, batch size, momentum, and weight decay in place of trial-and-error
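One schedule associated with Smith's work is the 1cycle policy: the learning rate ramps up and then back down over a single cycle while momentum moves in the opposite direction. The sketch below uses simple linear ramps; the function name `one_cycle`, its parameters, and the default values are assumptions chosen for illustration, not values prescribed by the paper.

```python
def one_cycle(step, total_steps, warm_steps, max_lr=0.1, min_lr=0.004,
              max_mom=0.95, min_mom=0.85):
    """Return (learning_rate, momentum) for `step` using linear ramps."""
    if step < warm_steps:
        # Warm-up phase: fraction climbs from 0 to 1.
        frac = step / warm_steps
    else:
        # Cool-down phase: fraction falls from 1 back to 0.
        frac = 1.0 - (step - warm_steps) / (total_steps - warm_steps)
    lr = min_lr + frac * (max_lr - min_lr)
    mom = max_mom - frac * (max_mom - min_mom)  # momentum moves opposite to lr
    return lr, mom

# Hypothetical run of 100 steps with a 30-step warm-up.
schedule = [one_cycle(s, total_steps=100, warm_steps=30) for s in range(100)]
```

In practice a framework scheduler would normally be used instead; this only shows the shape of the schedule.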
Loss Functions
- Comprehensive Survey of Loss Functions in Machine Learning
- Focal Loss for Dense Object Detection - introduces RetinaNet and a loss that down-weights easy examples to counter foreground-background class imbalance
- Recall Loss for Semantic Segmentation
- Regression Based Loss Functions for Time Series Forecasting
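As a concrete instance of the losses listed above, the binary focal loss can be written as FL(p_t) = -α_t (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The sketch below is a minimal per-example version; the function name and the `eps` clamp are assumptions, while γ = 2 and α = 0.25 are the defaults reported in the focal loss paper.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one example.

    p: predicted probability of the positive class, y: label in {0, 1}.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # clamp to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.9, 1)  # confident and correct: heavily down-weighted
hard = focal_loss(0.1, 1)  # confident and wrong: keeps most of its loss
```

Setting `gamma=0` recovers α-weighted cross-entropy, which is a quick sanity check on any implementation.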
Knowledge Distillation
- Distilling the Knowledge in a Neural Network (Hinton 2015) - the original distillation paper
- Survey: Knowledge Distillation
- DistilBERT - BERT distilled into a model 40% smaller and 60% faster that retains 97% of its language-understanding performance
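The core idea in these papers is to train a small student on the teacher's temperature-softened output distribution in addition to the hard labels. The following is a minimal sketch under stated assumptions: the function names, the weighting `alpha`, and the toy logits are hypothetical, and the T² factor follows the scaling suggested in the original distillation paper.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; larger T gives a softer distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Weighted sum of soft-target and hard-label cross-entropy."""
    soft_teacher = softmax(teacher_logits, T)
    soft_student = softmax(student_logits, T)
    soft_ce = -sum(t * math.log(s) for t, s in zip(soft_teacher, soft_student))
    hard_ce = -math.log(softmax(student_logits)[label])
    # T**2 keeps the soft-target gradients on a comparable scale as T changes.
    return alpha * (T ** 2) * soft_ce + (1 - alpha) * hard_ce

teacher = [5.0, 1.0, 0.0]  # hypothetical teacher logits
good = distillation_loss([5.0, 1.0, 0.0], teacher, label=0)
bad = distillation_loss([0.0, 1.0, 5.0], teacher, label=0)
```

A student that mirrors the teacher (`good`) should score a lower loss than one that disagrees with it (`bad`).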
Mixture of Experts
- Language-Image Mixture of Experts - multimodal MoE architecture
- LIMoE: Learning Multiple Modalities with One Sparse Mixture-of-Experts Model
Deep Learning for Tabular Data
- Revisiting Deep Learning Models for Tabular Data - systematic comparison showing gradient-boosted trees still match or beat deep models on many tabular tasks; introduces FT-Transformer as a strong neural baseline
- TabNet - attention-based tabular model with built-in feature selection
Embeddings
- Time2Vec: Learning a Vector Representation of Time - position encoding for time series; periodic and non-periodic components
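Time2Vec maps a scalar time τ to a vector whose first element is linear, ω₀τ + φ₀, and whose remaining elements are periodic, sin(ωᵢτ + φᵢ). The sketch below evaluates that mapping with fixed parameters; in the paper the frequencies and phases are learned, so the values used here are purely hypothetical.

```python
import math

def time2vec(tau, omega, phi):
    """Return [linear term] + [periodic terms] for a scalar time `tau`.

    omega and phi are equal-length lists of frequencies and phase shifts;
    index 0 parameterizes the linear (non-periodic) component.
    """
    linear = omega[0] * tau + phi[0]
    periodic = [math.sin(w * tau + p) for w, p in zip(omega[1:], phi[1:])]
    return [linear] + periodic

# Hypothetical parameters: a slow trend plus a 7-day cycle (tau in days).
omega = [0.01, 2 * math.pi / 7]
phi = [0.0, 0.0]
monday = time2vec(1.0, omega, phi)
next_monday = time2vec(8.0, omega, phi)
```

The periodic component repeats after seven days while the linear component keeps growing, which is what lets the encoding capture both seasonality and trend.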
Attention and Transformers
- Neural Machine Translation by Jointly Learning to Align and Translate - the original attention paper (Bahdanau 2014)
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
- The Illustrated Transformer - the best visual explanation of the transformer architecture
Explainable AI
- LIME: Explaining the Predictions of Any Classifier
- Integrated Gradients
- Interpretable Machine Learning (Molnar) - the free online book
Sequence Models
- Sequence to Sequence Learning with Neural Networks - the Sutskever 2014 seq2seq paper
Python Libraries
Graph Analysis
- networkx - standard graph manipulation and analysis library for Python; not GPU-accelerated but excellent for prototyping and small to medium graphs
Explainable AI
| Library | Focus |
|---|---|
| SHAP | Shapley value explanations, model-agnostic |
| LIME | Local surrogate model explanations |
| ELI5 | Weights, permutation importance |
| InterpretML | EBM + unified dashboard |
| Shapash | Business-facing XAI dashboard |
| explainerdashboard | Interactive Shapley dashboard |
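Several of the libraries above build on Shapley values, which attribute a prediction to features by averaging each feature's marginal contribution over all orderings. The brute-force sketch below computes them exactly for a toy value function; it is not the SHAP API (which relies on faster approximations and model-specific algorithms), and `toy_value` is a hypothetical stand-in for a model.

```python
from itertools import permutations

def shapley_values(value_fn, players):
    """Exact Shapley values by enumerating every ordering of `players`."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p given the players already present.
            totals[p] += value_fn(with_p) - value_fn(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

def toy_value(coalition):
    """Hypothetical model: additive effects plus an a-b interaction."""
    v = 0.0
    if "a" in coalition:
        v += 1.0
    if "b" in coalition:
        v += 2.0
    if "a" in coalition and "b" in coalition:
        v += 1.0  # interaction bonus, split evenly between a and b
    return v

phi = shapley_values(toy_value, ["a", "b", "c"])
```

Enumeration costs n! evaluations, so this only works for a handful of features; that cost is the reason practical libraries approximate.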
Quantitative Finance and Time Series
- quantstats - portfolio performance analytics (Sharpe, max drawdown, rolling statistics)
- alphalens-reloaded - factor analysis: IC, turnover, quantile returns
- tsfresh - automated time series feature extraction (~800 features per series)
- scalecast - forecasting pipeline with built-in cross-validation
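To make the portfolio metrics mentioned above concrete, here is a minimal standard-library sketch of an annualized Sharpe ratio and maximum drawdown computed from periodic simple returns. It is not quantstats' implementation; the function names, the 252-period annualization, and the sample returns are assumptions, and library conventions (such as sample versus population variance) may differ.

```python
import math

def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio from periodic simple returns."""
    excess = [r - risk_free / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)  # sample variance
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst

daily = [0.01, -0.02, 0.015, 0.005, -0.01]  # hypothetical daily returns
sr = sharpe_ratio(daily)
mdd = max_drawdown(daily)
```

Both functions expect simple returns (for example 0.01 for 1%), not prices or log returns.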
Python-First Web Frameworks
- Streamlit - fastest path to a data app; runs a Python script as a web app
- Pynecone (now Reflex) - React frontend from pure Python
- Anvil - full-stack Python web apps with drag-and-drop UI builder
MLOps
- MLflow - experiment tracking, model registry, model serving
- DVC - data version control; works alongside git for large data and model files
- Feast - feature store for serving features to production models
- Kedro - pipeline framework for reproducible data science code; integrates with MLflow and DVC through plugins rather than out of the box
Best-Of Lists
- ml-tooling/best-of-ml-python - ranked list of Python machine learning libraries, ordered by GitHub stars and activity
- ml-tooling/best-of-python - broader Python ecosystem rankings
- ml-tooling/ml-workspace - Docker image with JupyterLab and a common machine learning stack pre-installed