ml Agricultural Technology · Applied Project

Crop Yield Forecasting for Hydroponics Systems

Built an end-to-end ML solution for a hydroponics client to predict crop yields and demand, using environmental sensor data, growth stage features, and historical sales to support crop planning and resource allocation decisions.

Problem

No data-driven approach to crop yield prediction existed. Planning relied on farmer intuition, which led to demand mismatches, resource waste, and inconsistent production.

Outcome

Working ML pipeline integrating environmental, biological, and sales data for crop yield prediction with decision-support outputs for crop planning.

Overview

Hydroponics farming is a data-rich environment — environmental sensors track temperature, humidity, nutrient levels, and light exposure continuously. But most hydroponics operations weren’t using this data systematically. The project was to build an ML pipeline that could predict crop yields from environmental and operational data, and connect those predictions to demand forecasting for better crop planning.

The Problem

Traditional farming relies heavily on experience and intuition. In hydroponics, where environmental conditions can be precisely controlled, there’s an opportunity to be much more systematic. But without a data-driven model, the operation couldn’t answer basic questions: Which crops should we grow more of next cycle? How does nutrient mix affect final yield? When should we plant to meet forecasted demand?

Data & Inputs

Approach

Three connected models:

Yield prediction: Decision trees, random forests, and a shallow neural network to predict final yield weight from environmental features and growth stage data. Feature importance analysis to identify the most impactful controllable variables.

Demand forecasting: Time-series model on historical sales data by crop type — capturing weekly seasonality and trend.

Planning integration: Combined yield predictions with demand forecasts to generate planting recommendations — how many plants of each crop type to start each week to meet forecasted demand.
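The planning step comes down to simple arithmetic: divide forecast demand by the predicted yield per plant and round up. The sketch below illustrates that idea only. The function name, crop names, and all numbers are hypothetical and do not come from the client's data.

```python
import math


def plants_to_start(demand_kg, yield_kg_per_plant):
    """Smallest whole number of plants whose expected harvest covers demand."""
    if demand_kg < 0:
        raise ValueError("demand_kg must be non-negative")
    if yield_kg_per_plant <= 0:
        raise ValueError("yield_kg_per_plant must be positive")
    return math.ceil(demand_kg / yield_kg_per_plant)


# Hypothetical inputs: forecast weekly demand and predicted yield per plant.
forecast_demand_kg = {"lettuce": 120.0, "basil": 20.0}
predicted_yield_kg = {"lettuce": 0.25, "basil": 0.15}

# One recommended planting quantity per crop type.
recommendation = {
    crop: plants_to_start(forecast_demand_kg[crop], predicted_yield_kg[crop])
    for crop in forecast_demand_kg
}
print(recommendation)
```

A production version would likely also account for plant losses and for the lag between planting and harvest, neither of which is modeled here.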

Results & Impact

Technical Detail

Data characteristics: environmental sensors logged at hourly resolution across multiple growing cycles. Key challenge: sensors in a hydroponics system are highly correlated. Temperature, humidity, and CO₂ all co-vary with the HVAC system, creating multicollinearity that makes the coefficients of unregularized linear models unstable and hard to interpret.
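One quick way to see this kind of co-variation is to compute pairwise Pearson correlations between sensor channels and flag the strongly correlated pairs. The following is a minimal sketch using only the standard library. The channel names, the readings, and the 0.8 threshold are illustrative assumptions, not values from the project.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlated_pairs(channels, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(channels)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(channels[a], channels[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical hourly readings for three sensor channels.
readings = {
    "temperature_c": [21.0, 22.5, 24.0, 25.5, 24.0, 22.0],
    "humidity_pct": [70.0, 66.0, 62.0, 58.0, 63.0, 68.0],
    "nutrient_ec": [1.8, 1.7, 1.9, 1.8, 1.7, 1.9],
}

flagged = correlated_pairs(readings)
print(flagged)
```

In this made-up sample only the temperature and humidity pair is flagged, because the two move in opposite directions almost in lockstep.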

Feature engineering:

Model comparison:

Feature importance findings: nitrogen concentration and cumulative light exposure (daily light integral, DLI) were the two highest-impact controllable variables. Temperature contributed meaningfully but showed diminishing returns beyond a crop-specific comfort range. This finding directly shaped operational recommendations — the operation prioritized automated nutrient dosing and supplemental lighting as the highest-ROI improvements.
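For readers unfamiliar with DLI: it is the total photosynthetic light delivered over a day, in mol/m²/day. With hourly logging it can be approximated by summing the hourly mean PPFD readings (µmol/m²/s), multiplying by 3600 seconds, and dividing by one million. The sketch below assumes the light sensor reports hourly mean PPFD. The case study does not state the sensor's actual units or how DLI was derived, and the function name and sample day are hypothetical.

```python
def daily_light_integral(hourly_ppfd):
    """Approximate DLI (mol/m^2/day) from 24 hourly mean PPFD readings.

    Each reading is assumed to be the mean PPFD over one hour,
    in umol/m^2/s.
    """
    if len(hourly_ppfd) != 24:
        raise ValueError("expected exactly 24 hourly readings")
    seconds_per_hour = 3600
    umol_per_mol = 1_000_000
    return sum(hourly_ppfd) * seconds_per_hour / umol_per_mol


# Hypothetical day: 16-hour photoperiod at 250 umol/m^2/s, then 8 hours dark.
ppfd = [250.0] * 16 + [0.0] * 8
dli = daily_light_integral(ppfd)
print(f"DLI = {dli:.1f} mol/m^2/day")
```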

Planning integration: yield predictions were combined with a time-series demand forecast (exponential smoothing on historical weekly sales by crop type). The output was a weekly recommendation matrix (crop type × recommended planting quantity) sized to meet 4-week-ahead demand given the predicted yield range. This moved crop cycle planning from intuition-driven to data-driven.
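As an illustration of the demand side, here is a minimal sketch of simple exponential smoothing, the most basic member of that model family. It produces a flat forecast, so the same value applies to any horizon, including four weeks ahead. It does not capture trend or seasonality, and the case study does not say which variant was used. The smoothing factor, crop names, and sales figures are hypothetical.

```python
def exponential_smoothing_forecast(sales, alpha=0.5):
    """Simple exponential smoothing; returns the final smoothed level.

    The level is updated as: level = alpha * y + (1 - alpha) * level.
    The final level serves as the forecast for every future week.
    """
    if not sales:
        raise ValueError("sales must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = sales[0]
    for y in sales[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical weekly sales history (kg) by crop type.
weekly_sales_kg = {
    "lettuce": [100.0, 110.0, 90.0, 120.0],
    "basil": [20.0, 20.0, 20.0, 20.0],
}

demand_forecast_kg = {
    crop: exponential_smoothing_forecast(history, alpha=0.5)
    for crop, history in weekly_sales_kg.items()
}
print(demand_forecast_kg)
```

A larger alpha weights recent weeks more heavily. Weekly seasonality and trend would need a richer variant such as Holt-Winters.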

Stack

Python, Scikit-learn, XGBoost, TensorFlow, Pandas, NumPy, Matplotlib
