Enblock logomarkenblock
Our ApproachDemosNotesAbout
Client Portal

Build Notes

You Have the Sales Data. Here Is How to Turn It into a Production Forecast.

By Enblock · Commercial Solutions Architect · 18 Mar 2026

The commercial case for demand forecasting is straightforward: perishable products with production lead times require production decisions before demand is confirmed, and every systematic error in that decision is a margin cost. The previous article in this series covers that argument in full.

We built a working prototype to demonstrate how this pipeline operates end to end. This article covers the build: how raw sales data moves from a CSV export through a validation and storage layer, into a trained forecasting model, and out as an actionable production plan with waste risk quantified in HKD.


The Architecture

The pipeline has three stages.

Pipeline: from sales data to production forecast

Ingest (POS / CSV) → Train (Prophet model) → Forecast (14-day plan)

Ingest — Historical transaction data is uploaded as a CSV export, validated, and stored as individual transaction records with a full audit trail.

Train — A forecasting model is trained per SKU and per location using up to two years of historical data. Hong Kong public holidays are loaded as an external signal. The model learns the weekly cycle, annual seasonality, and holiday effects.

Forecast — A 14-day forecast is produced for each SKU and location, with a confidence range, a production trigger time, and overproduction risk expressed in HKD.

Each stage is independently callable. Ingest runs as new data arrives. Training runs on a schedule or on demand. Forecast queries run at any point after training is complete.

Watch the prototype in action: Demand Forecasting Demo


1. Ingest: From CSV to Clean Transaction Data

The input is a standard CSV export from a POS or ERP system: transaction ID, timestamp, SKU, location, quantity sold, and total value in HKD. Most POS systems produce this format directly.

Before any row reaches storage, it passes through a validation layer. Missing fields, negative quantities, invalid timestamps, and duplicate transactions are caught and logged. Every import produces an audit record showing exactly how many rows were accepted, rejected, and skipped as duplicates.

The validation step exists because POS exports are not clean. Column names vary between systems, encoding issues appear, and partial rows occur when exports are interrupted. The pipeline normalises these inconsistencies on ingest rather than letting them propagate into the model.
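The validation pass could be sketched as follows. This is a minimal illustration, not the production implementation: the column names, the `validate_import` function, and the audit dictionary shape are all assumptions for the example.

```python
import io
import pandas as pd

# Hypothetical required columns for the POS export described above.
REQUIRED = ["transaction_id", "timestamp", "sku", "location", "quantity", "total_hkd"]

def validate_import(csv_text: str) -> tuple[pd.DataFrame, dict]:
    """Validate a POS CSV export; return clean rows plus an audit record."""
    df = pd.read_csv(io.StringIO(csv_text))
    df.columns = [c.strip().lower() for c in df.columns]  # normalise header names
    for col in REQUIRED:                                  # tolerate missing columns
        if col not in df.columns:
            df[col] = pd.NA
    audit = {"received": len(df)}

    # Reject rows with missing fields, invalid timestamps, or negative quantities.
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    bad = df[REQUIRED].isna().any(axis=1) | (df["quantity"] < 0)
    audit["rejected"] = int(bad.sum())
    df = df[~bad]

    # Skip duplicate transaction IDs, keeping the first occurrence.
    dupes = df.duplicated(subset="transaction_id")
    audit["duplicates"] = int(dupes.sum())
    df = df[~dupes]

    audit["accepted"] = len(df)
    return df.reset_index(drop=True), audit
```

Every import call returns both the clean rows and the audit counts, so the accepted/rejected/duplicate numbers described above fall out of the same pass that cleans the data.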

SKUs and locations are created automatically on first appearance. The system does not require a pre-configured product catalogue before ingestion begins.


2. Train: Prophet with External Signals

The forecasting model is Prophet, developed by Meta's data science team and designed specifically for business time series with strong weekly and annual seasonality.

Before training, Hong Kong public holidays are loaded as an external signal for every date in the training window and the 14-day forecast period. This ensures the model treats holidays as a known input, not as unexplained variation.

For each SKU, the system aggregates transactions to daily totals and trains a model with two key configuration decisions:

  • Multiplicative seasonality — the model learns proportional effects (a 40% uplift on Saturdays, a 25% uplift during CNY week) rather than fixed quantities. This is the correct configuration for F&B demand, where seasonal variation scales with the underlying trend.
  • 80% prediction interval — the model outputs a range rather than a single number, capturing the inherent uncertainty in any forward-looking forecast.

A minimum of 30 days of sales history is required before a model will train. SKUs below this threshold are skipped.
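The training step above can be sketched with Prophet's actual configuration options (`seasonality_mode`, `interval_width`, `holidays`); the surrounding data-prep helpers and column names are assumptions for the example, not the production code.

```python
import pandas as pd

MIN_HISTORY_DAYS = 30  # SKUs below this threshold are skipped

def to_daily(transactions: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw transactions to the daily totals Prophet expects (ds, y)."""
    return (
        transactions
        .assign(ds=pd.to_datetime(transactions["timestamp"]).dt.normalize())
        .groupby("ds", as_index=False)["quantity"].sum()
        .rename(columns={"quantity": "y"})
    )

def train_sku_model(daily: pd.DataFrame, hk_holidays: pd.DataFrame):
    """Train one model for one SKU/location; returns None if history is too short."""
    if daily["ds"].nunique() < MIN_HISTORY_DAYS:
        return None  # not enough history to learn weekly + annual structure
    from prophet import Prophet  # imported lazily so data prep runs without it
    model = Prophet(
        seasonality_mode="multiplicative",  # proportional weekend / CNY uplifts
        interval_width=0.80,                # 80% prediction interval
        holidays=hk_holidays,               # DataFrame with columns: holiday, ds
    )
    model.fit(daily)
    return model
```

The lazy import keeps the aggregation and threshold logic testable independently of the heavier modelling dependency.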

The choice of Prophet over more complex models is deliberate. LSTM neural networks and gradient boosting ensembles can outperform Prophet on large datasets with complex non-linear patterns. For an F&B manufacturer with 12 to 24 months of POS data, the additional complexity does not improve the outcome enough to justify the tradeoff. The right tool is the one the operations team can interpret and act on.


3. Forecast: Three Outputs That Change the Decision

After training, the forecast layer produces three outputs for each SKU and date.

A confidence range, not a single number. The model returns a conservative estimate (P10), a median estimate (P50), and an aggressive estimate (P90). The production team selects a risk tolerance. Choosing between P10 and P90 is a commercial decision: how much does a stockout cost versus how much does overproduction cost? The model provides the range. The business decides where within it to produce.

Production timing derived from lead time. Each product carries a lead time in hours. For a product requiring 48 hours of fermentation and 2 hours of baking, the system calculates two trigger datetimes from the forecast date and returns them as labelled instructions: start dough at 06:00 Thursday for Saturday's forecast. The forecast becomes a production instruction, not just a number to interpret.
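The backward walk from forecast date to trigger times is simple date arithmetic. A minimal sketch, assuming a product that must be ready at 08:00 on the forecast date and the two steps named above (the step labels and the 08:00 ready time are illustrative):

```python
from datetime import datetime, timedelta

# Hypothetical production steps for one SKU: (label, hours required), in order.
STEPS = [("start dough", 48), ("start baking", 2)]

def production_triggers(ready_by: datetime, steps=STEPS) -> list[tuple[str, datetime]]:
    """Walk backwards from the time the product must be ready,
    accumulating each step's lead time into a trigger datetime."""
    triggers = []
    t = ready_by
    for label, hours in reversed(steps):
        t -= timedelta(hours=hours)
        triggers.append((label, t))
    return list(reversed(triggers))

# For a Saturday 08:00 ready time, 48h fermentation + 2h baking yields
# "start dough" at 06:00 Thursday, as in the example above.
plan = production_triggers(datetime(2026, 3, 21, 8, 0))  # 2026-03-21 is a Saturday
```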

Waste risk in HKD. The gap between the median forecast and the conservative estimate represents the overproduction exposure if demand comes in at the lower bound. Expressed in HKD rather than units, this gives the finance team and the operations team a shared number. Units are an operational metric. HKD is a commercial one.
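The exposure calculation itself is a one-liner; a sketch with hypothetical figures:

```python
def waste_risk_hkd(p50_units: float, p10_units: float, unit_price_hkd: float) -> float:
    """Overproduction exposure if the team produces to the median forecast (P50)
    but demand comes in at the conservative bound (P10)."""
    return max(p50_units - p10_units, 0.0) * unit_price_hkd

# e.g. P50 of 120 units, P10 of 95, at HKD 38 per unit: 25 units of exposure.
risk = waste_risk_hkd(120, 95, 38)  # 950.0 HKD
```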


4. Explainability

After training, the system extracts a decomposition of the model's components: trend direction, best and quietest day of week, and the direction of the holiday effect. These are surfaced as a plain-language summary: "Demand is growing steadily. Production peaks on Saturdays, quietest on Tuesdays. Public holidays increase demand."
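The summary generation can be sketched as a mapping from extracted components to a sentence. The function name, input shapes, and thresholds here are assumptions for illustration; the production system extracts these components from the fitted model.

```python
def plain_language_summary(trend_slope: float, weekly: dict[str, float],
                           holiday_effect: float) -> str:
    """Turn model components into the operational summary shown to the team.
    `weekly` maps day name to its seasonal multiplier."""
    peak = max(weekly, key=weekly.get)
    quiet = min(weekly, key=weekly.get)
    trend = "growing" if trend_slope > 0 else "declining" if trend_slope < 0 else "flat"
    holidays = "increase" if holiday_effect > 0 else "reduce"
    return (f"Demand is {trend}. Production peaks on {peak}s, "
            f"quietest on {quiet}s. Public holidays {holidays} demand.")
```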

A model that explains itself in operational terms gets used. A black box does not.


What Comes Next

The current build is Phase 1: CSV ingestion, Prophet modelling, and 14-day forecast output with production timing and waste risk.

The same data foundation supports further capability as the operation matures. Live POS integrations can replace manual CSV uploads. Bill of Materials integration opens ingredient-level procurement planning. Weather feeds from the Hong Kong Observatory introduce typhoon and temperature signals that shift perishable demand in predictable ways.

Each of these builds on the same ingestion schema and forecast output format. The foundation does not need to change as the system grows.

To understand how this pipeline applies to your operation, contact Enblock at info@enblock.net.

Found this useful? Share it with your network.


Start With The Strategic Blueprint.

We investigate your workflow to calculate the cost of your manual processes. You receive an architecture map, a phased implementation plan, and projected commercial savings.

See the Blueprint Process

Copyright 2026. Enblock Limited.
