LightGBM, MLflow, Evidently AI, GitHub Actions, AWS S3, Next.js, Tailwind CSS, PyArrow, Python, Docker, Vercel

CitiBike Demand Forecaster: Recursive ML & MLOps Pipeline

End-to-end ML system for 24-hour Citi Bike demand forecasting using LightGBM with a recursive multi-step prediction engine, an automated Champion/Challenger MLOps pipeline, and a full-stack Next.js dashboard.

Core Impact

Engineered a recursive forecasting engine (MAE of 2.94 trips/hour on 24-hour forecasts) with automated drift detection and Champion/Challenger model promotion.


Architecture Breakdown

01

Architected a production MLOps system on GitHub Actions with 3 independent scheduled pipelines (feature engineering, training, hourly inference) that forecast NYC Citi Bike demand for the top 3 stations. All three run on a schedule in CI with zero manual ops.

02

Engineered a recursive bridge algorithm to overcome the ~20-day Citi Bike publication lag: it walks 480+ hourly steps (about 20 days × 24 hours), feeding each prediction back into 28 autoregressive lag features, and achieves an MAE of 2.94 trips/hour on 24h LightGBM demand forecasts.
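
The recursive walk described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `recursive_forecast`, `predict_one`, and `toy_model` are hypothetical names, the toy model stands in for the trained LightGBM regressor, and the choice of lags 1 through 28 is an assumption (the source only states that there are 28 lag features).

```python
from collections import deque
from typing import Callable, Sequence


def recursive_forecast(
    history: Sequence[float],
    predict_one: Callable[[list[float]], float],
    n_lags: int,
    n_steps: int,
) -> list[float]:
    """Roll a one-step model forward n_steps, feeding predictions back as lags."""
    if len(history) < n_lags:
        raise ValueError("need at least n_lags observed values to start")
    # Most recent value sits last; the window always holds exactly n_lags values.
    window = deque(history[-n_lags:], maxlen=n_lags)
    forecasts = []
    for _ in range(n_steps):
        features = list(window)[::-1]  # features[0] is lag 1, features[1] is lag 2, ...
        y_hat = max(0.0, predict_one(features))  # trip counts cannot be negative
        forecasts.append(y_hat)
        window.append(y_hat)  # the prediction becomes lag 1 for the next step
    return forecasts


# Hypothetical stand-in for a trained model: mean of the three most recent lags.
def toy_model(lags: list[float]) -> float:
    return sum(lags[:3]) / 3


observed = [float(v) for v in range(1, 29)]  # 28 hourly counts (made-up values)
bridge_hours = 20 * 24                       # ~20-day publication lag
path = recursive_forecast(observed, toy_model, n_lags=28, n_steps=bridge_hours + 24)
next_24h = path[-24:]                        # the forecast window of interest
```

Because every step consumes earlier predictions, errors can compound along the bridge, which is why the MAE on the final 24 hours is the figure worth tracking.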

03

Built Champion/Challenger model registry on Hopsworks + MLflow (DagsHub) with automated promotion gating (challenger promoted only on strict MAE improvement) — full experiment lineage, zero manual deployments.
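
The promotion gate can be illustrated with a small sketch. `should_promote` is a hypothetical helper, not part of the Hopsworks or MLflow APIs, and the handling of a missing champion is an assumption; only the strict-improvement rule comes from the description above.

```python
from typing import Optional


def should_promote(champion_mae: Optional[float], challenger_mae: float) -> bool:
    """Return True only when the challenger strictly beats the champion on MAE."""
    if champion_mae is None:
        # Assumption: with no champion registered yet, the first model is promoted.
        return True
    return challenger_mae < champion_mae  # a tie keeps the current champion


print(should_promote(2.94, 2.90))  # True
print(should_promote(2.94, 2.94))  # False
print(should_promote(None, 3.10))  # True
```

Using a strict comparison means an equally good challenger never replaces the champion, which avoids churn in the registry when retraining produces no real gain.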

04

Eliminated the Python backend by building a zero-server Next.js frontend: API routes parse S3 Parquet directly via hyparquet, leaving no infrastructure cost beyond Vercel and S3.

05

Implemented monthly drift monitoring with Evidently AI — auto-generates train vs. holdout distribution reports each retraining cycle, surfacing feature and target drift without manual inspection.
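
To give a feel for what a train vs. holdout distribution comparison measures, here is a hand-rolled Population Stability Index (PSI) in plain Python. This is a swapped-in illustration, not Evidently's implementation or API, and nothing in the source says the project uses PSI; the function name `psi`, the bin count, and the sample data are all assumptions.

```python
import math
from typing import Sequence


def psi(reference: Sequence[float], current: Sequence[float], n_bins: int = 10) -> float:
    """Population Stability Index of `current` against `reference`."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / n_bins  # equal-width bins defined on the reference sample

    def shares(values: Sequence[float]) -> list[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


train = [float(v) for v in range(100)]  # made-up feature values
holdout_same = list(train)              # identical distribution
holdout_shifted = [v + 30 for v in train]  # shifted distribution

score_same = psi(train, holdout_same)
score_shifted = psi(train, holdout_shifted)
```

A score of zero means the binned distributions match; a common rule of thumb treats values above roughly 0.2 as a meaningful shift, though the right threshold depends on the feature.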


© 2026 Marian Glen Louis

Engineered with Next.js, Tailwind v4 & Framer Motion
