CitiBike Demand Forecaster: Recursive ML & MLOps Pipeline
End-to-end ML system for 24-hour Citi Bike demand forecasting, built on LightGBM with a recursive multi-step prediction engine, an automated Champion/Challenger MLOps pipeline, and a full-stack Next.js dashboard.
Core Impact
“Engineered a high-precision recursive forecasting engine with automated drift detection and Champion/Challenger model promotion.”

Architecture Breakdown
Architected a production MLOps system on GitHub Actions with 3 independent scheduled pipelines (feature engineering, training, hourly inference) that forecast NYC Citi Bike demand for the top 3 stations. The pipelines run continuously in CI with no manual operations.
Engineered a recursive bridge algorithm to overcome the ~20-day Citi Bike publication lag: it walks 480+ hourly steps (20 days × 24 hours) using 28 autoregressive lag features, feeding each prediction back in as an input for the next step, and achieves an MAE of 2.94 trips/hour on 24-hour LightGBM demand forecasts.
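The recursive bridge can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the 28 lag features are simply the 28 most recent hourly counts, and it uses a hypothetical stand-in predictor (`mean_of_lags`) where the real pipeline would call a trained LightGBM model. The names `recursive_forecast`, `predict_one`, and `N_LAGS` are invented for this example.

```python
from __future__ import annotations

from collections import deque
from typing import Callable, Sequence

N_LAGS = 28  # assumed: lags are the 28 most recent hourly values


def recursive_forecast(
    history: Sequence[float],
    predict_one: Callable[[list[float]], float],
    n_steps: int,
    n_lags: int = N_LAGS,
) -> list[float]:
    """Roll a one-step model forward n_steps hours, feeding predictions back as lags."""
    if len(history) < n_lags:
        raise ValueError(f"need at least {n_lags} observed hours, got {len(history)}")
    # Sliding window of the most recent n_lags values, oldest first.
    window = deque(history[-n_lags:], maxlen=n_lags)
    forecasts: list[float] = []
    for _ in range(n_steps):
        features = list(window)[::-1]  # lag_1 (most recent) first, lag_n last
        y_hat = max(0.0, predict_one(features))  # demand cannot be negative
        forecasts.append(y_hat)
        window.append(y_hat)  # the prediction becomes lag_1 for the next step
    return forecasts


def mean_of_lags(features: list[float]) -> float:
    """Hypothetical stand-in for a trained model: the average of the lag features."""
    return sum(features) / len(features)


# Bridge a 20-day gap (480 hours), then forecast the following 24 hours.
observed = [float(10 + (h % 24)) for h in range(24 * 7)]  # synthetic week of counts
path = recursive_forecast(observed, mean_of_lags, n_steps=480 + 24)
next_24h = path[-24:]
```

Because every step consumes earlier predictions, errors can compound along the bridge, which is why the lag window and the one-step model quality matter more here than in a direct 24-hour forecast.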
Built a Champion/Challenger model registry on Hopsworks and MLflow (DagsHub) with automated promotion gating: a challenger is promoted only on a strict MAE improvement. This provides full experiment lineage with no manual deployments.
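The promotion gate reduces to a strict comparison of holdout MAE. The sketch below shows only that rule; it does not call Hopsworks or MLflow, and the function names `mean_absolute_error` and `should_promote` as well as the sample numbers are hypothetical.

```python
from __future__ import annotations

from typing import Optional, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted trips/hour."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def should_promote(champion_mae: Optional[float], challenger_mae: float) -> bool:
    """Promote only when the challenger is strictly better than the champion."""
    if champion_mae is None:  # no champion registered yet, so accept the first model
        return True
    return challenger_mae < champion_mae  # a tie keeps the current champion


# Illustrative values only: both models scored on the same holdout window.
actual = [3.0, 5.0, 2.0]
champion_mae = mean_absolute_error(actual, [2.0, 5.0, 4.0])    # (1 + 0 + 2) / 3
challenger_mae = mean_absolute_error(actual, [3.0, 4.0, 3.0])  # (0 + 1 + 1) / 3
promote = should_promote(champion_mae, challenger_mae)
```

Using a strict inequality means a challenger that merely ties the champion is not promoted, which avoids churning the deployed model on noise.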
Eliminated the Python backend by building a zero-server Next.js frontend: API routes parse S3 Parquet files directly via hyparquet, cutting infrastructure cost to $0 beyond Vercel and S3.
Implemented monthly drift monitoring with Evidently AI: each retraining cycle auto-generates train vs. holdout distribution reports that surface feature and target drift without manual inspection.
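To illustrate what a train vs. holdout distribution comparison measures, here is a hand-rolled Population Stability Index (PSI) check. This is a swapped-in technique for illustration only: it is not Evidently's API and not necessarily the statistic the project's reports use. The function name, bin count, and sample data are all assumptions.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference (train) sample and a current (holdout) sample."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has zero range; cannot build bins")
    width = (hi - lo) / n_bins  # equal-width bins over the reference range

    def bin_shares(values: Sequence[float]) -> list:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp outliers into the edge bins
            counts[idx] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Synthetic example: the holdout sample is the training sample shifted upward.
train_demand = [float(v) for v in range(100)]
holdout_demand = [v + 50.0 for v in train_demand]

psi_same = population_stability_index(train_demand, train_demand)
psi_shifted = population_stability_index(train_demand, holdout_demand)
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though the right threshold depends on the feature and should be tuned rather than taken as given.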