Interactive
Predict GHG Emissions
Enter building characteristics to get GHG emission estimates from both the IPW XGBoost (point estimate) and the IPW Hierarchical Bayesian model (posterior mean + 89% CrI). Upstream characteristics only — no energy consumption data required.
The tool takes five upstream building characteristics (property type, gross floor area, year built, number of buildings, and reporting year) and queries two independently trained models at once. Because the models predict GHG directly from physical and temporal attributes, no energy consumption data is required: the inputs are limited to what a building manager would know before seeing any utility bills.
IPW XGBoost — point estimate
The inputs are log-transformed and passed through the trained gradient-boosted tree ensemble. XGBoost was trained with inverse probability weighting (IPW) sample weights, so each observation influenced the tree splits in proportion to its weight, which corrects for reporting selection bias. The output is a single point estimate: the model's best guess at total GHG emissions (metric tonnes CO₂e/yr) for the building as described. No uncertainty interval is produced, because XGBoost is a deterministic function of the inputs given the trained trees.
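As a rough illustration of the weighting idea, the sketch below turns estimated reporting probabilities into inverse-probability weights. The function name, the clipping floor, and the normalisation to mean 1 are assumptions made for the example, not details of the trained model. Weights like these would typically be supplied as per-row sample weights when fitting the tree ensemble.

```python
def ipw_weights(reporting_probs, clip=0.01):
    """Inverse-probability weights, normalised to have mean 1.

    reporting_probs: estimated probability that each building reports
    (hypothetical inputs; the real propensity model is not shown here).
    clip: assumed floor on probabilities to avoid extreme weights.
    """
    clipped = [min(max(p, clip), 1.0) for p in reporting_probs]
    raw = [1.0 / p for p in clipped]
    mean_w = sum(raw) / len(raw)
    return [w / mean_w for w in raw]


# Buildings that were less likely to report receive larger weights.
weights = ipw_weights([0.9, 0.5, 0.1])
```

Normalising to mean 1 keeps the effective scale of the loss unchanged while preserving the relative emphasis on under-represented buildings.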
IPW HBM — posterior mean + credible interval
The Bayesian model evaluates the posterior predictive distribution at the given inputs. It uses the full trained posterior over all parameters (βA, βT, βY, βB, the property-type intercept αj, and the scale parameters σα) to generate a distribution of plausible emission values. The reported mean is the posterior expected value; the 89% credible interval (CrI) is the interval within which 89% of the posterior probability mass falls. A narrow interval means the model is confident; a wide one means the building profile is unusual or sits in a data-sparse region of the input space.
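A minimal sketch of how a posterior mean and an 89% interval can be read off a set of posterior predictive draws. It assumes an equal-tailed interval (the tool may use a highest-density interval instead), and the draws are simulated from a lognormal purely as a stand-in for real model output.

```python
import random
import statistics


def summarize_posterior(draws, mass=0.89):
    """Return (mean, lower, upper) for an equal-tailed credible interval."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - mass) / 2.0
    lo_idx = round(tail * (n - 1))          # nearest-rank lower bound
    hi_idx = round((1.0 - tail) * (n - 1))  # nearest-rank upper bound
    return statistics.fmean(ordered), ordered[lo_idx], ordered[hi_idx]


# Hypothetical posterior predictive draws (tonnes CO2e/yr), for illustration only.
rng = random.Random(0)
draws = [rng.lognormvariate(6.0, 0.4) for _ in range(4000)]

mean, lo, hi = summarize_posterior(draws)
print(f"mean={mean:.0f}, 89% CrI=[{lo:.0f}, {hi:.0f}]")
```

A wider spread in the draws translates directly into a wider interval, which is the behaviour described above for unusual or data-sparse building profiles.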
What the inputs do
Floor area and number of buildings determine scale. Property type selects the partial-pooled intercept αj. Year built enters through βY (vintage efficiency), and reporting year through βT (temporal trend).
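One way to write the log-linear form these terms suggest is given below. The pairing of βA with floor area and βB with number of buildings is inferred from the subscripts, and the normal likelihood, the pooling prior with mean μα, and the absence of any centring or scaling are assumptions; the tool's actual parameterisation may differ.

```latex
\begin{aligned}
\log(\mathrm{GHG}_i) &\sim \mathcal{N}(\mu_i,\ \sigma) \\
\mu_i &= \alpha_{j[i]}
        + \beta_A \log(\mathrm{area}_i)
        + \beta_B \log(\mathrm{buildings}_i)
        + \beta_Y\,\mathrm{yearbuilt}_i
        + \beta_T\,\mathrm{reportingyear}_i \\
\alpha_j &\sim \mathcal{N}(\mu_\alpha,\ \sigma_\alpha)
\end{aligned}
```

Here j[i] denotes the property type of building i, so buildings of the same type share an intercept that is partially pooled toward the overall mean.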
When the two models agree
For mid-sized buildings of common types (offices, schools, multifamily), XGBoost and the Bayesian mean are typically within 5–10% of each other. Agreement is a sign the building profile is well-represented in the training data.
When they diverge
At the extremes of the floor area distribution or for intensive property types (hospitals, data centres), XGBoost captures non-linear interactions that the log-linear Bayesian form misses. A divergence flag appears when the gap between the two estimates exceeds 10%.
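The divergence check can be sketched as follows. The text does not say which estimate the gap is measured against, so this example assumes the gap is taken relative to the Bayesian posterior mean; the function names are hypothetical.

```python
def relative_gap(xgb_estimate, hbm_mean):
    """Absolute gap between the two estimates, relative to the Bayesian mean.

    Assumption: the Bayesian mean is the reference value.
    """
    if hbm_mean <= 0:
        raise ValueError("hbm_mean must be positive")
    return abs(xgb_estimate - hbm_mean) / hbm_mean


def divergence_flag(xgb_estimate, hbm_mean, threshold=0.10):
    """True when the models disagree by more than the threshold (10% by default)."""
    return relative_gap(xgb_estimate, hbm_mean) > threshold


# Illustrative values in tonnes CO2e/yr, not real model output.
print(divergence_flag(105.0, 100.0))  # 5% gap
print(divergence_flag(130.0, 100.0))  # 30% gap
```

Measuring the gap against the mean of the two estimates would be an equally reasonable choice and would make the check symmetric in the two models.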
Enter building details and click Predict.