Chicago Energy Benchmarking · 2014–2023

GHG Emissions Explorer

Chicago requires buildings over 50,000 sq ft to report their energy use each year. This tool uses 10 years of that data (2014–2023), corrected for voluntary reporting bias, to estimate greenhouse gas emissions for any building and explore what drives them.

Building-year records

28,329

2014–2023

Unique buildings

3,852

108 ZIP codes

Credible interval

89%

Bayesian uncertainty per prediction

Property types

63

partial pooling

Tool 1

Predict GHG Emissions

Enter a building's physical characteristics and get estimated annual greenhouse gas emissions. Two models run simultaneously: a gradient-boosted ensemble for a point estimate and a Bayesian model for a mean with uncertainty bounds.

›No energy bills needed — prediction uses only building attributes
›Covers 63 property types: offices, schools, hospitals, warehouses, and more
›Bayesian estimate includes an 89% credible interval reflecting model uncertainty
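
As a rough illustration of what an 89% credible interval means in practice, the sketch below computes an equal-tailed interval from a set of posterior draws. The draws here are synthetic, and the function name `credible_interval` is hypothetical. Whether the app reports an equal-tailed or a highest-density interval is not stated on this page, so treat the equal-tailed choice as an assumption.

```python
import random


def credible_interval(draws, mass=0.89):
    """Equal-tailed interval holding `mass` of the posterior draws (assumed interval type)."""
    if not draws:
        raise ValueError("need at least one posterior draw")
    ordered = sorted(draws)
    tail = (1.0 - mass) / 2.0

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(ordered) - 1)
        lower = int(pos)
        upper = min(lower + 1, len(ordered) - 1)
        frac = pos - lower
        return ordered[lower] * (1.0 - frac) + ordered[upper] * frac

    return quantile(tail), quantile(1.0 - tail)


# Synthetic draws standing in for a Bayesian model's predictive samples
# for one building (illustrative only, not output from the real model).
rng = random.Random(0)
draws = [rng.lognormvariate(6.0, 0.3) for _ in range(4000)]

low, high = credible_interval(draws, mass=0.89)
mean = sum(draws) / len(draws)
print(f"mean = {mean:.1f}, 89% interval = [{low:.1f}, {high:.1f}]")
```

Read this way, the interval says that, given the model and the data, 89% of the plausible values for the building's emissions fall between the two bounds.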

Try Predict →

Tool 2

Counterfactual Scenarios

Ask "what if?" questions about a building: what would its emissions be if it were built 20 years later, had 20% less floor space, or were measured 10 years from now? Each scenario changes one variable at a time, holding everything else constant.

›Vintage shifts — how much does the construction year affect emissions?
›Size changes — what is the impact of expanding or shrinking floor area?
›Time projections — where is this building heading by 2028 or 2033?
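
The one-variable-at-a-time idea behind these scenarios can be sketched as follows. Everything named here is hypothetical: the `Building` fields, the `counterfactual` helper, and especially `toy_model`, whose coefficients are invented so the example runs and bear no relation to the fitted models behind this tool.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Building:
    # Hypothetical attribute names; the app's real inputs may differ.
    year_built: int
    floor_area_sqft: float
    data_year: int


def toy_model(b: Building) -> float:
    """Placeholder predictor with invented coefficients, for illustration only."""
    age_factor = 1.0 - 0.003 * (b.year_built - 1950)
    trend_factor = 1.0 - 0.01 * (b.data_year - 2014)
    return 0.005 * b.floor_area_sqft * age_factor * trend_factor


def counterfactual(building, predict, **change):
    """Change exactly one attribute and compare against the unchanged building."""
    if len(change) != 1:
        raise ValueError("change exactly one variable per scenario")
    baseline = predict(building)
    scenario = predict(replace(building, **change))  # all other fields held constant
    return baseline, scenario, scenario - baseline


base = Building(year_built=1975, floor_area_sqft=120_000, data_year=2023)
scenarios = {
    "built 20 years later": {"year_built": base.year_built + 20},
    "20% less floor space": {"floor_area_sqft": base.floor_area_sqft * 0.8},
    "measured 10 years from now": {"data_year": base.data_year + 10},
}
for label, change in scenarios.items():
    baseline, scenario, delta = counterfactual(base, toy_model, **change)
    print(f"{label}: {baseline:.1f} -> {scenario:.1f} ({delta:+.1f})")
```

Rejecting multi-variable changes keeps each difference attributable to a single input, which is the point of holding everything else constant.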

Try Counterfactuals →

What the models are built on

Both tools draw on the same rigorous statistical analysis: selection bias corrected with inverse probability weighting, causal structure encoded in a DAG, and model choice validated with group-aware cross-validation. Expand Technical Report in the sidebar to read the full methodology.
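
Two of the ingredients named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the reporting probabilities are made-up numbers standing in for the output of a propensity model that is not shown, and `ipw_weights`, `weighted_mean`, and `group_kfold` are hypothetical helper names. Here "group-aware" is taken to mean that all records from one building stay on the same side of each train/test split.

```python
from collections import defaultdict


def ipw_weights(reporting_probs):
    """Inverse probability weights: each observed record counts as 1 / P(reported)."""
    weights = []
    for p in reporting_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
        weights.append(1.0 / p)
    return weights


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def group_kfold(groups, k):
    """Yield (train_idx, test_idx) pairs in which no group appears on both sides."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if k < 2 or k > len(by_group):
        raise ValueError("k must be between 2 and the number of groups")
    folds = [[] for _ in range(k)]
    # Greedy balancing: place the largest groups first, each into the smallest fold.
    for g in sorted(by_group, key=lambda g: len(by_group[g]), reverse=True):
        min(folds, key=len).extend(by_group[g])
    for f, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != f for i in fold]
        yield sorted(train), sorted(test)


# Toy building-year records (all values invented for illustration).
building_ids = ["A", "A", "A", "B", "B", "C", "D", "D"]
emissions = [410.0, 400.0, 395.0, 900.0, 880.0, 150.0, 620.0, 610.0]
reporting_probs = [0.9, 0.9, 0.9, 0.5, 0.5, 0.25, 0.8, 0.8]

weights = ipw_weights(reporting_probs)
print("unweighted mean:", sum(emissions) / len(emissions))
print("IPW-weighted mean:", weighted_mean(emissions, weights))

for train_idx, test_idx in group_kfold(building_ids, k=2):
    print("train:", train_idx, "test:", test_idx)
```

Records that were unlikely to be reported receive larger weights, so under-represented kinds of buildings pull more strongly on the estimate. Splitting by building rather than by row prevents a model from being scored on a building it has already seen in another year.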

Based on a working paper

This web application builds on the author's working paper, a preprint that is still being refined. For full detail on the methodology presented here (the compliance-gated dataset class, the Tested-DGP pipeline, and the causal estimation), please refer to the original paper. Comments and suggestions are very welcome.

Read the original paper on Zenodo ↗

DOI: 10.5281/zenodo.20686697 (all versions — resolves to the latest)