
Residential HVAC Performance Baseline - Dataset Documentation

Version: 1.2.2
Data Period: 2021-12 through 2025-12
Last Updated: January 12, 2026


Overview

This directory contains the operational data behind the residential HVAC performance baseline study. All data have been cleaned, validated, and formatted so that the analysis can be reproduced. Personally identifying information has been removed while preserving analytical utility.

Property Context:


Dataset Files

1. monthly_hvac_runtime.csv

Description: HVAC equipment runtime hours extracted from Honeywell Lyric T6 Pro thermostat telemetry

Period: January 2025 - December 2025 (12 months)
Rows: 24
Source: Resideo/Honeywell Lyric app export
Frequency: Monthly aggregation

Columns:

| Column | Type | Unit | Description |
|--------|------|------|-------------|
| Month | datetime | YYYY-MM-01 | First day of month |
| Cooling_Hours | integer | hours | AC compressor runtime during month |
| Heating_Hours | integer | hours | Furnace burner runtime during month |

Notes:

Data Quality:


2. daily_temperature.csv

Description: Daily temperature readings from thermostat sensors (indoor and outdoor)

Period: January 1, 2025 - December 31, 2025 (365 days)
Rows: 364
Source: Resideo/Honeywell Lyric app export
Frequency: Daily aggregation (high/low/average)

Columns:

| Column | Type | Unit | Description |
|--------|------|------|-------------|
| Date | datetime | YYYY-MM-DD | Calendar date |
| Outdoor_High | float | °F | Daily maximum outdoor temperature (Hartford KBDL proxy) |
| Outdoor_Low | float | °F | Daily minimum outdoor temperature (Hartford KBDL proxy) |
| Outdoor_Mean_F | float | °F | Daily mean outdoor temperature |
| HDD65 | float | °F-days | Heating degree days (base 65°F) |
| Indoor_1st_Floor | float | hours | Daily HVAC runtime hours (heating + cooling) for 1st floor zone |
| Indoor_2nd_Floor | float | hours | Daily HVAC runtime hours (heating + cooling) for 2nd floor zone |
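The HDD65 column can be cross-checked against the daily mean temperature using the usual degree-day definition. The following is a minimal sketch, assuming HDD65 is derived from Outdoor_Mean_F (the dataset may instead use the average of the daily high and low); the sample temperatures are made up for illustration.

```python
def hdd(mean_temp_f: float, base_f: float = 65.0) -> float:
    """Heating degree days for one day: how far the mean sits below the base."""
    return max(0.0, base_f - mean_temp_f)


# Hypothetical daily means in °F, not values from daily_temperature.csv.
sample_means = [28.5, 47.0, 65.0, 72.3]
sample_hdd = [hdd(t) for t in sample_means]
print(sample_hdd)
```

Days at or above the base temperature contribute zero, so warm days never offset cold ones when the daily values are summed.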

Notes:

Data Quality:


3. monthly_dhw_navien.csv

Description: Domestic hot water (DHW) gas consumption and operating metrics from Navien tankless water heater

Period: October 2024 - December 2025 (15 months)
Rows: 15
Source: Navien NaviLink independent gas meter
Frequency: Monthly reading

Columns:

| Column | Type | Unit | Description |
|--------|------|------|-------------|
| Month | datetime | YYYY-MM-01 | First day of month |
| Gas_CCF | float | CCF | Natural gas consumed by DHW system |
| DHW_Operating_Hours | integer | hours | DHW burner runtime (heating water) |
| Recirculation_Hours | integer | hours | Recirculation pump runtime |

Notes:

Data Quality:


4. monthly_electricity_eversource.csv

Description: Whole-house electricity consumption from utility billing records

Period: December 2021 - December 2025 (49 billing periods)
Rows: 49
Source: Eversource Energy utility bills
Frequency: Monthly billing cycles (28-34 days per period)

Columns:

| Column | Type | Unit | Description |
|--------|------|------|-------------|
| Read_Date | datetime | YYYY-MM-DD | Utility meter read date (end of billing period) |
| Usage_kWh | integer | kWh | Electricity consumed during billing period |
| Days | integer | days | Number of days in billing period |
| Usage_Per_Day | float | kWh/day | Average daily consumption (calculated) |
| Charge | float | $ | Total charges for billing period |
| Cost_Per_kWh | float | $/kWh | Average cost per kWh (calculated) |
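The two columns marked "calculated" can be recomputed from the others as a quick consistency check. This sketch assumes Usage_Per_Day = Usage_kWh / Days and Cost_Per_kWh = Charge / Usage_kWh, which is what the column descriptions suggest; the sample rows and the 0.01 tolerance are hypothetical.

```python
import csv
import io

# Hypothetical two-row excerpt in the documented column layout (not real billing data).
SAMPLE = """Read_Date,Usage_kWh,Days,Usage_Per_Day,Charge,Cost_Per_kWh
2025-01-15,620,31,20.0,186.00,0.30
2025-02-13,580,29,20.0,174.00,0.30
"""


def check_row(row: dict, tol: float = 0.01) -> bool:
    """Return True if both calculated columns match a recomputation within tol."""
    usage = int(row["Usage_kWh"])
    per_day = usage / int(row["Days"])
    per_kwh = float(row["Charge"]) / usage
    return (abs(per_day - float(row["Usage_Per_Day"])) <= tol
            and abs(per_kwh - float(row["Cost_Per_kWh"])) <= tol)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
results = [check_row(r) for r in rows]
print(results)
```

To run the same check on the real file, replace `io.StringIO(SAMPLE)` with an open handle to monthly_electricity_eversource.csv and loosen the tolerance if the stored values are rounded more coarsely.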

Notes:

Data Quality:

Load Breakdown (2025 calendar year via billing-aligned analysis):


5. monthly_gas_scg.csv

Description: Whole-house natural gas consumption from utility billing records

Period: December 2021 - December 2025 (49 billing periods)
Rows: 49
Source: Southern Connecticut Gas Company utility bills
Frequency: Monthly billing cycles (28-34 days per period)

Columns:

| Column | Type | Unit | Description |
|--------|------|------|-------------|
| Bill_Date | datetime | YYYY-MM-DD | Utility bill date (end of billing period) |
| Billing_Days | integer | days | Number of days in billing period |
| Gas_CCF | float | CCF | Natural gas consumed during billing period |
| Usage_Per_Day | float | CCF/day | Average daily consumption (calculated) |
| Total_Charges | float | $ | Total charges for billing period |
| Cost_Per_CCF | float | $/CCF | Average cost per CCF (calculated) |
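Because Bill_Date marks the end of a billing period and Billing_Days gives its length, each bill can be mapped to a window of calendar days and matched against daily weather. The sketch below is one plausible reading, not necessarily the procedure in METHODOLOGY.md: it assumes the bill date is the last day included in the window, and the function names and inputs are hypothetical.

```python
from datetime import date, timedelta


def billing_window(bill_date: date, billing_days: int) -> list[date]:
    """Days covered by a bill, assuming the bill date is the last day included."""
    start = bill_date - timedelta(days=billing_days - 1)
    return [start + timedelta(days=i) for i in range(billing_days)]


def hdd_for_bill(bill_date: date, billing_days: int,
                 daily_hdd: dict[date, float]) -> float:
    """Sum daily HDD over the billing window; days absent from daily_hdd count as 0."""
    window = billing_window(bill_date, billing_days)
    return sum(daily_hdd.get(d, 0.0) for d in window)


# Hypothetical input: a flat 30 HDD per day for every day of January 2025.
daily_hdd = {date(2025, 1, 1) + timedelta(days=i): 30.0 for i in range(31)}
window = billing_window(date(2025, 1, 20), 30)
bill_hdd = hdd_for_bill(date(2025, 1, 20), 30, daily_hdd)
print(window[0], window[-1], bill_hdd)
```

Counting absent days as zero understates HDD for any window that reaches outside the weather record, so windows with gaps are better flagged than silently summed.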

Notes:

Data Quality:

Load Breakdown (2025 calendar year via billing-aligned analysis):


Data Quality and Validation

Validation Methods

1. Internal Consistency Checks:

2. External Cross-Validation:

3. Statistical Validation:

Known Limitations

  1. Thermostat Data Coverage:
    • 2025 only (earlier years not exported from Lyric app)
    • One missing day in daily temperature data
    • Runtime data does not capture variable blower speeds
  2. Navien DHW Meter:
    • Data availability: October 2024 onward (meter installed mid-2024)
    • 15% variance from billing-aligned calculation (seasonal baseline drift)
    • Recirculation pump operation time may include non-heating runtime
  3. Utility Billing:
    • Asynchronous billing cycles require billing-aligned methodology
    • Electricity rate includes both supply and delivery (not separately itemized)
    • Gas rate varies seasonally with supply contracts
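The missing day in the daily temperature data can be located by comparing the dates present against a full calendar. This is a minimal sketch with a hypothetical helper; the dropped date used here is illustrative only, since the actual missing date is not stated in this document.

```python
from datetime import date, timedelta


def missing_dates(present: set[date], start: date, end: date) -> list[date]:
    """Return every calendar date in [start, end] that is absent from present."""
    span = (end - start).days + 1
    calendar = (start + timedelta(days=i) for i in range(span))
    return [d for d in calendar if d not in present]


# Hypothetical example: a full 2025 calendar with one arbitrary day dropped.
all_2025 = {date(2025, 1, 1) + timedelta(days=i) for i in range(365)}
present = all_2025 - {date(2025, 7, 4)}
gaps = missing_dates(present, date(2025, 1, 1), date(2025, 12, 31))
print(len(present), gaps)
```

With the real file, `present` would be built from the Date column of daily_temperature.csv, for example with `date.fromisoformat` on each value.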

Usage Guidelines

Reproducing Baseline Analysis

All calculations in the baseline report can be reproduced using these datasets:

  1. Heating Intensity (CCF/1k HDD):
    • Use monthly_gas_scg.csv for total gas consumption
    • Subtract DHW baseline (0.533 CCF/day from monthly_dhw_navien.csv)
    • Normalize by HDD from NOAA KBDL weather station
    • Result: 95.5 CCF/1k HDD (2025)
  2. Building UA Calculation:
    • Space heating gas from step 1
    • Convert to delivered heat (× 100k BTU/CCF × 0.96 AFUE)
    • Add fireplace contribution (3.6 MMBTU estimated)
    • Divide by (24 hr/day × HDD at 59°F balance point)
    • Result: 480 BTU/hr-°F
  3. Electricity Load Decomposition:
    • Use billing-aligned methodology (see METHODOLOGY.md)
    • Cooling load isolated via regression against cooling degree days
    • HVAC blower: 831 hours (from monthly_hvac_runtime.csv) × 0.21 kW ≈ 175 kWh
    • Baseload: Average of low-load months (April, May, October)
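Steps 1 and 2 above can be sketched as a few small functions. The constants come from the steps themselves; the input totals at the bottom are placeholders, not the study's values, so the printed numbers will not match the reported 95.5 CCF/1k HDD or 480 BTU/hr-°F. The sketch also assumes the DHW baseline is subtracted for every day of the period, which the baseline report may handle differently.

```python
BTU_PER_CCF = 100_000     # heat content used in step 2
AFUE = 0.96               # furnace efficiency used in step 2
FIREPLACE_BTU = 3.6e6     # estimated fireplace contribution (3.6 MMBTU)
DHW_CCF_PER_DAY = 0.533   # DHW baseline from step 1


def space_heating_ccf(total_ccf: float, days: int) -> float:
    """Total billed gas minus the DHW baseline over the same number of days."""
    return total_ccf - DHW_CCF_PER_DAY * days


def heating_intensity(space_ccf: float, hdd65: float) -> float:
    """CCF of space-heating gas per 1,000 heating degree days (base 65°F)."""
    return space_ccf / hdd65 * 1000.0


def building_ua(space_ccf: float, hdd_balance: float) -> float:
    """UA in BTU/hr-°F: delivered heat divided by degree-hours at the balance point."""
    delivered_btu = space_ccf * BTU_PER_CCF * AFUE + FIREPLACE_BTU
    return delivered_btu / (24.0 * hdd_balance)


# Placeholder inputs for illustration only (NOT taken from the dataset).
total_ccf, days, hdd65, hdd59 = 800.0, 365, 6000.0, 4800.0
space = space_heating_ccf(total_ccf, days)
intensity = heating_intensity(space, hdd65)
ua = building_ua(space, hdd59)
print(round(space, 1), round(intensity, 1), round(ua, 1))
```

Note that the intensity is normalized by HDD at base 65°F while UA uses HDD at the 59°F balance point, so the two functions take different degree-day totals.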

Citation

If you use this dataset in your research, please cite:

```bibtex
@dataset{collis2026_hvac_data,
  author = {Collis, William K.},
  title = {Residential HVAC Performance Baseline Dataset},
  year = {2026},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/wkcollis1-eng/Residential-HVAC-Performance-Baseline-/tree/main/data}},
  version = {1.2.2}
}
```


Privacy and Data Protection

Anonymization:

Data Sharing:


Contact

Questions about the dataset:

Reporting data quality issues:


Version History

v1.2.2 (January 2026):

Future Additions:


Last Updated: January 12, 2026
Data Quality Review: Validated against baseline report v1.2.1
License: MIT (see repository LICENSE file)