API Docs
This page documents the modules used to create MVBEP. The documentation currently covers only the MVBEP class. The remaining modules (i.e. initializer,
Transformer, Developer, Interpreter, and Writer) are not yet documented.
mvbep.mvbep module
Measurement and Verification Building Energy Prediction (MVBEP) is a class that brings together the modules
for reading and validating input data, transforming that data, and using it to develop regression
models for estimating savings in the post-retrofit period.
The class is fitted with fit_training(), which takes the required input data. An
initialization summary can then be produced to check whether the data sufficiency requirements are met or
whether any action is needed to fix the input data. If the data meets the requirements to build a model, develop_mvbep()
is used to transform the data and to train and evaluate regression models. generate_development_summary()
can be used to view a summary of the development process. Finally, savings are estimated by passing post-retrofit
data to predict_energy_consumption(). The documentation currently covers only the MVBEP class.
Future additions to the project include documentation for the remaining modules (i.e. initializer,
Transformer, Developer, Interpreter, and Writer).
Please check the provided Notebooks for the package demonstration.
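The call order described above can be sketched as a small helper. This is only an illustration, not code from the package: `run_baseline_workflow` is a hypothetical name, and it assumes that `mvbep_obj` exposes the methods documented on this page and that both data arguments are dataframes in the format `fit_training()` expects.

```python
def run_baseline_workflow(mvbep_obj, pre_retrofit_data, post_retrofit_data,
                          frequency="hourly"):
    """Call the documented MVBEP methods in the order the overview describes.

    Hypothetical helper: `mvbep_obj` is assumed to be an MVBEP instance
    (or any object exposing the same methods).
    """
    # 1. Fit the object with the raw pre-retrofit data.
    mvbep_obj.fit_training(data=pre_retrofit_data, frequency=frequency)
    # 2. Write the HTML initialization summary (default file name).
    mvbep_obj.generate_initialization_summary()
    # 3. Transform the data, then train and evaluate the regression models.
    mvbep_obj.develop_mvbep()
    # 4. Write the HTML development summary (default file name).
    mvbep_obj.generate_development_summary()
    # 5. Return the baseline consumption for the post-retrofit period.
    return mvbep_obj.predict_energy_consumption(
        data=post_retrofit_data, generate_summary=False
    )
```

With the package installed, an object created by `MVBEP()` would be passed as `mvbep_obj`. In practice the initialization summary should be reviewed before moving on to `develop_mvbep()`, since the data may not meet the sufficiency requirements.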
- class mvbep.mvbep.MVBEP(mvbep_state_path: Optional[str] = None)[source]
Bases: object
MVBEP class to perform all steps of building an energy consumption baseline.
The class incorporates the 4 required modules for building a baseline starting from initialization to savings quantification.
- Parameters
mvbep_state_path (str, default 'None') – The file path of a saved MVBEP state, for cases where the baseline creation process was stopped before the final step and the state was saved with save_mvbep_state().
Example
If an MVBEP object was previously saved with save_mvbep_state(), it can be loaded like
>>> mvbep_boulder = MVBEP(mvbep_state_path = 'mvbep_states/office-boulder_mvbep-state')
If no object was saved before, an instance of MVBEP is created by
>>> mvbep_boulder = MVBEP()
- fit_training(data: DataFrame, frequency: str, country_code: Optional[str] = None, occupancy_schedule: Optional[dict] = None, mismatch_date_threshold: float = 0.3, total_missing=None, max_consec_missing=None, n_days=360)[source]
Fits an MVBEP object with raw data.
This is the first method in developing an energy consumption baseline. The method takes the required historical data and prepares it for the subsequent steps.
- Parameters
data (pd.DataFrame) –
A dataframe with the required data, which includes at least:
Timestamps in 15-min or hourly intervals
Energy consumption
Outdoor dry-bulb temperature
frequency (str, {'15-min', 'hourly'}) – The frequency of the timestamp intervals.
country_code (str, default to 'None') – A two-letter str indicating the code of the country in which the building resides. The supported codes are listed in the holiday package documentation.
occupancy_schedule (dict, default to 'None') – A dict indicating the general occupancy density in the building. [Check the parameter structure](??)
mismatch_date_threshold (float, default to 0.3) – Sets the threshold for values in the timestamp column that cannot be converted from str to a datetime object.
total_missing (int, default to 'None') – Sets a threshold for the total number of a feature’s missing observations to meet data sufficiency requirements. The default value is set based on frequency.
max_consec_missing (int, default to 'None') – Sets a threshold for consecutive missing observations in a single feature before the feature is dropped. The default value is set based on frequency.
n_days (int, default to 360) – Sets a threshold for the minimum number of days in data.
Example
Example of a building located in Boulder, CO, USA with hourly timestamps. The instance of MVBEP was created with the name mvbep_boulder.
>>> mvbep_boulder.fit_training(
...     data = df_boulder_office,
...     frequency = 'hourly',
...     country_code = 'US'
... )
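To make the threshold parameters of `fit_training()` more concrete, the sketch below computes the quantities they refer to: the share of timestamp strings that cannot be parsed, the total number of missing observations in a feature, and the longest run of consecutive missing observations. It is a plain-Python illustration, not the package's implementation. The helper names, the timestamp format, and the reading of `mismatch_date_threshold` as a fraction are assumptions.

```python
from datetime import datetime


def unparsable_fraction(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Fraction of timestamp strings that cannot be parsed with `fmt`.

    Hypothetical helper; the format string is an assumption.
    """
    if not timestamps:
        return 0.0
    bad = 0
    for ts in timestamps:
        try:
            datetime.strptime(ts, fmt)
        except (ValueError, TypeError):
            bad += 1
    return bad / len(timestamps)


def missing_stats(values):
    """Return (total_missing, longest_consecutive_missing) for one feature.

    Missing observations are represented by None in this sketch.
    """
    total = longest = run = 0
    for value in values:
        if value is None:
            total += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return total, longest


timestamps = [
    "2021-01-01 00:00:00",
    "2021-01-01 01:00:00",
    "not a date",
    "2021-01-01 03:00:00",
]
energy = [10.2, None, None, 11.0, None, 9.8]

print(unparsable_fraction(timestamps))  # 0.25
print(missing_stats(energy))            # (3, 2)
```

Under these assumptions, the unparsable share of 0.25 would stay below the default `mismatch_date_threshold` of 0.3, and the two missing-data counts are the values that `total_missing` and `max_consec_missing` would be compared against.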
- generate_initialization_summary(file_name: Optional[str] = None)[source]
Generates a summary of the initialization performed after fit_training().
The initialization summary is generated as an HTML file with highlights of the initialization process including plots, descriptive data, and the data sufficiency result.
- Parameters
file_name (str, default to 'None') – Sets the name of the HTML initialization summary. In case no name was provided, the resulting name will be
initiation_time+init_sum_.
Example
Writing the initialization summary of mvbep_boulder after running fit_training().
>>> mvbep_boulder.generate_initialization_summary(file_name = 'mvbep_summaries/office-boulder_init-summary')
- develop_mvbep(modeling_methods: Optional[dict] = None, test_size: float = 0.2, hyperparameter_tuning: bool = False, ranking_method: str = 'min_cvrmse')[source]
Transforms the cleaned data and develops regression models.
Takes the cleaned data after fit_training() and iterates over the possible transformations, using each transformation to generate regression models with the modeling approaches chosen in modeling_methods. With each transformation, outputs such as evaluation metrics and models are saved in the MVBEP object’s state (i.e. attribute mvbep_state).
- Parameters
modeling_methods (dict, default to 'None') –
The chosen modeling approaches to develop the baseline. If None is passed, the following default is used:
>>> default_modeling_methods = {
...     'LR' : True, # TOWT if the frequency is hourly, otherwise WLS
...     'RF' : True, # Random Forest Regressor
...     'XGB': True, # Extreme Gradient Boosting
...     'SVR': True, # Support Vector Regressor
...     'SLP': True, # Feed Forward Neural Network
...     'KNN': True  # K-Nearest Neighbor
... }
test_size (float, default to 0.2) – Sets the testing set size out of the input data.
hyperparameter_tuning (bool, default to False) –
If True: The hyperparameter tuning process is performed for any model with hyperparameters to be tuned.
If False: No hyperparameter tuning process is performed (except for KNN).
ranking_method (str, {'min_cvrmse', 'min_nmbe'}, default to 'min_cvrmse') –
Sets the ranking method to choose the best model based on the testing set evaluation.
If 'min_cvrmse': The best model is selected based on Coefficient of Variation of Root Mean Squared Error (CV(RMSE)).
If 'min_nmbe': The best model is selected based on Normalized Mean Bias Error (NMBE).
Example
Developing mvbep_boulder after running fit_training().
>>> mvbep_boulder.develop_mvbep()
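The two ranking metrics named under `ranking_method` can be illustrated with a short, self-contained sketch. It uses common textbook forms of CV(RMSE) and NMBE with `n` in the denominator and the error taken as actual minus predicted. The package may use a different degrees-of-freedom adjustment or sign convention. Ranking `'min_nmbe'` by the absolute value of NMBE is also an assumption, and `rank_models` is a hypothetical helper, not part of the MVBEP API.

```python
from math import sqrt


def cv_rmse(actual, predicted):
    """CV(RMSE) in percent: RMSE divided by the mean of the actual values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    rmse = sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return 100.0 * rmse / mean_actual


def nmbe(actual, predicted):
    """NMBE in percent: summed error normalized by n times the actual mean."""
    n = len(actual)
    mean_actual = sum(actual) / n
    return 100.0 * sum(a - p for a, p in zip(actual, predicted)) / (n * mean_actual)


def rank_models(actual, predictions, ranking_method="min_cvrmse"):
    """Return the name of the best model on the testing set (hypothetical)."""
    if ranking_method == "min_cvrmse":
        def score(pred):
            return cv_rmse(actual, pred)
    elif ranking_method == "min_nmbe":
        def score(pred):
            # Assumption: the smallest bias in absolute terms wins.
            return abs(nmbe(actual, pred))
    else:
        raise ValueError(f"Unknown ranking method: {ranking_method}")
    return min(predictions, key=lambda name: score(predictions[name]))


actual = [10.0, 12.0, 14.0, 16.0]
predictions = {
    "LR": [11.0, 11.0, 15.0, 15.0],  # errors cancel out, larger scatter
    "RF": [10.5, 12.5, 14.5, 16.5],  # consistent over-prediction, small scatter
}

print(rank_models(actual, predictions, "min_cvrmse"))  # RF
print(rank_models(actual, predictions, "min_nmbe"))    # LR
```

The toy data shows why the choice matters: a model whose errors cancel out can have near-zero bias yet a larger scatter, so the two ranking methods may select different models.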
- generate_development_summary(file_name: Optional[str] = None)[source]
Generates a development summary after using develop_mvbep().
Outputs an HTML file that summarizes the development process after running develop_mvbep().
- Parameters
file_name (str, default to 'None') – Sets the name of the HTML development summary. In case no name was provided, the resulting name will be
initiation_time+dev_sum_.
Example
Writing the development summary of mvbep_boulder after running develop_mvbep().
>>> mvbep_boulder.generate_development_summary(file_name = 'mvbep_summaries/office-boulder_dev-summary')
- save_mvbep_state(file_name: Optional[str] = None)[source]
Saves the current progress of the MVBEP object by storing mvbep_state.
- Parameters
file_name (str, default to 'None') – Sets the name of the Joblib state file. In case no name was provided, the resulting name will be initiation_time+mvbep_state.
Example
Saving the state of either an MVBEP initiated by fit_training() or one developed by develop_mvbep().
>>> mvbep_boulder.save_mvbep_state(file_name = 'mvbep_states/office-boulder_mvbep-state')
- predict_energy_consumption(data: DataFrame, generate_summary: bool = False, file_name: Optional[str] = None, mismatch_date_threshold=0.3, total_missing=None, max_consec_missing=None)[source]
Generates a savings quantification summary after using develop_mvbep().
Outputs an HTML file that summarizes the quantification process after running develop_mvbep(). The quantification process requires post-retrofit data that matches the frequency and features of the data used in initialization when running fit_training(). Features that were dropped in the initialization process are not required in the post-retrofit data. To see which features passed the initialization process, check the output of generate_initialization_summary().
- Parameters
data (pd.DataFrame) – The post-retrofit data.
generate_summary (bool, default to False) –
Either generates a summary in an HTML file or returns a list of baseline energy consumption. In case the passed data does not meet the requirements, an initialization summary is generated regardless of the argument passed in generate_summary.
If True: A quantification summary is provided. The function does not return any object.
If False: A list of baseline energy consumption for the provided post-retrofit period is returned.
file_name (str, default to 'None') – Sets the name of the HTML quantification summary. In case no name was provided, the resulting name will be initiation_time+quant_sum_.
mismatch_date_threshold (float, default to 0.3) – Sets the threshold for values in the timestamp column that cannot be converted from str to a pd.datetime object.
total_missing (int, default to 'None') – Sets a threshold for the total number of a feature’s missing observations to meet data sufficiency requirements. The value is set based on frequency.
max_consec_missing (int, default to 'None') – Sets a threshold for consecutive missing observations in a single feature before the feature is dropped. The value is set based on frequency.
Example
Writing the quantification summary of mvbep_boulder.
>>> mvbep_boulder.predict_energy_consumption(data = df_boulder_post_retrofit,
...                                          generate_summary = True,
...                                          file_name='mvbep_summaries/office-boulder_dev-summary')
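When `generate_summary` is `False`, the method returns a list of baseline energy consumption, and savings can then be computed outside the package. The sketch below shows one common way to do this: baseline minus metered post-retrofit consumption for each interval. `estimate_savings` is a hypothetical helper and the numbers are made up for illustration. The sketch does not reproduce how the package's own quantification summary is calculated.

```python
def estimate_savings(baseline, actual):
    """Return per-interval savings, total savings, and percent savings.

    baseline: predicted consumption had no retrofit taken place
    actual:   metered post-retrofit consumption for the same intervals
    """
    if len(baseline) != len(actual):
        raise ValueError("baseline and actual must cover the same intervals")
    interval_savings = [b - a for b, a in zip(baseline, actual)]
    total_savings = sum(interval_savings)
    percent_savings = 100.0 * total_savings / sum(baseline)
    return interval_savings, total_savings, percent_savings


# Illustrative values only (e.g. kWh per hour).
baseline = [12.0, 15.0, 13.0]
actual = [10.0, 12.0, 11.0]

per_interval, total, percent = estimate_savings(baseline, actual)
print(per_interval)  # [2.0, 3.0, 2.0]
print(total)         # 7.0
print(percent)       # 17.5
```

Here `baseline` stands in for the list returned by `predict_energy_consumption(data = ..., generate_summary = False)`, and `actual` for the energy consumption column of the post-retrofit data.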