Table 3 Effect of data cleaning on the dataset

From: K-means clustering of electricity consumers using time-domain features from smart meter data

| Processing stage | Rows |
| --- | --- |
| Number of rows before processing | 21,000,000 |
| Consumption (KWH) rows with missing values | 676 |
| After dropping rows with missing values | 20,999,324 |
| Outliers detected | 485,003 |
| After removing the outliers | 20,514,321 |
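
The row counts in Table 3 follow from two successive subtractions, which can be checked with a short sketch. The variable names are illustrative and not taken from the paper, and the script does not reproduce the authors' outlier detection method, which the table does not specify; it only verifies the arithmetic and derives the overall share of rows removed.

```python
# Row counts copied from Table 3 (variable names are hypothetical).
rows_before = 21_000_000
rows_missing = 676
rows_outliers = 485_003

# Step 1: drop rows whose consumption value is missing.
rows_after_dropna = rows_before - rows_missing

# Step 2: remove the rows flagged as outliers from what remains.
rows_after_outliers = rows_after_dropna - rows_outliers

# Share of the original rows removed by both steps combined.
removed_pct = 100 * (rows_missing + rows_outliers) / rows_before

print(rows_after_dropna)
print(rows_after_outliers)
print(f"{removed_pct:.2f}% of rows removed")
```

Both derived counts match the values reported in the table, and the two cleaning steps together remove roughly 2.3% of the original rows.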