Energy Sources, Part A: Recovery, Utilization, and Environmental Effects, Vol. 40, No. 7, pp. 862-872, 2018
Time-length calibration method of large data mining of photovoltaic power plants
When analyzing the output characteristics of a photovoltaic plant, both the time interval and the time length of the data must be considered. At present, there is no firm theoretical basis for choosing the time length. For a fixed time interval, the choice of time length strongly affects how well photovoltaic characteristics can be extracted. This paper mines features of fluctuation degree and generated energy using cluster analysis. On this basis, a daily performance coefficient is defined to characterize the data. The optimum sample size for the daily performance coefficient is estimated from principles of probability and statistics, which gives the time length of data under different permissible errors. The maximum entropy principle from information entropy theory, as the criterion that best matches the objective situation when selecting the statistical properties of a random variable, provides a method for determining the time length of data that requires no additional constraints. Taking storage capacity configuration in a photovoltaic (PV)-storage system as an example, this paper applies information entropy theory to study how storage capacity demand and information entropy each relate to the time length of data. In this way, the time length of data needed to analyze the operating characteristics of a PV-storage system is determined.
Keywords: Clustering analysis; data time length; information entropy theory; large data; optimal sample capacity estimation
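
To illustrate the idea of estimating the time length of data from a permissible error, the following is a minimal sketch and not the paper's own procedure. It assumes the textbook sample-size formula for estimating a mean, n >= (z * sigma / E)^2, applied to a pilot series of daily performance coefficients. The function name `required_days`, the confidence level, and the example values are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist, stdev


def required_days(daily_coeffs, permissible_error, confidence=0.95):
    """Estimate how many days of data are needed so that the mean daily
    performance coefficient is known to within +/- permissible_error.

    Assumption: the classical normal-approximation formula
    n >= (z * sigma / E)^2, with sigma estimated from a pilot sample.
    """
    if len(daily_coeffs) < 2:
        raise ValueError("need at least two pilot observations")
    if permissible_error <= 0:
        raise ValueError("permissible_error must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")

    sigma = stdev(daily_coeffs)  # sample standard deviation of the pilot data
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * sigma / permissible_error) ** 2)


if __name__ == "__main__":
    # Hypothetical pilot values, not taken from the paper.
    pilot = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59, 0.69]
    for err in (0.05, 0.02, 0.01):
        print(f"permissible error {err}: {required_days(pilot, err)} days")
```

Under this formula the required time length grows with the inverse square of the permissible error, so halving the error roughly quadruples the number of days needed.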