Current Status and Challenges of Smart Grid Big Data Processing Technology

SONG Yaqi, ZHOU Guoliang, ZHU Yongli
1. School of Control and Computer, North China Electric Power University, Baoding 071003, Hebei Province, China
2. National Emphases of New Energy Power System

It should be pointed out that, with current cloud platforms, real-time reception of smart grid monitoring data cannot be guaranteed. Several front-end computers can therefore be placed ahead of data access and information integration. They receive alarm information and monitoring data from the communication network in real time, and they store it temporarily when the cloud platform cannot respond.
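The front-end buffering idea can be sketched as follows. This is only an illustration under stated assumptions, not the authors' design: the class name `FrontEndBuffer`, the `send` callback standing in for the uplink to the cloud platform, and the use of `ConnectionError` to signal an unresponsive platform are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict


class FrontEndBuffer:
    """Hypothetical front-end computer: forwards readings, buffers them on failure."""

    def __init__(self, send: Callable[[Dict], None], capacity: int = 10_000):
        self._send = send  # assumed uplink to the cloud platform
        # Bounded buffer: when full, the oldest reading is dropped.
        self._pending: Deque[Dict] = deque(maxlen=capacity)

    def receive(self, reading: Dict) -> None:
        """Accept a reading from the communication network and try to forward it."""
        self._pending.append(reading)
        self.flush()

    def flush(self) -> int:
        """Forward buffered readings in arrival order; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # platform not responding: keep the data for later
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Each call to `receive` retries everything still buffered, so readings held during an outage are delivered in their original order once the platform responds again. The bounded capacity is a design choice for the sketch; a real front end would need an explicit policy for what to discard.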

[Figure: smart grid big data multi-level storage system. Recoverable labels: smart grid applications (production control system, power management, state inspection, risk assessment system, measurement system, estimation system); task management, scheduling, and monitoring; Hadoop cloud computing system; parallel data warehouse; real-time database; data access and information integration.]

In addition, data formats in the smart grid differ greatly from traditional business data and have their own characteristics. For example, fault recording and condition monitoring of power transmission and transformation equipment produce large volumes of waveform data, which is fundamentally different from traditional business data and is generated at a high rate. It is therefore necessary to study storage formats for smart grid big data that facilitate subsequent analysis and computation.

The different types of data in a smart grid environment are heterogeneous and cannot be described by existing simple data structures. Computer algorithms are relatively inefficient at processing data with complex structure but very efficient at processing homogeneous data. How to organize data into a reasonable homogeneous structure is therefore an important issue in big data storage and processing. In addition, the smart grid contains a large amount of unstructured and semi-structured data, and converting it into a structured format is a major challenge.

3.2 Real-time data processing technology

3.2.1 Timeliness of data processing

For big data, processing speed is very important. In general, the larger the data set, the longer the analysis takes.

Traditional data storage schemes are designed for a bounded data volume. Within that design range processing can be very fast, but they cannot meet the requirements of big data. The future smart grid will require real-time data processing across every link, from power generation through transmission and transformation to power consumption. Current cloud computing systems can provide fast service, but they may be affected by short-term network congestion or even a single server failure, and so cannot guarantee response time.

In-memory databases are receiving more and more attention. An in-memory database manipulates data directly in memory. Memory reads and writes are orders of magnitude faster than disk, so keeping data in memory rather than on disk can greatly improve application performance. Power systems have begun to use in-memory databases to improve real-time performance. For example, in response to last year's power shortages in some parts of China while other parts had surplus power, SAP introduced a smart meter analysis solution based on the HANA in-memory database. It aims to integrate smart grid and power user data for combined analysis, so that electricity consumption in different regions can be analyzed and preventive measures taken.
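As a small illustration of the in-memory approach, the sketch below aggregates meter readings by region in a database held entirely in RAM. It uses Python's standard `sqlite3` module with the `:memory:` target as a stand-in; it is not SAP HANA and says nothing about that product's interface. The table name, column names, and sample readings are made up for the example.

```python
import sqlite3


def regional_consumption(readings):
    """Return total kWh per region, computed in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")  # the database exists only in RAM
    try:
        conn.execute(
            "CREATE TABLE meter_reading (region TEXT, meter_id TEXT, kwh REAL)"
        )
        conn.executemany("INSERT INTO meter_reading VALUES (?, ?, ?)", readings)
        rows = conn.execute(
            "SELECT region, SUM(kwh) FROM meter_reading "
            "GROUP BY region ORDER BY region"
        ).fetchall()
        return dict(rows)
    finally:
        conn.close()


# Made-up sample readings: (region, meter id, energy in kWh)
sample = [("north", "m1", 12.5), ("north", "m2", 7.5), ("south", "m3", 4.0)]
totals = regional_consumption(sample)
```

Comparing per-region totals of this kind is the sort of query that could reveal a shortage in one region alongside a surplus in another.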

Querying by keyword in big data sets is also an important challenge. Finding matching records by scanning the entire data set is clearly not feasible, and even when the scan is accelerated with parallel processing technologies such as MapReduce it is not a reasonable approach. Building an index structure in advance to assist lookups is faster and saves system resources. At present, general-purpose index structures support only some simple data types. Big data requires suitable index structures for data with complex structure, which is also a huge challenge. For example, the volume of multidimensional data collected by the Internet of Things keeps growing while queries must still meet time limits, so the index structure has to be updated continuously, which makes index design very challenging.

The following analyzes the challenges of smart grid big data processing from the aspects of power generation, power transmission and transformation, and power consumption.
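To make the contrast between a full scan and a pre-built index concrete for the query problem discussed in this subsection, here is a minimal sketch of a sorted index over timestamps. The class `TimeIndex` and its methods are hypothetical names; it handles a single dimension only and is not a solution to the multidimensional case the text describes.

```python
from bisect import bisect_left, bisect_right, insort


class TimeIndex:
    """Hypothetical sorted index over (timestamp, record_id) pairs."""

    def __init__(self):
        self._entries = []  # kept sorted by timestamp, then record id

    def add(self, timestamp: float, record_id: int) -> None:
        """Insert a new entry; the index is updated as data arrives."""
        insort(self._entries, (timestamp, record_id))

    def range_query(self, start: float, end: float) -> list:
        """Return record ids with start <= timestamp <= end without a full scan."""
        lo = bisect_left(self._entries, (start, float("-inf")))
        hi = bisect_right(self._entries, (end, float("inf")))
        return [rid for _, rid in self._entries[lo:hi]]
```

Binary search finds the boundaries of the range in logarithmic time, so only the matching entries are touched. Insertion into a Python list still shifts elements, which is one reason continuously updated indexes are usually built on tree structures instead.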

3.2.2 Power generation links

Power generation companies are characterized by a continuous and highly automated production process that requires real-time monitoring of the entire process, high-speed real-time data processing, long-term storage of historical data, and integration and sharing of production information. Studies have shown that a delay of more than 50 ms during normal operation of a SCADA system can lead to an erroneous control strategy. Studies have also shown that SCADA systems can fail when using TCP/IP, the most common protocol in the Internet environment, mainly because TCP's flow control and error correction introduce data delays. Future smart grid solutions will require real-time response even when a node fails. Current relational database systems and cloud computing systems are designed to handle persistent, stable data. Relational databases emphasize maintaining data integrity and consistency, and cloud computing systems emphasize reliability and scalability. Both find it difficult to take into account the timing constraints of the data and its processing, and so cannot meet the real-time needs of industrial production management.
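The 50 ms bound mentioned above can be expressed as a simple timing constraint on each measurement. The sketch below is an assumption-laden illustration: the `Measurement` fields and function names are hypothetical, and it assumes sender and receiver clocks are synchronized, which a real deployment would have to ensure separately.

```python
from dataclasses import dataclass

MAX_DELAY_S = 0.050  # the 50 ms bound quoted in the text


@dataclass(frozen=True)
class Measurement:
    tag: str            # hypothetical point name
    value: float
    sent_at: float      # seconds, sender clock
    received_at: float  # seconds, receiver clock (assumed synchronized)


def is_stale(m: Measurement, max_delay_s: float = MAX_DELAY_S) -> bool:
    """True if the measurement arrived later than the allowed delay."""
    return (m.received_at - m.sent_at) > max_delay_s


def usable(measurements):
    """Keep only measurements that met the timing constraint."""
    return [m for m in measurements if not is_stale(m)]
```

Filtering stale values before they reach the control logic is one way to avoid acting on outdated data; it does not reduce the delay itself.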

3.2.3 Transmission and transformation links

State monitoring of transmission and transformation links places high demands on the performance, and in particular the real-time performance, of data storage and processing platforms. Although cloud computing technology can handle big data effectively, the storage and access performance for massive monitoring data on cloud platforms must be improved further to meet real-time requirements. Previous large-scale power outages were initially triggered by environmental factors, such as line trips caused by strong winds. The monitoring scope of existing SCADA systems is limited to the main system parameters, and the lack of information on the health of the important equipment that makes up the system makes it difficult for operators to respond correctly when an accident occurs. The future smart grid requires a fault self-healing function, so the SCADA system must have monitoring data for the entire network, including the status data of power equipment. This places higher demands on the platform's real-time processing.

The unstable output of new green energy generation causes fluctuations in the power grid, which puts great pressure on grid dispatch as a whole. At present, grid dispatch and control models cannot cope with the fluctuating and unpredictable behavior of such a large number of small generation systems. Recent research indicates that supporting this situation requires a new type of grid condition monitoring system that can track the real-time state of the grid at a finer granularity. Future SCADA systems will therefore need to process, in real time, several orders of magnitude more monitoring data than current ones.

3.2.4 Power consumption links

In the smart grid environment, a household may be equipped with a variety of energy and power monitoring devices in order to use electricity at low cost and match the load of the power grid. For example, an electric water heater may choose to run during off-peak hours at night, and an air conditioner may adjust itself automatically in real time based on parameters such as user comfort, electricity price, and grid load. To a certain extent, the SCADA system can be said to have entered the ordinary home, and real-time data processing in the power consumption link is becoming more and more important.
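The water heater example amounts to choosing run times from a price signal. A minimal sketch, assuming the household receives one price per hour of the day (the function name `cheapest_hours` and the tariff values are hypothetical):

```python
def cheapest_hours(prices, hours_needed):
    """Pick the cheapest hours of the day; ties go to the earlier hour.

    prices: one price per hour, index 0 = 00:00-01:00.
    Returns the chosen hour indexes in chronological order.
    """
    if not 0 <= hours_needed <= len(prices):
        raise ValueError("hours_needed must be between 0 and len(prices)")
    ranked = sorted(range(len(prices)), key=lambda h: (prices[h], h))
    return sorted(ranked[:hours_needed])


# Made-up two-level tariff: cheap from 22:00 to 07:00, expensive otherwise
tariff = [0.30] * 7 + [0.60] * 15 + [0.30] * 2
heater_schedule = cheapest_hours(tariff, 2)
```

A real controller would also have to respect comfort constraints and react to grid load, as the text notes for the air conditioner case.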

3.3 Heterogeneous Multi-Data Source Processing Technology

3.3.1 Integration of Heterogeneous Information

The future smart grid requires that multiple links, such as power generation, transmission, transformation, distribution, consumption, and dispatch, achieve comprehensive information collection, smooth transmission, and efficient processing, and support a high degree of integration of power flow, information flow, and business flow. The primary task is therefore to integrate large-scale, multi-source heterogeneous information and to provide the smart grid with a data center in which resources are intensively allocated. For massive heterogeneous data, how to build a model that standardizes its representation, how to fuse data on the basis of that model, and how to store and query it efficiently are urgent problems.

Most power grid information systems were built around the needs of a single business or department, so they differ in platform, application system, and data format. As a result, information and resources are scattered and seriously heterogeneous, horizontal sharing is difficult, and so is vertical connection between upper and lower levels. For example, the power system contains various information systems, such as monitoring, energy management, distribution management, and market operations. Most are independent of each other and cannot share data and information. Using a cloud platform to integrate these independent systems can make information interoperable among them.
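One common way to make such isolated systems interoperable is to map each system's records onto a shared model. The sketch below illustrates the idea with two invented source formats; the field names (`dev`, `u_kv`, `ts_ms`, `equipment`, `voltage_v`, `time`) and the `Reading` model are assumptions for the example and do not come from any real system or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical common model: one unit and one time base for all sources."""
    device_id: str
    quantity: str
    value: float     # volts
    timestamp: int   # Unix time in seconds


def from_monitoring(rec: dict) -> Reading:
    """Assumed monitoring-system record: kilovolts, epoch milliseconds."""
    return Reading(rec["dev"], "voltage", rec["u_kv"] * 1000.0, rec["ts_ms"] // 1000)


def from_distribution(rec: dict) -> Reading:
    """Assumed distribution-management record: volts, epoch seconds."""
    return Reading(rec["equipment"], "voltage", float(rec["voltage_v"]), int(rec["time"]))
```

Once every source has an adapter of this kind, downstream storage and queries only need to understand the common model.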

In addition, smart grid infrastructure is large in scale, numerous, and distributed across different locations. For example, the State Grid Corporation's information platform establishes two-tier data centers at the company's headquarters and the provincial companies, and supports three tiers of applications: corporate headquarters, provincial companies, and prefectural and county companies. How to manage this infrastructure effectively and reduce data center operating costs is a huge challenge.

3.3.2 Efficient management of various types of grid data

For heterogeneous multi-source information fusion and management in smart grids, it is necessary to establish the IEC 61970 information interoperability model. Because the smart grid involves more data types than IEC 61850 covers, candidate approaches include applying multi-layered knowledge structures and semantics and establishing domain-oriented analysis models and semantics-based service models. Statistical learning theory, support vector machines, relevance vector machines, and association rule mining can be used to study integration schemes for heterogeneous data fusion and mining, as well as real-time mining algorithms. Because the deterioration of equipment condition is a process of quantitative change leading to qualitative change, mining time series data accumulated over many years, such as oil chromatography data, is especially meaningful. Although this type of big data mining has produced some research results, it is not yet practical enough.

3.4 Big Data Visualization Analysis Technology

Faced with massive amounts of smart grid data, presenting it to users intuitively and understandably within limited screen space is a very challenging task. Visualization has proved to be an effective approach to large-scale data analysis and is widely used in practice. The applications of the smart grid generate large-scale data sets, including high-precision, high-resolution data, time-varying data, and multivariate data, and a typical data set can reach the terabyte scale. Extracting useful information quickly and effectively from such large and complex data is a key technical difficulty in smart grid applications. Visualization converts data into high-precision, high-resolution images through a series of sophisticated algorithms and provides interactive tools that make effective use of the human visual system, allowing data processing and algorithm parameters to be changed in real time so that the data can be observed and analyzed both qualitatively and quantitatively.
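One simple technique for fitting a long time series into limited screen space is min/max downsampling: the series is split into as many buckets as there are pixel columns, and each bucket is drawn as the range between its minimum and maximum. The sketch below is a generic illustration of that technique, not a method proposed in this paper; the function name is hypothetical.

```python
def minmax_downsample(values, buckets):
    """Reduce a series to one (min, max) pair per bucket.

    Keeping both extremes means short spikes remain visible after reduction.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(values)
    if n == 0:
        return []
    buckets = min(buckets, n)  # never create empty buckets
    result = []
    for b in range(buckets):
        start = b * n // buckets
        end = (b + 1) * n // buckets
        chunk = values[start:end]
        result.append((min(chunk), max(chunk)))
    return result
```

For a waveform of millions of samples drawn on a display a few thousand pixels wide, this keeps the amount of data sent to the renderer proportional to the screen width rather than to the data size.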

The challenges in this area include the scalability of visualization algorithms, parallel image synthesis algorithms, and the extraction and display of important information.

4 Conclusions

The future smart grid will be a panoramic, real-time grid that relies on big data processing and analysis technologies. Cloud computing provides a storage and analysis platform for this heterogeneous and diverse data, and after such a platform has run for some time it will inevitably accumulate big data. The cloud platform and big data analysis will support condition-based maintenance of power equipment, self-healing power grids, and the integration of isolated information systems. With advantages such as low cost, good scalability (effectively unlimited storage capacity), high reliability, and parallel analysis, this approach has become an important candidate, and several systems have been put into operation around the world. Many challenges remain, however, in real-time performance, data consistency, privacy, and security, and corresponding solutions need to be found. Big data processing technology is still immature and remains to be explored.
