The magnet performance research platform (MPRP) of the Comprehensive Research Facility for Fusion Technology is a large-scale experimental platform for advanced superconducting magnet experiments. Retrieval of MPRP historical data is slow because of the large volume of data stored.
This study aims to develop an MPRP data archiving system (MPDAS) and to increase the retrieval speed of its historical data.
First, an experimental physics and industrial control system (EPICS) data archiving plug-in was designed for MPDAS. The MongoDB sharding and replica set mechanisms were both employed to build a highly scalable data storage architecture. Then, MPDAS drew on the core ideas of three traditional cache replacement algorithms, LRU (least recently used), LFU (least frequently used), and FIFO (first in, first out), to establish a data temperature model based on Newton's law of cooling. A multi-dimensional feature data partitioning algorithm that integrates access time, access frequency, and storage order was implemented to identify hot and cold historical data and thus realize tiered data storage. Finally, the retrieval speed of MPDAS was improved by querying Redis first when retrieving historical data and then selecting a retrieval strategy according to the hit result and the integrity of the data returned.
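The temperature model described above can be illustrated with a minimal sketch. It is not the MPDAS implementation: the abstract does not give the actual formula, so the ambient temperature of zero, the fixed per-access boost, the cooling coefficient `k`, and the names `Heat`, `cooled`, `touch`, and `partition` are all assumptions made for illustration. In this sketch, exponential decay captures recency (the LRU idea), the boost added on each access captures frequency (the LFU idea), and the initial temperature assigned at write time, which cools from that moment on, captures storage order (the FIFO idea).

```python
import math
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class Heat:
    temperature: float  # heat of the record at time `updated`
    updated: float      # timestamp (seconds) of the last write or access


def cooled(temperature: float, elapsed: float, k: float) -> float:
    """Newton's law of cooling with ambient temperature 0 (assumed):
    T(t) = T0 * exp(-k * t)."""
    return temperature * math.exp(-k * elapsed)


def touch(heat: Heat, now: float, k: float, boost: float = 1.0) -> None:
    """Cool the record up to `now`, then add a fixed boost for this access."""
    heat.temperature = cooled(heat.temperature, now - heat.updated, k) + boost
    heat.updated = now


def partition(records: Dict[str, Heat], now: float, k: float,
              hot_fraction: float) -> Tuple[Set[str], Set[str]]:
    """Rank records by their temperature at `now`; the top fraction is hot."""
    ranked = sorted(
        records,
        key=lambda key: cooled(records[key].temperature,
                               now - records[key].updated, k),
        reverse=True,
    )
    n_hot = max(1, int(len(ranked) * hot_fraction))
    return set(ranked[:n_hot]), set(ranked[n_hot:])


# Hypothetical usage: two records written at t = 0, one of them read at t = 5.
records = {"pv:a": Heat(1.0, 0.0), "pv:b": Heat(1.0, 0.0)}
touch(records["pv:a"], now=5.0, k=0.1)
hot, cold = partition(records, now=10.0, k=0.1, hot_fraction=0.5)
```

In a tiered deployment of the kind the abstract describes, the keys in `hot` would be the candidates to keep in Redis, while the keys in `cold` would remain only in MongoDB.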
The system test results show that the functional characteristics of MPDAS meet the design requirements. When the hot database stores 1% of the historical data, the Redis hit rate of MPDAS is 38.05%, 26.91%, and 11.06% higher than that of FIFO, LRU, and LFU, respectively.
Increasing the hit rate of hot data directly reduces the average response time of data retrieval. By quantifying the heat of historical data and separating hot data from cold data, MPDAS effectively improves its retrieval response speed.
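The link between hit rate and average response time can be made explicit with a simple expected-value model. The sketch below assumes every query is either served from the hot store in time `t_hit` or falls through to the cold store in time `t_miss` (with any cache lookup overhead folded into `t_miss`). The function name and both latency parameters are hypothetical and are not measurements from MPDAS.

```python
def mean_response_time(hit_rate: float, t_hit: float, t_miss: float) -> float:
    """Expected response time: h * t_hit + (1 - h) * t_miss."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie in [0, 1]")
    return hit_rate * t_hit + (1.0 - hit_rate) * t_miss
```

Under this model, whenever `t_hit` is smaller than `t_miss`, each increase in hit rate lowers the mean response time in proportion to the gap between the two latencies.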