
- Journal of Electronic Science and Technology
- Vol. 22, Issue 2, 100250 (2024)
1 Introduction
An important aspect of modern power inspection is detecting the state of equipment on transmission lines. Traditional edge detection cannot accurately resolve the overlapping structures in complex insulator images. Insulator detection based on deep learning is more efficient, but deep learning needs a sufficiently large data set as the training basis before it can judge the status of samples accurately in actual scenes. Data in the field of power grid transmission lines is often captured in high-risk scenarios, so it is difficult to obtain and the data amount is low. Data augmentation based on the traditional generative adversarial network can generate nearly real samples by learning from a large number of samples, but the traditional algorithms demand too much data and can hardly meet the augmentation requirements of insulators. An enhancement scheme based on the cycle generative adversarial network (Cycle-GAN) has low requirements on the initial data. Combined with the characteristics of insulator data, it can expand the corresponding types of data through mutual conversion, which makes it well suited to the insulator data augmentation task.
When converting between images with very different data distributions (such as oil paintings and landscape photographs), the traditional Cycle-GAN can realize the conversion within a few training batches, because the distributions of the two image data sets are simple: the conversion does not need to attend to many target details in the image and can learn the distribution law from the entire image sample. When applied to insulator data sets, however, the traditional generator is no longer suitable, because the complex background information of the insulator must be preserved.
This paper studies insulator image augmentation based on the generative adversarial network (GAN) in the transmission environment. A survey of existing image sample augmentation methods shows that most of them still have problems when applied to insulator samples, such as poor sample quality and generated feature information that contradicts reality. In addition, the instability of GAN training and the uncontrollability of the generated images must also be addressed. This paper therefore carries out the following work around these issues: an attention mechanism based on the Cycle-GAN structure, combining a channel attention mechanism and a self-attention mechanism, is designed. The attention maps guide the generator to process the sample region, largely preserve the true background of the sample, and generate images consistent with the distribution of insulator samples in the real environment.
2 Related works
2.1 Data augmentation review
Traditional image data augmentation generally operates on a single image; its operations include flipping, cropping, erasing, blurring, adding noise, scaling, affine transformation, histogram equalization, and wavelet transformation. Blending-based generation falls into two areas: pixel-level blending and layer blending. Augmentation that expands the learned data distribution developed after the birth of generative models, with the generative adversarial network as the main line of development. For pixel-level mixing, Tokozume et al. extended between-class learning (BC learning) [1], originally a linear mixing method for sound clips, to the image domain, improving image classification performance. Layer blending overlays images in the form of patches and is often combined with augmentation methods such as random cropping. The random image cropping and patching (RICAP) method proposed by Takahashi et al. is a layer-mixing method, verified on the Canadian Institute for Advanced Research 10/100 (CIFAR-10/100) data sets [2]: on CIFAR-10 the classification error rate was reduced from 3.89% to 2.95%, and on CIFAR-100 from 18.86% to 17.45%.
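As a concrete illustration of the single-image operations listed above, the sketch below applies flipping, random cropping, and additive noise with NumPy. The crop ratio, noise level, and function name are illustrative assumptions, not values from this paper.

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Classical single-image augmentation: random flip, random crop, noise."""
    out = img
    if rng.random() < 0.5:                       # horizontal flip, half the time
        out = out[:, ::-1]
    # random crop to 7/8 of each spatial dimension (assumed ratio)
    h, w = out.shape[:2]
    ch, cw = h * 7 // 8, w * 7 // 8
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    out = out[top:top + ch, left:left + cw]
    # additive Gaussian noise, clipped back to the valid pixel range
    out = np.clip(out + rng.normal(0, 5, out.shape), 0, 255)
    return out.astype(img.dtype)

rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(64, 64, 3)).astype(np.uint8)
aug = augment(sample, rng)
print(aug.shape)  # (56, 56, 3)
```

In practice these operations are usually chained with random parameters per epoch so that each pass over the data set sees slightly different samples.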
2.2 Generative adversarial network data augmentation
Augmentation applications based on the GAN [3] benefit from the development of GAN itself. Early on, Antoniou et al. proposed the data augmentation GAN (DAGAN) to address sparse data sets [4]. By learning the distribution of real data from a small number of real samples, the generator produces fake data matching that distribution, and the validity of the augmented data was verified on multiple public data sets: the classification accuracy on the Omniglot data set [5] increased by 13% to 82%, and on the extended mixed national institute of standards and technology (EMNIST) data set [6] by 2.1%, from 74% to 76.1%. In practical applications, Frid-Adar et al. [7] applied the deep convolutional GAN (DCGAN) to expanding a data set of liver computed tomography (CT) images. With a small number of real samples as the learning basis, a large number of labeled samples were synthesized, and on the new data the sensitivity and specificity of the test data set increased by 7.1% and 4%, respectively. Han et al. [8] generated a tumor image data set of brain CT images on the basis of the progressively growing GAN (PG-GAN) framework, trained and tested on a series of lesion CT samples such as cysts and thrombosis, and used the You Only Look Once Version 3 (YOLOv3) [9] target detection algorithm to verify the generated data, obtaining a 3% increase in the mean average precision (mAP) index and a 9.9% increase in the sensitivity index. In addition, Zhu et al. [10] proposed using Cycle-GAN to augment an expression image data set, converting training data into samples of other expression categories through the generator, which makes data generation feasible when a category of the data set lacks samples.
3 Method
3.1 Generator design
The method used in this paper is based on Cycle-GAN [11–13]. Cycle-GAN takes images as the input, starts from the real data, converts according to real characteristics, retains the complex background environment, learns the difference between insulator samples in the two distributions, and masters the main structural characteristics of the insulator in the image. The color-gamut transformation within the structural features of interest is achieved by the generator structure designed in this method, which incorporates two attention modules that produce the attention focus area during the conversion process. The input image first passes through two attention channels, A and B, which adopt two different attention mechanisms: the channel attention mechanism and the self-attention mechanism. The self-attention mechanism performs well on correlations among global features, and the channel attention mechanism performs well on the degree of correlation between channels. After the two attention networks, connected in parallel, generate their respective attention maps, a convolutional layer with shared parameters transforms them into a black-and-white distribution map containing the attention map. This map is superimposed on the original input image as an occlusion, so that the regions of interest learned by the attention network are preserved in the attention distribution map. The superimposed image is used as the input for generation; after the generator conversion, it is superimposed with the low-weight part of the attention layer learned before, generating a new sample in which the background information of the image is largely kept and only the information of the main insulator is transformed. The network structure integrated with the attention module is shown in Fig. 1.
Figure 1.Structure of attention Cycle-GAN.
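The masking-and-recombination step described above (occlude with the attention map, convert the focused content, then paste the low-weight background back) can be sketched as follows. The toy `generator` and the square attention mask are hypothetical stand-ins for the paper's trained networks.

```python
import numpy as np

def attention_composite(x, attn_map, generator):
    """Blend per the attention mask: only the attended region is translated,
    the rest of the image is copied from the input unchanged.
    `attn_map` holds one weight per pixel, with values in [0, 1]."""
    focused = attn_map * x                     # occlude the background
    converted = generator(focused)             # translate the focused content
    return attn_map * converted + (1.0 - attn_map) * x

# toy check with a hypothetical "generator" that inverts pixel values
x = np.random.default_rng(1).random((8, 8, 3))
attn = np.zeros((8, 8, 1)); attn[2:6, 2:6] = 1.0   # assumed square region of interest
y = attention_composite(x, attn, generator=lambda t: 1.0 - t)
assert np.allclose(y[0, 0], x[0, 0])               # background pixel preserved
assert np.allclose(y[3, 3], 1.0 - x[3, 3])         # attended pixel converted
print("composite ok")
```

The key design property, visible in the two assertions, is that pixels with zero attention weight pass through untouched, which is what lets the method keep the cyclic consistency loss of the background at zero.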
3.1.1 Fusion attention module
In the attention module, the channel attention mechanism and the self-attention mechanism are adopted simultaneously. The inputs of module A and module B both use the convolutional feature matrix with dimension
Figure 2.Attention module structure diagram.
In the dual attention module, channel attention will convolve the input feature convolved eigenmatrix
3.1.2 Channel attention mechanism
The representative model of the channel attention mechanism is squeeze-and-excitation networks (SENet) [14]. By learning from the samples, the model automatically obtains the importance of different feature channels from the sample distribution, then strengthens the influence of the features with greater weight according to the learned importance while suppressing the features that contribute little to the current training task. Its function is to give the network the interdependence between the convolutional feature vectors of different channels, so that a network segment equipped with the channel attention mechanism can suppress secondary information and enhance the main information. In the feature dimension, a common convolution operation fuses the features of all channels by default. The channel attention structure is shown in Fig. 3.
Figure 3.Channel attention network diagram.
In the channel attention module above,
where the * in the formula represents the convolution operation.
where Fsq stands for global average pooling,
After the data features are compressed, the gathered information must still be used to excite the subsequent learning process so that the channel dependencies can be fully fitted. After the features of the C one-dimensional channels are obtained by compression, the influence weight of each channel is learned through a fully connected layer and then applied to the corresponding path of the input features, as shown in the formula:
where σ represents the sigmoid function. Fex represents a weight value for each feature channel. ReLU() is a commonly used activation function in artificial neural networks.
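The squeeze-excite-rescale pipeline above can be sketched in a few lines. The matrix sizes, the reduction ratio, and the plain fully connected weights are assumptions for illustration; a real SE block learns these weights during training.

```python
import numpy as np

def relu(v): return np.maximum(v, 0.0)
def sigmoid(v): return 1.0 / (1.0 + np.exp(-v))

def se_block(feat, w1, w2):
    """Squeeze-and-excitation on a (C, H, W) feature map:
    squeeze = global average pooling, excite = FC -> ReLU -> FC -> sigmoid,
    then rescale each channel by its learned weight."""
    z = feat.mean(axis=(1, 2))            # squeeze: one scalar per channel, (C,)
    s = sigmoid(w2 @ relu(w1 @ z))        # excite: per-channel weights in (0, 1)
    return feat * s[:, None, None]        # reweight every channel

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                   # r is the assumed reduction ratio
feat = rng.random((C, H, W))
w1 = rng.standard_normal((C // r, C))     # channel-reducing FC layer
w2 = rng.standard_normal((C, C // r))     # channel-restoring FC layer
out = se_block(feat, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Because the sigmoid output lies strictly in (0, 1), the block can only attenuate channels relative to the input, which is how secondary channels are suppressed while important ones are (relatively) enhanced.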
3.1.3 Self-attention mechanism
In order to balance the conversion between the main insulator feature region and the background region, a self-attention mechanism is introduced; its implementation architecture is shown in Fig. 4. The same vector containing the image information is input into the different feature converters f, g, and h in Fig. 4, and the area of attention is calculated from f(x), g(x), and h(x). The three conversion modules are all 1×1 convolutions; the difference is that the number of channels of the three paths is inconsistent, and in the Transformer they are called the query, key, and value, respectively. Since the convolution operation sets parameters such as the stride and kernel size, the three convolutions can reduce the number of image channels. An activation function is usually added at the end of this process, which introduces more nonlinear transformations and enhances the neural network's ability to express nonlinear distributions.
Figure 4.Self-attention network diagram.
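A 1×1 convolution acts as an independent linear map at each spatial position, so the query/key/value paths can be sketched with plain matrix products. All sizes and the channel-reduction factor here are illustrative assumptions.

```python
import numpy as np

def softmax(v, axis=-1):
    e = np.exp(v - v.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(feat, wq, wk, wv):
    """Self-attention over the spatial positions of a (C, H, W) feature map.
    wq/wk play the role of channel-reducing 1x1 convolutions (query/key),
    wv keeps the full channel count (value)."""
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)            # flatten spatial dims: (C, N)
    q, k, v = wq @ x, wk @ x, wv @ x      # per-position linear projections
    attn = softmax(q.T @ k, axis=-1)      # (N, N) affinity between positions
    out = v @ attn.T                      # aggregate values over all positions
    return out.reshape(C, H, W)

rng = np.random.default_rng(0)
C, Cr, H, W = 8, 2, 4, 4                  # Cr: assumed reduced channel count
feat = rng.random((C, H, W))
wq = rng.standard_normal((Cr, C))
wk = rng.standard_normal((Cr, C))
wv = rng.standard_normal((C, C))
out = self_attention(feat, wq, wk, wv)
print(out.shape)  # (8, 4, 4)
```

The (N, N) affinity matrix is what gives self-attention its global reach: every output position is a weighted mixture of all input positions, unlike a local convolution.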
3.1.4 Attention loss function
In the structure of Fig. 1, the attention module is denoted by A and the input by x. When x is input to A, the output A(x) has the same size as the original input, with a single channel whose values lie between 0 and 1. Based on the difference in the distribution of the given data, the attention module generates an attention layer A(x) by assigning more weight to regions of interest while reducing the weight of the remaining regions. In the other branch, the generator G takes as input the temporary sample produced by overlaying the input image x with the attention layer A(x) and converts the region of interest, and the final image is generated as
where ⨀ represents element-wise multiplication. The mapping F is introduced on the other side of the model, so that when the transformed image is mapped back to the original space, its distribution still belongs to the original domain, as shown in the formula:
The expression of the F map is as follows:
In other conversion networks, the generator G converts the entire image to the target domain, and the generator F then restores the converted sample to the source domain. As a result, the background of the generated image appears very unreal and differs greatly from the background of the original image; it is barely recognizable, and the cyclic consistency loss is difficult to drive to 0. In the method proposed in this paper, the generator converts under the constraint of the attention image: the input image is divided into a focus area and a non-focus area, and the non-focus area is retained, so the cyclic consistency loss of the output sample in the background part is exactly 0 and only the attention area is optimized.
The training process is similar to the cyclic consistency network. The input into the attention module predicts the attention distribution
Similarly, for data Y from the y domain, the mapping of the attention module should also satisfy cyclic consistency, as shown in the following formula:
To satisfy the above formula, this paper sets the cyclic consistency loss function as
where
On the basis of cyclic consistency, the attention network is also expected to focus on small areas related to the main features rather than the entire image, so as to avoid failure of the attention module. Therefore, a sparse loss is introduced as shown
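Since the loss formulas are truncated in this copy of the text, the sketch below assumes the common L1 form of the cyclic-consistency term and a mean-activation form of the sparsity term; both forms are assumptions, chosen because they match the behavior described above (a perfect round trip costs nothing, and small attended regions cost less than large ones).

```python
import numpy as np

def cycle_loss(x, x_rec):
    """L1 cyclic-consistency loss between an input and its round-trip
    reconstruction F(G(x)); zero when the round trip is perfect."""
    return np.abs(x - x_rec).mean()

def sparse_loss(attn_map):
    """Sparsity penalty on the attention map: its mean activation.
    Minimizing it pushes attention onto small regions, not the whole image."""
    return attn_map.mean()

x = np.full((4, 4, 3), 0.5)
x_rec = x.copy()
attn = np.zeros((4, 4, 1)); attn[1:3, 1:3] = 1.0   # attend to a quarter of the pixels
assert cycle_loss(x, x_rec) == 0.0                 # perfect round trip costs nothing
print(sparse_loss(attn))                           # 0.25
```

The two terms pull in opposite directions: cyclic consistency rewards copying the input, while the sparse loss penalizes attending everywhere, and their balance determines how tightly attention shrinks onto the insulator.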
3.2 Defect insulator generation network based on transfer learning
The parameters of a network trained on intact insulators also have certain applicability to defective insulators. The distributions of defective and non-defective samples largely share a common subset, but defective insulator samples are very scarce, so some difference between the data distributions of the two domains is inevitable. Therefore, based on this difference and the scarcity of defect samples, a feature regeneration module and a transfer-discriminator compensation module are designed to apply transfer learning to the parameters of the trained insulator conversion network, so that the network can not only transform different types of samples but also generate local defects according to the attention distribution map of the insulator during the conversion process. To address the lack of defective insulator samples, the transfer network structure combined with prior knowledge is shown in Fig. 5.
Figure 5.Transfer training GAN structure diagram.
In Fig. 5, the compensation module adds random local noise to the attention distribution map generated by the attention module, disturbing the region of interest of the input sample and disrupting local features of the input image. The disrupted local features are then regenerated to match the feature distribution of the defect position on a defective insulator. The discriminators Dx and Dy from the previous stage are still retained in the network; during training they are no longer updated, and only the new discriminator D1 is updated. Gy and Dy are retained to constrain the compensated generator module, so that the generation ability of the compensation module does not affect the Gx and Gy already trained in the previous stage. The generators Gx and Gy can still convert the sample category, and their parameters are no longer updated.
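The noise-injection step of the compensation module can be sketched as below. The patch size, its uniform-noise distribution, and the function name are assumptions for illustration; the paper does not specify these details in this copy.

```python
import numpy as np

def perturb_attention(attn_map, rng, patch=4):
    """Compensation-module sketch: overwrite one local patch of the attention
    map with random noise, so the generator must re-synthesize that region.
    That regenerated region is where a local defect can be introduced."""
    h, w = attn_map.shape[:2]
    top = rng.integers(0, h - patch + 1)       # random patch location
    left = rng.integers(0, w - patch + 1)
    out = attn_map.copy()
    noise = rng.random((patch, patch) + attn_map.shape[2:])
    out[top:top + patch, left:left + patch] = noise
    return out

rng = np.random.default_rng(0)
attn = np.ones((16, 16, 1))                    # toy attention map, all attended
disturbed = perturb_attention(attn, rng)
print(int(np.sum(disturbed != attn)))          # 16: one 4x4 patch overwritten
```

Restricting the perturbation to a small patch mirrors the design intent above: only a local region is disrupted and regenerated, while the rest of the attention map, and hence the rest of the image, is left for the frozen generators to handle as before.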
4 Experiment
To verify the feasibility of the insulator sample conversion method described in section 3 and to assess the effect of sample generation, an experimental software platform was built, and insulator data was collected from a specific transmission line scene. The effectiveness of the algorithm was proved by comparing the samples. GPU acceleration is required in this experiment, and the data running platform is shown in Table 1.
Device name | Model number | Quantity |
CPU | i7-12700K | 1 |
GPU | RTX 3080 12 GB | 1 |
Mainboard | ROG STRIX Z690-E | 1 |
RAM | DDR5 16 GB | 2 |
Hard disk | M.2 solid-state drive, 1 TB | 1 |
Table 1. List of experimental equipment.
4.1 Insulator sample conversion experiment
For different types of data above, the experiment sets corresponding conversion experiments and compares the peak signal-to-noise ratio (PSNR) and the structure similarity index measure (SSIM) indexes of the converted data with the real data in the corresponding target domain with Cycle-GAN [15] and Distance GAN [16]. In this experiment, 1000 samples of different kinds were used for mutual conversion training. The detailed results are shown in Table 2.
Index | Task | Cycle-GAN | Distance GAN | Ours |
PSNR | Glass→Ceramic | 18.224 | 12.139 | 24.543 |
 | Ceramic→Glass | 18.190 | 11.940 | 23.939 |
 | Glass→Composite | 17.935 | 211.940 | 23.446 |
 | Composite→Glass | 17.990 | 12.7262 | 23.105 |
SSIM | Glass→Ceramic | 0.687 | 0.263 | 0.938 |
 | Ceramic→Glass | 0.703 | 0.291 | 0.923 |
 | Glass→Composite | 0.683 | 0.278 | 0.921 |
 | Composite→Glass | 0.690 | 0.280 | 0.913 |
Table 2. Comparison of sample conversion indicators of Cycle-GAN insulators based on the attention mechanism.
In Table 2, PSNR denotes the peak signal-to-noise ratio; the larger the value, the better the image quality. From Table 2, the indexes of the samples transformed by the model designed in this paper are significantly better than those of Cycle-GAN and Distance GAN. SSIM denotes the structural similarity, which measures how similar two images are; its value lies in [0, 1], tending to 1 as the similarity increases and to 0 otherwise. With the model structure in this paper, both the quality of sample conversion and the degree of background information retained after conversion are clearly superior to Cycle-GAN and Distance GAN in the insulator sample conversion task.
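For reference, PSNR is computed from the mean squared error between two images; a minimal sketch (assuming 8-bit images with peak value 255) is below. SSIM is more involved; in practice a library implementation such as `skimage.metrics.structural_similarity` is typically used rather than a hand-rolled one.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means `test` is closer to `ref`."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.full((8, 8), 128.0)
noisy = ref + 2.0                       # uniform error of 2 grey levels
print(round(psnr(ref, noisy), 2))       # 42.11
```

Because PSNR depends only on pixel-wise error, it rewards the background preservation of the attention-based method directly, which helps explain the gap over plain Cycle-GAN in Table 2.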
4.2 Generate image effect comparison
In this paper, the conversion augmentation of different types of insulators is compared on experimental transmission line insulator data collected in mountainous areas. The overall effect is shown in Fig. 6.
Figure 6.Example of converting glass insulators into composite insulators.
As shown in Fig. 6, in the training of glass-to-composite conversion, the insulator samples achieve accurate conversion to the specific target, but some of the attention is distributed over the background area. Analyzing the data samples, a possible reason is that the background features of the two data sets are too similar, so the background region became a main feature for the attention network to recognize during learning. Appropriately adding samples with different background styles to the network could reduce the weight of background region recognition.
The conversion test from composite insulators to glass insulators in Fig. 7 is analyzed next. When an insulator sample has a large background area similar in color gamut and structure to the insulator, the attention network also recognizes the background as the insulator. Here the color gamuts of the two samples are too close, and the background also has a hierarchical feature resembling an insulator, which is not the case in other scenes.
Figure 7.Example of converting composite insulators into glass insulators.
Fig. 8 shows ceramic insulators converted into glass insulators. The color-gamut features are well converted, but the gloss of a glass insulator is a detail that the current network model cannot generate; for samples with strong reflections, such as those in the first row of Fig. 8, the details of glass insulators are difficult to generate with existing models. During learning, the attention module can accurately locate the shape features of the ceramic insulator. However, when the edge of the insulator is highly similar to the edge of the background in a small number of samples, the recognition area spreads to the background adjacent to the insulator. Given the current conversion effect, and since glass insulators far outnumber ceramic insulators, the ceramic insulator data can be further expanded by converting glass insulator data into ceramic insulators.
Figure 8.Example of converting ceramic insulators into glass insulators.
In the glass-to-ceramic insulator training shown in Fig. 9, the insulator samples achieve accurate conversion to the specific target, but some of the attention is concentrated in the background region. Analyzing the data samples, a possible reason is again that the background features of the two data sets are too similar, so the background region became a main feature for the attention network during learning; appropriately adding samples with different background styles to the network could reduce the weight of background region recognition.
Figure 9.Example of converting glass insulators into ceramic insulators.
4.3 Insulator defect sample generation experiment
In the defect sample generation experiment, the generation of defective composite insulators from glass insulators and its reverse, and the generation of defective ceramic insulators from glass insulators and its reverse, are experimentally verified. Using 2000 real samples, the generation experiments were conducted separately, and the generated samples were evaluated against the real data set samples with the SSIM, PSNR, and Fréchet inception distance (FID) parameters.
Comparing the indicators in Table 3, the transfer generation model designed in this paper significantly outperforms the deep convolutional GAN [17] and the Wasserstein GAN with gradient penalty (WGAN-GP) [18,19] on FID, the parameter that evaluates the distance between the distributions of generated and real samples. The analysis of the signal-to-noise ratio and structural similarity indexes shows that the attention network and the insulator conversion module trained with prior knowledge also play a positive role in ensuring structural similarity and image quality during generation: the method maintains the PSNR and SSIM levels achieved by the conversion network model.
Index | DCGAN | WGAN-GP | Ours |
FID | 25.33 | 18.78 | 12.54 |
PSNR | 15.46 | 19.12 | 23.76 |
SSIM | 0.58 | 0.67 | 0.92 |
Table 3. Comparison of defect sample indicators based on background PSNR, SSIM, and FID.
As shown in Fig. 10, in the mutual generation of glass and ceramic insulators, the attention network can identify and position the insulator accurately. Defects occur in the process of generating glass insulators from composite insulators, possibly because the composite insulator samples used in training are homogeneous and their backgrounds highly repetitive. The background complexity can be increased by appropriately converting other samples into the composite insulator sample set to improve the network's generalization ability.
Figure 10.Example of defect sample generation.
5 Conclusions
To address the difficulty of obtaining insulator data in the power grid field, this paper proposes an attention-assisted method for generating samples through adversarial network conversion. The method makes full use of the existing data set and offers a direction for expanding insulator data. Verified by comprehensive data indexes, it provides a favorable reference for expanding transmission line insulator data. However, the current method cannot completely solve the insulator data problem in the power grid field: the generated data still has flaws when the color gamuts are similar. Subsequent research can consider combining edge detection techniques to obtain features and generate new samples.
Disclosures
The authors declare no conflicts of interest.
References
[1] Y. Tokozume, Y. Ushiku, T. Harada, Learning from between-class examples for deep sound recognition, in: Proc. of the 6th Intl. Conf. on Learning Representations, Vancouver, Canada, 2018, pp. 1–13.
[2] R. Takahashi, T. Matsubara, K. Uehara, RICAP: Random image cropping and patching data augmentation for deep CNNs, in: Proc. of the 10th Asian Conf. on Machine Learning, Beijing, China, 2018, pp. 786–798.
[3] I.J. Goodfellow, J. Pouget-Abadie, M. Mirza, et al., Generative adversarial nets, in: Proc. of the 27th Intl. Conf. on Neural Information Processing Systems, Montreal, Canada, 2014, pp. 2672–2680.
[4] A. Antoniou, A. Storkey, H. Edwards, Data augmentation generative adversarial networks [Online]. Available: https://arxiv.org/abs/1711.04340, November 2017.
[6] A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, in: Proc. of the 4th Intl. Conf. on Learning Representations, San Juan, America, 2016, pp. 1–15.
[7] M. Frid-Adar, E. Klang, M. Amitai, J. Goldberger, H. Greenspan, Synthetic data augmentation using GAN for improved liver lesion classification, in: Proc. of the 15th IEEE Intl. Symposium on Biomedical Imaging, Washington, America, 2018, pp. 289–293.
[8] C. Han, K. Murao, T. Noguchi, et al., Learning more with less: Conditional PG-GAN-based data augmentation for brain metastases detection using highly-rough annotation on MR images, in: Proc. of the 28th ACM Intl. Conf. on Information and Knowledge Management (CIKM '19), Beijing, China, 2019, pp. 119–127.
[9] J. Redmon, A. Farhadi, YOLOv3: An incremental improvement [Online]. Available: https://arxiv.org/abs/1804.02767, April 2018.
[10] X.Y. Zhu, Y.F. Liu, J.H. Li, T. Wan, Z.C. Qin, Emotion classification with data augmentation using generative adversarial network, in: Proc. of the 22nd Pacific-Asia Conf. on Knowledge Discovery and Data Mining, Melbourne, Australia, 2018, pp. 349–360.
[11] J.Y. Zhu, T. Park, P. Isola, A.A. Efros, Unpaired image-to-image translation using cycle-consistent adversarial networks, in: Proc. of IEEE Intl. Conf. on Computer Vision, Venice, Italy, 2017, pp. 2242–2251.
[12] T. Kim, M. Cha, H. Kim, J.K. Lee, J. Kim, Learning to discover cross-domain relations with generative adversarial networks, in: Proc. of the 34th Intl. Conf. on Machine Learning, Sydney, Australia, 2017, pp. 1857–1865.
[13] Z.L. Yi, H. Zhang, P. Tan, M.L. Gong, DualGAN: Unsupervised dual learning for image-to-image translation, in: Proc. of IEEE Intl. Conf. on Computer Vision, Venice, Italy, 2017, pp. 2868–2876.
[15] Z. Liang, J.X. Huang, Cycle-GAN with dynamic criterion for malaria blood cell image synthetization, in: Proc. of AMIA Joint Summits on Translational Science, Online, 2022, pp. 323–330.
[16] S. Benaim, L. Wolf, One-sided unsupervised domain mapping, in: Proc. of the 31st Intl. Conf. on Neural Information Processing Systems, Long Beach, America, 2017, pp. 752–762.
[17] B. Liu, J. Lv, X. Fan, et al., Application of an improved DCGAN for image generation, Mobile Information Systems 2022 (July 2022) 1–14.
[18] M. Arjovsky, S. Chintala, L. Bottou, Wasserstein generative adversarial networks, in: Proc. of the 34th Intl. Conf. on Machine Learning, Sydney, Australia, 2017, pp. 214–223.
[19] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, A. Courville, Improved training of Wasserstein GANs, in: Proc. of the 31st Intl. Conf. on Neural Information Processing Systems, Long Beach, America, 2017, pp. 5769–5779.
