I. Introduction
Synthetic aperture radar (SAR) [1] is widely used to detect objects of interest. Unlike optical sensors, SAR can image a desired area regardless of time of day or weather conditions. It has been used mainly for military surveillance and reconnaissance. Because SAR data are acquired in large quantities, manually detecting and recognizing targets of interest (TOIs) has been impractical. To overcome this difficulty, automatic target detection (ATD) technology [2] has been actively pursued. Automatic detection of TOIs is therefore a very important process in SAR.
There have been many approaches [3–9] to detecting TOIs. The most popular are constant false alarm rate (CFAR)-based algorithms, which use sliding windows. CFAR-based approaches focus on the brightness and contrast of objects. CFAR with extended fractal (EF) features additionally takes the size of objects into account [10]. However, CFAR-based algorithms have two critical problems. First, they cannot distinguish between targets and clutter. Second, bright clutter generates too many false alarms (FAs). Therefore, CFAR-based algorithms require a discrimination procedure to reduce clutter. Several discrimination techniques have been proposed to remove clutter [11–13]. The MIT Lincoln Laboratory introduced a prescreening stage to eliminate natural clutter and pass man-made objects [12]; its discrimination algorithm uses fifteen features. A rank-based feature selection scheme has been proposed to obtain high discrimination performance [13]. Recently, research has progressed [14–16] to improve conventional CFAR. Nevertheless, the trade-off in ATD between the FA rate and the target detection probability is not easily resolved.
The advent of deep learning technology has made automatic target detection easier [17–20]. SAR target recognition based on convolutional neural networks (CNNs) [17] has been applied to the MSTAR public dataset. CNN-based ATD has been studied for ship detection [18]. Another ship detection study was conducted with YOLOv2 (you only look once version 2) [19]. The detection of ocean internal waves has been studied using faster regions with CNN (Faster R-CNN) [20]. However, these methods have explored only a limited range of deep learning networks and involve difficult target detection procedures.
In this paper, a deep learning-based compact weighted binary classification (DL-CWBC) is proposed to separate and classify targets and clutter in SAR images. The proposed algorithm uses the ResNet-101 network from Microsoft Research [21]. For pre-processing, conventional CFAR detects targets and clutter. The obtained clutter can be used directly as one class of a binary classifier. Targets, however, should be obtained from ground truth because conventional CFAR cannot detect all of them. Targets and clutter are then used to train ResNet-101 as a binary classifier. One of the biggest advantages of using a deep learning network for target detection is that there is no need for complicated clutter removal techniques: ResNet-101 automatically removes the clutter that generates many FAs. With a well-trained ResNet-101 and sufficient data, the deep learning approach performs better than various CFAR-based algorithms. The proposed algorithm has two key ingredients: a newly defined cross-entropy error function, which controls the trade-off between the FA rate and target detection, and an extreme distinction decision. The goal of the proposed approach is to miss no targets while keeping the FA rate as low as possible. The proposed approach has a detection rate similar to that of conventional CFAR algorithms. The main contribution of the DL-CWBC is a dramatic improvement in the clutter removal rate compared with conventional CFAR combined with a discrimination algorithm. In addition, the DL-CWBC does not miss any targets from the ground truth.
The remainder of this paper is organized as follows: a traditional CFAR-based algorithm is introduced in Section II, the ResNet-101 deep learning network is described in Section III, Section IV describes the DL-CWBC technique for discriminating targets and clutter in SAR images, Section V demonstrates the performance of the proposed technique, and Section VI concludes the paper.
II. A Conventional CFAR-Based Detection Algorithm
Traditional target detection algorithms for SAR images are usually based on CFAR. However, they need a discrimination procedure to obtain the desired results, so they form a two-stage process: a CFAR-based algorithm that detects targets and clutter, followed by a discrimination technique that reduces clutter. First, a conventional CFAR algorithm is needed to detect all targets and clutter, including FAs. A conventional CFAR algorithm is shown in Fig. 1. In a SAR image, a rectangular sliding window inspects each pixel and detects pixels with bright intensity. In the central figure, the bright pixels are marked with a zoomed square box. The detailed structure of the CFAR window is shown in the figure on the right. The value of the detection feature in the CFAR is computed by:
where I[m,n] is the intensity of pixel [m,n], and μ̂c[m,n] and σ̂c[m,n] are the mean and standard deviation of the clutter window surrounding pixel [m,n].
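The quantities defined above match the standard two-parameter CFAR statistic, (I[m,n] − μ̂c[m,n]) / σ̂c[m,n], compared against a threshold. The following minimal sketch assumes that form. The window geometry (a square ring outside a guard area), the parameter names `guard` and `clutter`, the threshold value, and the function names are illustrative assumptions and are not taken from this paper.

```python
from statistics import mean, pstdev


def cfar_feature(image, m, n, guard=1, clutter=2):
    """Two-parameter CFAR statistic at pixel (m, n).

    ASSUMPTION: the clutter window is the ring of pixels whose Chebyshev
    distance from (m, n) is greater than `guard` and at most `clutter`.
    Pixels that fall outside the image are skipped.
    """
    rows, cols = len(image), len(image[0])
    ring = []
    for i in range(max(0, m - clutter), min(rows, m + clutter + 1)):
        for j in range(max(0, n - clutter), min(cols, n + clutter + 1)):
            if max(abs(i - m), abs(j - n)) > guard:
                ring.append(image[i][j])
    mu = mean(ring)
    sigma = pstdev(ring)
    if sigma == 0:
        # Flat background: any excess over the mean counts as unbounded contrast.
        return float("inf") if image[m][n] > mu else 0.0
    return (image[m][n] - mu) / sigma


def detect_bright_pixels(image, threshold, guard=1, clutter=2):
    """Return the set of (m, n) pixels whose CFAR statistic exceeds threshold."""
    return {
        (m, n)
        for m in range(len(image))
        for n in range(len(image[0]))
        if cfar_feature(image, m, n, guard, clutter) > threshold
    }
```

Because the statistic is normalized by the local clutter statistics, a single threshold can be applied across regions with different background brightness.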
There are three main steps to clustering bright pixels and detecting TOIs. The procedure for clustering bright pixels is shown in Fig. 2. First, the detected bright pixels are labeled. Second, a morphological operation merges nearby pixels into one region. Finally, the regions formed in this way are clustered. Together, these steps form the simple clustering and detection procedure for TOIs in the conventional CFAR algorithm.
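The three steps above can be sketched with a dilation followed by connected-component grouping. This is a minimal illustration, not the procedure of Fig. 2: the choice of dilation as the morphological operation, the square structuring element, 8-connectivity, and the function names are all assumptions.

```python
from collections import deque


def dilate(pixels, radius=1):
    """Grow each pixel into a (2*radius+1) square so nearby pixels touch."""
    return {
        (m + dm, n + dn)
        for (m, n) in pixels
        for dm in range(-radius, radius + 1)
        for dn in range(-radius, radius + 1)
    }


def cluster_bright_pixels(pixels, radius=1):
    """Group detected pixels into regions.

    Returns a list of sets, one per cluster, containing only the original
    detected pixels (not the pixels added by the dilation).
    """
    original = set(pixels)
    unvisited = dilate(original, radius)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        region = {seed}
        # Breadth-first flood fill over 8-connected neighbors.
        while queue:
            m, n = queue.popleft()
            for dm in (-1, 0, 1):
                for dn in (-1, 0, 1):
                    neighbor = (m + dm, n + dn)
                    if neighbor in unvisited:
                        unvisited.remove(neighbor)
                        region.add(neighbor)
                        queue.append(neighbor)
        clusters.append(region & original)
    return clusters
```

A larger `radius` merges pixels that are farther apart, so it acts as the knob that decides how close two bright pixels must be to belong to the same candidate object.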
In addition, CFAR-based algorithms may include many other complicated procedures to detect targets, and these may in turn require further procedures to remove clutter. The discrimination algorithm may need to consider many features to eliminate clutter, such as standard deviation, fractal dimension, and weighted fill rank ratio. A discriminator using these features decides whether a detected object is a target or clutter. The details of this implementation are beyond the scope of this paper.
III. ResNet-101 Deep Learning Network
In this section, the deep learning network is presented. Fig. 3 shows the detailed layers of the ResNet-101 network. The numbers in each rectangular box indicate the filter dimensions and the number of filters. In the first block, 7 × 7 is the height and width of a filter, 64 is the number of filters, and the stride is 2. The second block is a 3 × 3 max-pooling layer with a stride of 2. The following blocks have similar structures, and the total number of layers is 101. A detailed description is provided by He et al. [21]. This paper uses the ResNet-101 network to classify targets and clutter, which are detected by CFAR and manually pre-categorized. Here, the ResNet-101 network serves as a simple two-class classifier. A well-trained ResNet-101 network is expected to perform much better than CFAR-based algorithms with a complicated discriminator using dozens of features.
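The layer arithmetic behind this description can be checked with a short calculation. The sketch below assumes a 224 × 224 input and paddings of 3 and 1 for the first convolution and the max-pooling layer; these values come from common ResNet implementations and are not stated in this paper. The stage layout of 3, 4, 23, and 3 bottleneck blocks with three convolutions each is the standard ResNet-101 configuration.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# ASSUMPTION: 224 x 224 input, padding 3 for the 7 x 7 conv, padding 1 for pooling.
after_conv1 = conv_output_size(224, kernel=7, stride=2, padding=3)
after_pool = conv_output_size(after_conv1, kernel=3, stride=2, padding=1)

# Bottleneck blocks per stage in ResNet-101; each block has 3 conv layers.
blocks_per_stage = [3, 4, 23, 3]
# First conv + all bottleneck convs + final fully connected layer.
weighted_layers = 1 + 3 * sum(blocks_per_stage) + 1

print(after_conv1, after_pool, weighted_layers)
```

For the two-class problem considered here, the final fully connected layer would have two outputs, one for targets and one for clutter.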
IV. A Deep Learning-Based Compact Weighted Binary Classification Technique
The main purpose of DL-CWBC is to reduce the FA rate and maximize the probability of target detection. Its two unique features are control of the trade-off between the FA rate and target detection, and the use of the extreme distinction decision. The DL-CWBC procedure is described in Fig. 4 and has three major stages. In the pre-processing stage, targets and clutter are detected by the traditional CFAR algorithm. However, conventional CFAR cannot automatically distinguish between targets and clutter, so we manually separated them for use as a training set for the ResNet-101 network. Moreover, the CFAR could not detect all of the targets, so the proposed approach relies on ground truth to obtain all targets. For training, targets and clutter were collected as image chips of the two classes. The well-trained ResNet-101 performed much better than the conventional CFAR. However, the conventional loss function could not completely resolve the trade-off between detection probability and the FA rate. In the second stage, the ResNet-101 network is trained on the targets and clutter. The cross-entropy error function is used to discriminate whether a SAR image chip is a target or clutter. The conventional cross-entropy error function is given by:
where t = 1 denotes clutter and t = 0 denotes a target, and y(X,w) is the network output, which takes a value between 0 and 1. X is the input image, and w is the weight vector of the deep learning network.
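With these definitions, the conventional two-class cross-entropy takes its standard form. It is written here for N training chips; the index n and the shorthand y_n = y(X_n, w) are notational assumptions added for illustration.

```latex
E(\mathbf{w}) = -\sum_{n=1}^{N} \left[ t_n \ln y_n + (1 - t_n) \ln (1 - y_n) \right],
\qquad y_n = y(\mathbf{X}_n, \mathbf{w})
```

Under the labeling t = 1 for clutter, y_n can be read as the predicted probability of clutter, so 1 − y_n is the predicted probability of a target. Both classes contribute to the error with equal weight.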
The last stage is the testing stage with the well-trained ResNet-101 network. In this paper, we propose two unique features: a modified cross-entropy error function and an extreme distinction decision. In the traditional approach, the target and clutter are equally weighted in the error function. Training on the two classes of target and clutter largely eliminates the false alarms that occur in the traditional CFAR. The main focus of this paper, however, is not to miss any targets. Therefore, we modified the cross-entropy error function to favor the detection of targets. The modified cross-entropy error function is given by:
where 0 ≤ λ ≤ 1. λ is the weighting coefficient of the error function, which helps to reduce the FA rate while detecting almost all TOIs. Choosing λ less than 1 puts more weight on the target side. However, the weighting coefficient alone does not guarantee that every target is detected. Therefore, an extreme distinction decision is introduced for perfect target detection. Its key idea is that a chip is considered a target if its probability of being a target is greater than 0. The extreme distinction decision is described as follows:
An optimal λ could be found manually. Instead, the optimal value of λ is determined by training the ResNet-101 network over a range of λ values. The proposed algorithm, based on “simplicity,” provides rigorous control of the trade-off between the false alarm rate and the probability of target detection.
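The weighted error function and the decision rule described in this section can be sketched as follows. This is a minimal sketch under stated assumptions rather than the implementation used in this paper: it assumes that λ multiplies only the clutter term, so that λ = 1 recovers the conventional, equally weighted error function and λ < 1 shifts weight toward targets, and it takes the target probability to be 1 − y. The function names and the `floor` and `eps` parameters are hypothetical.

```python
import math


def weighted_cross_entropy(t, y, lam=1.0, eps=1e-12):
    """Lambda-weighted binary cross-entropy, summed over samples.

    t: labels (1 = clutter, 0 = target); y: predicted clutter probabilities.
    ASSUMPTION: lam scales only the clutter term; lam = 1 gives the
    conventional cross-entropy.
    """
    total = 0.0
    for t_n, y_n in zip(t, y):
        y_n = min(max(y_n, eps), 1.0 - eps)  # keep the logarithms finite
        total -= lam * t_n * math.log(y_n) + (1 - t_n) * math.log(1.0 - y_n)
    return total


def extreme_distinction_decision(p_target, floor=0.0):
    """Label a chip 'target' whenever its target probability exceeds floor.

    ASSUMPTION: floor = 0.0 mirrors the "greater than 0" rule in the text;
    a small positive floor is a hypothetical alternative.
    """
    return "target" if p_target > floor else "clutter"
```

With a conventional decision, a chip with a target probability of 0.3 would be labeled clutter, whereas the rule above labels it a target. This is what trades additional false alarms for fewer missed targets.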
V. Experiment Results
The DL-CWBC algorithm, with and without the extreme distinction decision, was developed, tested, and validated using a ResNet-101 network. First, the conventional CFAR algorithm detects targets and clutter simultaneously without discrimination. The detections are then manually classified into two classes, targets and clutter, which are used to train ResNet-101. The numbers of targets and clutter in the training and testing sets are shown in Table 1. The training set contains 3,441 targets and 79,136 clutter chips. The testing set contains 788 targets and 11,604 clutter chips.
Table 2 compares the performance of the traditional CFAR, CFAR with a discrimination algorithm, and the DL-CWBC algorithms. The CFAR detects 12,392 targets and clutter objects. Its detection probability was 99.1%. However, there are many FAs: the CFAR could not distinguish whether a detection was a target or clutter, and the detected clutter was not automatically removed. Discrimination algorithms can partially improve the FA rate, but their feature selection is not automatic. The features of clutter in SAR images must be defined properly, which may require a more complicated procedure. From the features devised by the MIT Lincoln Laboratory [12], we found an optimal set of five dominant features: standard deviation, fractal dimension, mass, rotational inertia, and maximum CFAR. With these features, the probability of target detection was 99.3%, but the probability of clutter removal was only 57.4%. The conventional CFAR with a discrimination algorithm can therefore obtain a high probability of target detection, yet it still leaves many FAs. The proposed algorithm uses a deep learning network, which can automatically select and learn various features, so the deep learning approach removes the need for complicated discrimination procedures. For comparison, the DL-CWBC with targets and clutter equally weighted (λ=1) is also shown. Without the extreme distinction decision, the probability of target detection is 97.6% and that of clutter removal is 99.8%. However, 19 targets were missed. To improve target detection, the extreme distinction decision was applied, which reduced the number of missed targets from 19 to 5. The FA removal rate decreased by 1.6 percentage points, to 98.2%. Even so, most FAs were removed when compared with the CFAR. The main focus is not to miss any target. The default value of λ is therefore not sufficient, and an optimal value needs to be found.
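The reported rates can be related to the chip counts with a short calculation. The sketch below assumes that the 788 targets and 11,604 clutter chips of the testing set are the denominators of the detection and removal rates; the function names are illustrative.

```python
TEST_TARGETS = 788      # testing set, Table 1
TEST_CLUTTER = 11_604   # testing set, Table 1


def detection_probability(missed, total=TEST_TARGETS):
    """Percentage of ground-truth targets that were detected."""
    return 100.0 * (total - missed) / total


def unremoved_clutter(removal_rate_percent, total=TEST_CLUTTER):
    """Approximate number of clutter chips left after discrimination."""
    return round(total * (1.0 - removal_rate_percent / 100.0))


# The two classes together account for every object detected by the CFAR.
print(TEST_TARGETS + TEST_CLUTTER)

# Equal weighting (lambda = 1), without and with the extreme distinction decision.
print(round(detection_probability(19), 1))
print(round(detection_probability(5), 1))
print(unremoved_clutter(99.8), unremoved_clutter(98.2))
```

Under this assumption, 19 missed targets reproduce the reported 97.6% detection probability, and the drop in clutter removal from 99.8% to 98.2% corresponds to roughly 23 versus 209 clutter chips left in place.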
Five targets are still missed. The coefficient λ can also be treated as a hyper-parameter of the ResNet-101 network. Because λ can take any value between 0 and 1, ResNet-101 was trained at λ steps of 0.05 in search of a value that detects all the targets. Table 3 shows the performance of the DL-CWBC algorithm in two specific cases, λ= 0.5 and 0.1. At λ= 0.5, the probabilities of target detection and clutter removal are not much different from the equally weighted case, and five targets are still missed even with the extreme distinction decision. At λ = 0.1 with the extreme distinction decision, however, no targets are missed, while the FA removal rate degrades from 98.2% to 94.5%. Fig. 5 shows the number of undetected targets and unremoved clutter versus the weighting coefficient of the modified cross-entropy error function.
The top of Fig. 5 plots the number of undetected targets versus the value of λ. The solid and dashed lines show the results with and without the extreme distinction decision, respectively. Without the extreme distinction decision, there are dozens of undetected targets. After applying it, the overall number of undetected targets decreases. The optimal value of λ is found to be 0.1. The bottom of Fig. 5 plots the number of unremoved clutter objects versus the value of λ, again with solid and dashed lines for the results with and without the extreme distinction decision, respectively. These results follow directly from the target detection results above. The big arrows in the figures indicate the trade-off between undetected targets and unremoved clutter: as undetected targets decrease in the top figure, unremoved clutter increases in the bottom figure.
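The search over λ and the trade-off it exposes can be sketched as a grid sweep followed by a selection rule. The rule below, which minimizes missed targets first and unremoved clutter second, is an assumption that mirrors the stated priority of not missing any target; it is not described in this form in the paper. The function names and the tuple layout are hypothetical, and the per-λ counts would come from training and testing the network at each grid value.

```python
def lambda_grid(step=0.05):
    """Candidate weights step, 2*step, ..., 1.0 (lambda = 0 is excluded)."""
    count = round(1.0 / step)
    return [round(k * step, 10) for k in range(1, count + 1)]


def select_lambda(results):
    """Pick the weight with the fewest missed targets, then the least clutter.

    results: iterable of (lam, missed_targets, unremoved_clutter) tuples.
    Ties on both counts are broken in favor of the larger lambda.
    """
    return min(results, key=lambda r: (r[1], r[2], -r[0]))[0]
```

Ordering the criteria this way means that clutter removal is only compared among the λ values that already achieve the smallest number of missed targets.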
VI. Conclusion
A DL-CWBC algorithm was developed, analyzed, and tested using a ResNet-101 deep learning network with a modified cross-entropy error function. The strength of the proposed algorithm lies in its simplicity: it does not use a complicated discriminator combined with many features of bright pixels. The key ingredient is control of the trade-off between the FA rate and target detection. Approximately 95% of the clutter among the objects detected by the CFAR was removed without any knowledge of clutter features. In addition, all the targets from the ground truth were detected with the extreme distinction decision. The proposed DL-CWBC algorithm proved to be simple and efficient for detecting targets without misses and removing clutter.