# 1st place solution in MICCAI 2020 TN-SCUI challenge
This is the source code of the 1st place solution for the segmentation task (IoU 82.54%) in the 2020 TN-SCUI challenge.
[[Challenge leaderboard](https://tn-scui2020.grand-challenge.org/evaluation/leaderboard/)]
## Pipeline of our solution
We use a simple cascaded framework for segmenting nodules; it can be easily extended to other single-target segmentation tasks.
Our pipeline is shown in the figure below.
![Pipeline of segmentation training and testing](https://github.com/WAMAWAMA/TNSCUI2020-Seg-Rank1st/blob/master/pic/%E5%88%86%E5%89%B2%E8%AE%AD%E7%BB%83%E6%B5%8B%E8%AF%95%E8%BF%87%E7%A8%8B.svg)
<details>
<summary>Click here to view more details of the method</summary>
## Method
### Data preprocessing
Due to different acquisition protocols, some thyroid ultrasound images contain irrelevant regions (as shown in the first figure). We first remove these regions, which may introduce redundant features, with a threshold-based approach. Specifically, we average the original image (pixel values from 0 to 255) along the x and y axes separately, then remove the rows and columns whose mean value is less than 5. The processed images are then resized to 256×256 pixels as the input of the first segmentation network.
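The threshold-based cropping described above can be sketched as follows. This is a minimal NumPy illustration rather than the repository's code: the function name `crop_dark_borders` is hypothetical, and colour images are assumed to be stored as (H, W, C) arrays that are averaged over channels before thresholding. Resizing to 256×256 is left to whichever image library is in use.

```python
import numpy as np

def crop_dark_borders(image, threshold=5):
    """Drop rows and columns whose mean intensity is below `threshold`.

    `image` is a 2-D grayscale array or a 3-D (H, W, C) array with values in 0..255.
    """
    # Assumption: colour images are reduced to one channel by averaging.
    gray = image if image.ndim == 2 else image.mean(axis=2)
    keep_rows = gray.mean(axis=1) >= threshold  # mean of each row
    keep_cols = gray.mean(axis=0) >= threshold  # mean of each column
    return image[keep_rows][:, keep_cols]
```

Rows and columns are evaluated on the original image, so removing a dark row does not change which columns are kept.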
### Cascaded segmentation framework
We train two networks that share the same encoder-decoder structure, both with the Dice loss function. In practice, we choose **`DeeplabV3+ with efficientnet-B6 encoder`** for both the first and the second network. The first segmentation network (stage Ⅰ of the cascade) is trained to provide a rough localization of nodules, and the second segmentation network (stage Ⅱ of the cascade) is trained for fine segmentation based on that rough localization. Our preliminary experiments show that the context information provided by the first network may not play a significant auxiliary role in the refinement performed by the second network. Therefore, we train the second network only on images within the region of interest (ROI) obtained from the ground truth (GT). (The input data is the only difference between the training processes of the two networks.)
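For reference, the Dice loss named above is commonly computed as one minus the soft Dice coefficient between the predicted probability map and the GT mask. The sketch below is a NumPy illustration under assumptions: the function name and the smoothing term `eps` are illustrative and not taken from the source, and the actual training code may use a different formulation.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Return 1 - Dice coefficient for a probability map and a binary mask."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = (pred * target).sum()
    denominator = pred.sum() + target.sum()
    # Assumption: eps avoids division by zero when both inputs are empty.
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)
```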
![Comparison of large and small nodules](https://github.com/WAMAWAMA/TNSCUI2020-Seg-Rank1st/blob/master/pic/%E5%A4%A7%E5%B0%8F%E7%BB%93%E8%8A%82%E5%AF%B9%E6%AF%94.svg)
When training the second network, we expand the nodule ROI obtained from the GT, then crop the image within the expanded ROI and resize it to 512×512 pixels as input to the second network. We observe that, in most cases, a large nodule has a clear boundary, while the gray value of a small nodule is quite different from that of the surrounding normal thyroid tissue (as shown in the figure above). Therefore, background information (the tissue around the nodule) is important for segmenting small nodules. As shown in the figure below, in the preprocessed 256×256 image, the minimum enclosing square of the nodule ROI is obtained first; the external expansion m is then set to 20 if the edge length n of the square is greater than 80, and to 30 otherwise.
![Preprocessing for segmentation](https://github.com/WAMAWAMA/TNSCUI2020-Seg-Rank1st/blob/master/pic/%E5%88%86%E5%89%B2%E9%A2%84%E5%A4%84%E7%90%86%E8%BF%87%E7%A8%8B.svg)
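The ROI expansion rule can be sketched as below. This is an illustration under stated assumptions, not the repository's implementation: the function name `expanded_square_roi` is hypothetical, the square is assumed to be centred on the nodule's bounding box, the expansion m is assumed to apply to every side, and the result is clipped to the image borders. Cropping the returned region and resizing it to 512×512 is left to the image library in use.

```python
import numpy as np

def expanded_square_roi(mask, edge_threshold=80, margin_large=20, margin_small=30):
    """Return (y0, y1, x0, x1) of the expanded square ROI around a binary mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("mask contains no nodule pixels")
    top, bottom = int(rows[0]), int(rows[-1]) + 1
    left, right = int(cols[0]), int(cols[-1]) + 1

    # Minimum enclosing square: edge length n is the longer side of the bounding box.
    n = max(bottom - top, right - left)
    # Expansion m: 20 if n > 80, otherwise 30 (values from the text).
    m = margin_large if n > edge_threshold else margin_small

    # Assumption: the square is centred on the bounding box and grown by m on every side.
    y0 = top - (n - (bottom - top)) // 2 - m
    x0 = left - (n - (right - left)) // 2 - m
    y1, x1 = y0 + n + 2 * m, x0 + n + 2 * m

    # Assumption: the ROI is clipped to the image, so it may not stay square at the borders.
    height, width = mask.shape
    return max(y0, 0), min(y1, height), max(x0, 0), min(x1, width)
```

With this rule a small nodule receives proportionally more surrounding tissue than a large one, which matches the observation that background context matters most for small nodules.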
### Data augmentation and test time augmentation
In both tasks, the following methods are used for data augmentation: 1) horizontal flipping, 2) vertical flipping, 3) random cropping, 4) random affine transformation, 5) random scaling, 6) random translation, 7) random rotation, and 8) random shearing transformation. In addition, one of the following methods is randomly selected for additional augmentation: 1) sharpening, 2) local distortion, 3) adjustment of contrast, 4) blurring (Gaussian, mean, median), 5) addition of Gaussian noise, and 6) erasing.
Test time augmentation (TTA) generally improves the generalization ability of the segmentation model. In our framework, TTA for the segmentation task includes vertical flipping, horizontal flipping, and rotation by 180 degrees.
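A minimal sketch of this kind of TTA is shown below. Several details are assumptions rather than facts from the source: `predict` is a hypothetical callable that maps an image array to a probability map of the same height and width, the unmodified image is included alongside the three transforms, and the four predictions are combined by simple averaging.

```python
import numpy as np

def predict_with_tta(predict, image):
    """Average predictions over identity, vertical flip, horizontal flip, and 180° rotation."""
    # Each transform is its own inverse, so the same function maps the output back.
    transforms = [
        lambda x: x,               # identity (assumed to be part of the average)
        np.flipud,                 # vertical flip
        np.fliplr,                 # horizontal flip
        lambda x: np.rot90(x, 2),  # rotation by 180 degrees
    ]
    outputs = [t(predict(t(image))) for t in transforms]
    return np.mean(outputs, axis=0)
```

Mapping each prediction back to the original orientation before averaging keeps every pixel aligned with its counterpart in the other predictions.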
### Cross validation with a size and category balance strategy
5-fold cross validation is used to evaluate the performance of our method. In our opinion, it is necessary to keep the size and category distributions of nodules similar in the training and validation sets. In practice, the size of a nodule is its number of pixels after the preprocessed image is resized to 256×256 pixels. We stratify size into three grades: 1) less than 1722 pixels, 2) between 1722 and 5666 pixels, and 3) greater than 5666 pixels. The two thresholds, 1722 and 5666 pixels, are close to the tertiles, and the size stratification is statistically significantly associated with the benign and malignant categories according to the chi-square test (p<0.01). We divide the images in each size grade into 5 folds and then merge the corresponding folds across grades into a single fold. This strategy ensures that the final 5 folds have similar size and category distributions.
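The size-stratified split can be sketched in plain Python as follows. This is an illustration under assumptions, not the repository's code: the function names are hypothetical, nodules of exactly 1722 or 5666 pixels are assigned to the middle grade (the text does not specify the boundaries), and folds are filled round-robin after a seeded shuffle within each grade.

```python
import random
from collections import defaultdict

def size_grade(n_pixels, low=1722, high=5666):
    """Return 0, 1, or 2 for small, medium, or large nodules."""
    if n_pixels < low:
        return 0
    if n_pixels <= high:  # assumption: boundary values fall in the middle grade
        return 1
    return 2

def assign_folds(nodule_sizes, n_folds=5, seed=0):
    """Map each sample id to a fold index, spreading every size grade evenly."""
    by_grade = defaultdict(list)
    for sample_id, n_pixels in nodule_sizes.items():
        by_grade[size_grade(n_pixels)].append(sample_id)

    rng = random.Random(seed)
    folds = {}
    for grade in sorted(by_grade):
        ids = sorted(by_grade[grade])  # sort first so the shuffle is reproducible
        rng.shuffle(ids)
        # Round-robin assignment: fold k collects its share from every grade.
        for position, sample_id in enumerate(ids):
            folds[sample_id] = position % n_folds
    return folds
```

Here `nodule_sizes` is a dictionary from image id to nodule pixel count at 256×256 resolution. The sketch stratifies by size grade only, as the text describes; the category balance then relies on the reported association between size grade and category.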
</details>
**In summary, our solution consists of:**
- preprocessing to remove irrelevant regions
- using a cascaded framework
- 5-fold cross-validation (CV) strategy with a balanced nodule size and category distribution
- using test time augmentation (TTA)
- model ensembling: since we train the two networks separately in 5-fold CV, we combine each first network with each second network as a pair, which gives 25 pairs (and 25 inference results). We use [`step4_Merge.py`](https://github.com/WAMAWAMA/TN_SCUI_test/blob/master/step2to4_train_validate_inference/step4_Merge.py) to merge the 25 inference results into a final ensemble result by pixel-wise voting
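Pixel-wise voting over the 25 inference results can be sketched as below. This is a NumPy illustration and not the contents of `step4_Merge.py`: the function name is hypothetical, and a strict majority (more than half of the masks) is assumed as the voting rule.

```python
import numpy as np

def majority_vote(masks):
    """Merge binary masks of equal shape by pixel-wise strict majority."""
    # Treat any non-zero pixel as foreground so 0/1 and 0/255 masks both work.
    stack = np.stack([np.asarray(m) > 0 for m in masks])
    votes = stack.sum(axis=0)
    # Assumption: a pixel is foreground when more than half of the masks agree.
    return (votes * 2 > len(masks)).astype(np.uint8)
```

With an odd number of masks, such as 25, a strict majority never produces ties.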
## Segmentation results on 2020 TN-SCUI training dataset and DDTI dataset
We test our method on the 2020 TN-SCUI training dataset (3644 images or nodules, malignant 2003 : benign 1641). The segmentation results of 5-fold CV based on "DeeplabV3+ with efficientnet-B6 encoder" are as follows:
|fold|Stage Ⅰ|TTA at stage Ⅰ|Stage Ⅱ|TTA at stage Ⅱ|DSC|IoU (%)|
|-------------|:-:|:-:|:-:|:-:|:--:|:--:|
| 1 |√ | | | |0.8699|79.00|
| 1 |√ |√ | | |0.8775|80.01|
| 1 |√ | |√ | |0.8814|80.75|
| 1 |√ |√ |√ | |0.8841|81.05|
| 1 |√ | |√ |√ |0.8840|81.16|
| 1 |√ |√ |√ |√ |0.8864|81.44|
| 2 |√ |√ |√ |√ |0.8900|81.99|
| 3 |√ |√ |√ |√ |0.8827|81.07|
| 4 |√ |√ |√ |√ |0.8803|80.56|
| 5 |√ |√ |√ |√ |0.8917|82.07|
<details>
<summary>Click here to view complete TNSCUI segmentation results </summary>
|fold|Stage Ⅰ|TTA at stage Ⅰ|Stage Ⅱ|TTA at stage Ⅱ|DSC|IoU (%)|
|-------------|:-:|:-:|:-:|:-:|:--:|:--:|
| 1 |√ | | | |0.8699|79.00|
| 1 |√ |√ | | |0.8775|80.01|
| 1 |√ | |√ | |0.8814|80.75|
| 1 |√ |√ |√ | |0.8841|81.05|
| 1 |√ | |√ |√ |0.8840|81.16|
| 1 |√ |√ |√ |√ |0.8864|81.44|
| 2 |√ | | | |0.8780|80.16|
| 2 |√ |√ | | |0.8825|80.80|
| 2 |√ | |√ | |0.8872|81.52|
| 2 |√ |√ |√ | |0.8873|81.56|
| 2 |√ | |√ |√ |0.8894|81.91|
| 2 |√ |√ |√ |√ |0.8900|81.99|
| 3 |√ | | | |0.8612|78.22|
| 3 |√ |√ | | |0.8744|79.77|
| 3 |√ | |√ | |0.8710|79.59|
| 3 |√ |√ |√ | |0.8808|80.66|
| 3 |√ | |√ |√ |0.8753|80.30|
| 3 |√ |√ |√ |√ |0.8827|81.07|
| 4 |√ | | | |0.8664|78.53|
| 4 |√ |√ | | |0.8742|79.44|
| 4 |√ | |√ | |0.8742|79.80|
| 4 |√ |√ |√ | |0.8777|80.12|
| 4 |√ | |√ |√ |0.8771|80.27|
| 4 |√ |√ |√ |√ |0.8803|80.56|
| 5 |√ | | | |0.8820|80.44|
| 5 |√ |√ | | |0.8874|81.22|
| 5 |√ | |√ | |0.8869|81.38|
| 5 |√ |�