Classification of COVID‑19 patients from HRCT score prediction in CT images using transfer learning approach



Introduction
The COVID-19 pandemic, which spread in 2019 and is caused by SARS-CoV-2, posed a major threat to human life, health, and productivity across the globe. Ultraviolet (UV) radiation and heat cause health issues; similarly, the COVID-19 virus damages the health of other members of the family. Subsequently, COVID-19 instances are discovered and classified by computer-aided programs using artificial intelligence (AI) and deep learning (DL). To identify and classify COVID-19 cases based on known symptoms including

Tembhurne, Journal of Electrical Systems and Inf Technol (2024) 11:4

• A high proportion of false negatives is being observed in RT-PCR nasal tests and Rapid Antigen tests.
• The results of the RT-PCR test require 48-72 h; in the meantime, the patient may infect his/her family and surroundings.
• Existing techniques determine whether a person is infected with COVID-19 or not, but fail to identify the severity of the infection using a standard procedure.
• Calculating the HRCT severity score can help the patient take the necessary actions, as it gives some prognostic predictions.
In this paper, we utilize transfer learning for the classification of COVID-19 patients with five different models, namely ResNet [1], VGG16 [2], VGG19 [3], Inception [4] and Xception [5], which offer better accuracies. Similar models have been proposed but achieved limited performance on completely new or randomized datasets. In [6], the proposed model reported an accuracy of 87% on the dataset under experimentation, but achieved a very poor accuracy of 56% on CT scan images. So, we propose a voting-based classification among the five models to predict the result. Moreover, the proposed model shows better results compared to the individual models.
It is important to classify a patient as COVID-19 positive or negative, but there is a greater need to find the severity of the infection. This helps the patient take the necessary medication and preventive measures to get cured or to stop the infection from worsening, which might otherwise have future consequences. So, HRCT is the best method for reaching such a prognostic conclusion for COVID-19. Lung CT image segmentation is a prerequisite for accurate lung CT image analysis, both for classification and for severity prediction of COVID-19. In addition to classifying the CT scan as a COVID-19 positive or negative image, we also propose a method to calculate the HRCT score from the CT image by using a U-Net model [7], which is generally employed for biomedical image segmentation. We automate the process of finding the HRCT score by finding the area of lung infection using image segmentation. To achieve this, we built two models similar to U-Net and report a Dice coefficient of 0.83 for both lung mask and infection mask segmentation. This research is structured as follows: The "Related works" section reviews related research in the field of CT scans for COVID-19 identification. The "Methods and materials" section describes the methodology applied in this work. The results and discussions are presented in the "Results and discussions" section, and the "Conclusion and future scope" section presents the conclusion of the proposed work.

Related works
In [8], the authors experimented with various deep CNN baseline models such as VGG16, InceptionV3, ResNet50, DenseNet121 and DenseNet201 and fused all the models into a single prediction node with sigmoid activation. The main advantage of decision fusion is that the mistakes of individual models are handled by combining the individual predictions via a majority voting approach, which improves the overall efficiency of the baseline models. The authors claim that decision fusion gives good results compared to each individual model. Ahmad Chaddad et al. [9] proposed a transfer learning-based deep model, a powerful strategy that enables deep neural networks to be trained quickly and effectively with a limited amount of labeled data. The basic idea of the strategy is to use a network pretrained on a large available dataset and then use the features of this network as a representation for learning a new task without retraining from scratch. The transferred features can be directly utilized as an input to the new model or adapted to the new task. Six CNN models, namely AlexNet, DenseNet, GoogleNet, NasNet-mobile, DarkNet and ResNet18, are tested on a dataset of 746 CT images, and extra features such as ground-glass opacities (GGO), consolidations, and pleural effusions are then added. In [10], InfNet, a new COVID-19 lung CT infection segmentation network, is proposed wherein implicit reverse attention and explicit edge attention are applied to improve infected region detection. In addition, semi-InfNet, a semi-supervised approach, is developed to address the scarcity of high-quality labeled data. Extensive testing on the COVID-SemiSeg dataset and an actual CT dataset shows that the InfNet and semi-InfNet segmentation models outperform cutting-edge segmentation methods and advance the state of the art. A Dice coefficient of 0.739 is reported for the proposed task.
In [11], the authors worked on two different tasks, i.e., joint classification and joint segmentation. The diagnosis system provides explainable diagnosis results for medical staff fighting against COVID-19. The COVID-CS dataset, containing 3,855 fine-grained pixel-level labeled CT images from 200 COVID-19 patients, 64,771 patient-level annotated CT images from 200 other COVID-19 patients, and 75,541 CT images of 350 uninfected cases, was selected for model training. The proposed system includes an explainable classification branch for detecting COVID-19 opacifications and a segmentation branch for locating opacification zones. The result is obtained after combining the two branches, and a Dice coefficient of 0.785 is reported.
Subsequently, a survey on image classification of COVID-19 is presented in [23], in which the authors discuss the challenges in classification, advancements in deep models and future opportunities to improve the results of COVID-19 classification. In [24], a deep CNN along with the grey wolf optimizer is applied to chest X-ray images to predict COVID-19. A tri-stage chest X-ray image is utilized to identify COVID-19 across different classes of models, i.e., 2-class, 3-class, and 4-class. Sufian et al. [25] designed a three-channel grayscale CT image representation consisting of contrast limited adaptive histogram equalization (CLAHE) and histogram equalization (HE). These different representations are evaluated on different pretrained deep models such as VGG16, MobileNet, ResNet50, etc. The results obtained by these models are significant on CT images. In [26], deep learning- and machine learning-based models are evaluated for automated detection of COVID-19 cases from chest X-ray images. DenseNet169 [27] is applied to extract the features, and classification is performed by extreme gradient boosting (XGBoost) [28]. The results are compared with existing methods, and the authors claim faster and accurate prediction of COVID-19 cases with acceptable performance.

Methods and materials
One of the main reasons for undertaking this work is the drawbacks of the RT-PCR and Rapid Antigen tests. The RT-PCR test does not give the severity of the infection in a COVID-affected patient. A high proportion of false negatives is being observed in RT-PCR nasal tests and Rapid Antigen tests. It is also suspected that mutant variants of the virus may easily evade RT-PCR tests. It takes 48-72 h to obtain the result of an RT-PCR test, whereas a CT scan takes only 5-10 min. By the time the RT-PCR result arrives, the patient might have infected several other people in his/her family and surroundings. To overcome this issue, we propose deep learning techniques that classify a person as COVID positive or negative and also identify the severity of the infection. Some existing methods work well and report good results on a particular dataset, but fail to show similar results on a completely new, different, and random dataset.

Dataset
The dataset utilized for training the five transfer learning models for COVID classification is prepared from [12]. The COVID-CT dataset consists of 349 CT images with clinical findings of COVID-19 from 216 patients. In addition to these positive CT scans, the dataset also contains COVID-negative CT scans, which are collected from the different sources listed in [13-16]. The other dataset [17] contains 20 CT scans (with 301 slices for each image) of patients diagnosed with COVID-19, as well as segmentations of lungs and infections made by experts. Figure 1 shows sample images of a CT scan, lung mask and infection mask. To balance the dataset for training and testing/validation, we applied data augmentation, including rotations, flips, etc.
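The flips and rotations mentioned above can be illustrated with a minimal, dependency-free sketch. The function names (`hflip`, `vflip`, `rot90`, `augment`) and the choice of a 90° rotation are assumptions made for illustration; the text does not state which angles or libraries were actually used.

```python
def hflip(img):
    """Mirror a 2-D image (list of rows) left-to-right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror a 2-D image top-to-bottom."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate a 2-D image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img):
    """Return the original image plus three transformed copies.

    Assumption: one horizontal flip, one vertical flip and one 90-degree
    rotation per image; the actual augmentation policy is not specified.
    """
    return [img, hflip(img), vflip(img), rot90(img)]
```

Applying `augment` to every image of the smaller class is one simple way to move the class counts closer together before training.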

Methodology for classification
This work is broadly divided into two phases: (1) classification of CT scan images into COVID-19 positive or negative images and (2) computation of the HRCT score of the CT scan to find the severity of infection in the person. The overall architecture of the proposed system is shown in Fig. 2. The input image is supplied to five pretrained CNN models, a voting mechanism is then applied to decide whether the image is COVID positive or negative, and finally the HRCT score is calculated for the COVID-positive image.
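The voting step can be sketched as a simple majority vote over the labels predicted by the individual models. This is a minimal illustration under the assumption of hard (label-level) voting; the helper names `majority_vote` and `ensemble_predict` are hypothetical, and the text does not say whether hard or soft voting was used.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by most models for one image."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(per_model_labels):
    """Combine predictions from several models.

    per_model_labels holds one list of labels per model, all in the same
    sample order. With five models and two classes a tie cannot occur.
    """
    return [majority_vote(votes) for votes in zip(*per_model_labels)]
```

For example, if three of the five models label an image as positive (1) and two as negative (0), the ensemble reports the image as positive.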

Preprocessing
Preprocessing is a prevalent practice in training DL models. Preprocessing techniques can be effective for reducing undesired noise, highlighting parts of the picture that can aid in recognition, and even assisting with the training phase. The images in the dataset are of different sizes, so they need to be resized to a fixed size before being supplied to the DL models for training. The images are therefore resized to the input dimension of each deep model, e.g., 224 × 224.

Training for classification
We trained five models on the dataset using the transfer learning approach. The DL architectures VGG16, Inception, Xception, VGG19 and ResNet50 are utilized for classification of the COVID-19 patients. All the models classify the CT scan images into two classes, i.e., they solve a binary classification problem.
We resized the images to 224 × 224 and supplied an input tensor of shape (224, 224, 3) to the pretrained models, where 3 is the number of channels. Further, we flattened the features by adding a flatten layer, and added a dropout layer to mitigate overfitting. Finally, a dense layer with SoftMax as the activation function is added for classification.
A batch size of 32 is set for all the models at training time. Categorical cross-entropy is used as the loss function, and the Adam optimizer is employed to minimize it. Further, the dataset is divided in the ratio of 4:1 into training and validation sets, and the models are trained for 150 epochs.
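The classification head described above could be assembled in Keras roughly as follows. This is a sketch under stated assumptions, not the authors' code: it uses VGG16 as the example backbone, assumes ImageNet weights, a frozen convolutional base and a dropout rate of 0.5 (none of which are given in the text), and imports TensorFlow lazily inside the hypothetical `build_classifier` function.

```python
def build_classifier(num_classes=2, dropout_rate=0.5):
    """Build a VGG16-based two-class classifier (illustrative sketch).

    TensorFlow is imported inside the function so that merely defining it
    does not require the library to be installed.
    """
    import tensorflow as tf
    from tensorflow.keras import layers

    inputs = tf.keras.Input(shape=(224, 224, 3))
    base = tf.keras.applications.VGG16(
        weights="imagenet", include_top=False, input_tensor=inputs
    )
    base.trainable = False  # assumption: pretrained base kept frozen

    x = layers.Flatten()(base.output)
    x = layers.Dropout(dropout_rate)(x)  # assumption: rate not stated in the text
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Training settings quoted from the text: batch size 32, 150 epochs,
# and a 4:1 train/validation split (validation fraction 0.2).
TRAIN_CONFIG = {"batch_size": 32, "epochs": 150, "validation_split": 0.2}
```

With one-hot encoded labels, training would then be a call such as `build_classifier().fit(images, labels, **TRAIN_CONFIG)`; the other four backbones can be swapped in through the corresponding `tf.keras.applications` constructors.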

Preprocessing for segmentation
Preprocessing the images is very important for achieving suitable results, especially when building models for medical image segmentation. We applied CLAHE to the CT scan images to improve their contrast. Cropping is performed to remove unwanted parts/noise in the image; the cropped images are then resized to 100 × 100.

U-net architecture
The U-Net architecture was created for analyzing and segmenting medical images [7]. The architecture of the U-Net model is presented in Fig. 3. The network's design is divided into two parts: contracting and expansive. The contracting path consists of several blocks of convolutions with filters of size 3 × 3 and unit strides in both directions, each followed by a ReLU layer. This path extracts the input's essential features and returns a feature vector of a certain length. The second path extracts data from the contracting path by copying and cropping, as well as from the feature vector using up-convolutions, and forms an output segmentation map via a series of operations. The operation that connects the first and second paths is a crucial part of this architecture. This connection enables the network to obtain highly accurate data from the contracting path, resulting in a segmentation mask that is as close to the intended output as feasible.

Fig. 2 Proposed system for COVID-19 detection

Training for segmentation
The U-Net [7] model is employed for the task of segmentation. We chose 80% of the data for training and reserved 20% for testing. A batch size of 128 is set for training, and the Adam optimizer is utilized for optimization. The model is trained on the training dataset for 200 epochs. We built two separate models: (1) one for lung mask segmentation and (2) one for infection mask segmentation.
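A reduced U-Net of the kind described above can be sketched in Keras as follows. Several details are assumptions made for illustration only: two pooling levels, 16 base filters, single-channel 100 × 100 inputs, "same" padding (the original U-Net uses unpadded convolutions with cropping), and binary cross-entropy as the loss. The function name `build_unet` is hypothetical, and TensorFlow is imported lazily.

```python
def build_unet(input_shape=(100, 100, 1), base_filters=16):
    """Build a small two-level U-Net for binary mask prediction (sketch)."""
    import tensorflow as tf
    from tensorflow.keras import layers

    def conv_block(x, filters):
        # Two 3x3 convolutions with unit stride, each followed by ReLU.
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    inputs = tf.keras.Input(shape=input_shape)

    # Contracting path: 100 -> 50 -> 25 pixels per side.
    c1 = conv_block(inputs, base_filters)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, base_filters * 2)
    p2 = layers.MaxPooling2D(2)(c2)

    bottleneck = conv_block(p2, base_filters * 4)

    # Expansive path: up-convolutions plus skip connections from c2 and c1.
    u2 = layers.Conv2DTranspose(base_filters * 2, 2, strides=2, padding="same")(bottleneck)
    c3 = conv_block(layers.Concatenate()([u2, c2]), base_filters * 2)
    u1 = layers.Conv2DTranspose(base_filters, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.Concatenate()([u1, c1]), base_filters)

    # One sigmoid output channel: per-pixel probability of the mask.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy")  # loss is an assumption
    return model


# Training settings quoted from the text.
SEG_TRAIN_CONFIG = {"batch_size": 128, "epochs": 200}
```

Two instances of such a model, one trained on lung masks and one on infection masks, would mirror the two-model setup described in the text.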

ANOVA (analysis of variance)
Issues may arise while creating a large dataset that affect the performance of the developed model. Thus, a consistent and distinctive feature selection methodology needs to be investigated [37]. Feature selection contributes to visualizing the data and helps in understanding storage requirements, reducing training time and measurements, and handling the dimensionality issue to improve the performance of the model [38]. The ANOVA [39] statistical method compares different independent means and ranks the features by computing variance ratios [40]. The ANOVA test is performed to examine the statistical dissimilarity for the evaluated models. The ratio signifies how strongly the δth feature is correlated with the group parameters. Equation (1) represents the ratio R for the δth g-gap for two instances:

$$R(\delta) = \frac{s_B^2(\delta)}{s_W^2(\delta)} \tag{1}$$

where $s_B^2(\delta)$ and $s_W^2(\delta)$ denote the sample variance among the groups and within the groups, respectively. They can be obtained as follows:

$$s_B^2(\delta) = \frac{1}{M-1}\sum_{i=1}^{M} n_i\left(\frac{1}{n_i}\sum_{j=1}^{n_i} f_{ij}(\delta) - \frac{1}{N}\sum_{i=1}^{M}\sum_{j=1}^{n_i} f_{ij}(\delta)\right)^2 \tag{2}$$

$$s_W^2(\delta) = \frac{1}{N-M}\sum_{i=1}^{M}\sum_{j=1}^{n_i}\left(f_{ij}(\delta) - \frac{1}{n_i}\sum_{k=1}^{n_i} f_{ik}(\delta)\right)^2 \tag{3}$$

Here, M represents the number of groups, N denotes the total number of samples, $f_{ij}(\delta)$ is the frequency of the δth feature in the jth sample of the ith group, and $n_i$ is the number of samples in the ith group.
The CT images are supplied to the network to extract features from the proposed deep models, which yields 2400 features. It is observed that, with few samples and many features, a DL model overfits and its performance degrades. So, feature selection using the ANOVA test is employed to improve classifier performance and reduce classification time. To achieve better classification, efficient features are selected in the range of 70 to 700 features.
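The ANOVA-based ranking described above can be sketched in plain Python using the standard one-way ANOVA ratio of between-group to within-group variance. The function names `anova_ratio` and `select_top_features` and the data layout are assumptions for illustration; the exact implementation used in the work is not given.

```python
def anova_ratio(groups):
    """One-way ANOVA variance ratio for a single feature.

    groups[i][j] is the value of the feature for sample j of group i.
    Assumes at least two groups and a non-zero within-group variance.
    """
    m = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Between-group variance: spread of group means around the grand mean.
    between = sum(
        len(g) * (mu - grand_mean) ** 2 for g, mu in zip(groups, means)
    ) / (m - 1)
    # Within-group variance: spread of samples around their own group mean.
    within = sum(
        (v - mu) ** 2 for g, mu in zip(groups, means) for v in g
    ) / (n - m)
    return between / within


def select_top_features(features_by_group, k):
    """Return the indices of the k features with the largest ratio.

    features_by_group[f] is the `groups` structure for feature f.
    """
    ratios = [anova_ratio(groups) for groups in features_by_group]
    ranked = sorted(range(len(ratios)), key=lambda f: ratios[f], reverse=True)
    return ranked[:k]
```

A feature whose values differ strongly between the COVID-positive and COVID-negative groups, relative to its spread inside each group, receives a high ratio and is kept.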

Results and discussion
The experimental setup is similar for the classification phase and the segmentation phase. The training is performed on the GPU runtime environment in Google Colab [18]. We selected 80% of the dataset for training and 20% for testing in both phases. The time recorded to run each epoch in the classification phase is approximately 11 s, 500 ms for VGG16 and VGG19, and approximately 9 s for the other models. In the segmentation phase, the GPU time recorded for each epoch of both lung mask segmentation and infection mask segmentation is approximately 6 s, 300 ms. For performance analysis, accuracy, precision, recall and F1-score are chosen as evaluation metrics. After extracting features from the proposed model, 87 features are selected by ANOVA from 2400 features, reducing features by 94%. Table 1 presents the comparison of the five models based on various performance metrics. Moreover, the accuracy vs. epochs and loss vs. epochs results are presented in Figs. 4, 5, 6, 7, 8. From Table 1, we can observe that the VGG19 model shows the best accuracy of 84% among all the models and ResNet50 shows the lowest accuracy. Moreover, VGG16 shows the highest precision of 89% and ResNet50 shows the lowest precision of 68%. Further, the Xception model achieved better recall and F1-score in comparison with the other models. Overall, VGG19 shows better results compared to the other models.
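For reference, the four evaluation metrics named above can be computed from predicted and true labels as in the following minimal sketch. The function name `classification_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Precision penalizes false positives and recall penalizes false negatives, which is why two models with similar accuracy can differ noticeably on these two metrics.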
Subsequently, we utilized the Dice coefficient as the evaluation metric for the segmentation task. The Dice coefficient is a basic and effective summary measure of spatial overlap that can be used to investigate the repeatability and accuracy of image segmentation. The Dice coefficient is calculated as twice the area of overlap divided by the total number of pixels in both images. Table 2 shows the comparison with the state of the art, wherein the proposed model reports a higher Dice score than [10, 11, 19] and a matching Dice score compared with [20]. Moreover, the Dice score improvement reported by the proposed model over [10] is 0.091, over [11] is 0.045, over [19] is 0.026, over [33] is 0.01, over [34] is 0.09, over [35] is 0.026, and over [36] is 0.04.
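As a minimal sketch of the overlap measures used in segmentation evaluation, the Dice coefficient and the intersection over union (IoU) of two binary masks can be computed as below. The function name `dice_and_iou`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions for illustration; "total number of pixels" is taken to mean the foreground pixels of both masks.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equally sized binary masks.

    pred and truth are nested lists of 0/1 values.
    """
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]

    overlap = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    union = total - overlap

    dice = 2 * overlap / total if total else 1.0
    iou = overlap / union if union else 1.0
    return dice, iou
```

The two measures are linked by Dice = 2·IoU / (1 + IoU), so ranking models by one gives the same order as ranking them by the other.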

HRCT score calculation
Most researchers carried their work only as far as segmenting the CT image into a lung mask and an infection mask. Some works targeted only the segmentation of infection masks from the CT images, without further analysis to derive meaningful insights from the segmented masks. In this work, in addition to classifying the CT image and segmenting it, we also propose a method to compute the high-resolution CT (HRCT) score of the infected person. HRCT of the chest is also known as HRCT chest or HRCT of the lungs; it is a CT technique that uses thin-slice chest images and a high-spatial-frequency reconstruction algorithm to post-process them. This method produces images with exceptional lung detail, which are useful for diagnosing diffuse interstitial lung disease. In the case of diffuse interstitial lung disease, HRCT is particularly effective for determining the extent of the disease, monitoring treatment response, and selecting biopsy sites. The HRCT score usually ranges from 0 to 25. Figures 9 and 10 show the division of the lung mask into lobes and the division of the infection mask into lobes.
The formula for computing the area of infection is given in Eq. (4). The HRCT score is calculated by finding the area of infection in each of the five lobes of the lung, as shown in Figs. 9 and 10, and assigning a score according to the area of damage, as shown in Table 3. Moreover, Table 4 presents the accuracy comparison of the proposed model with the state of the art. Here, we can see that the accuracy of the proposed model (83.5%) is better than that of the other methods. We also observe an improvement in accuracy ranging from 0.48 to 24.38% over [6, 21, 22, 25, 26, 29-32].
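The per-lobe scoring can be sketched as follows. Because Table 3 is not reproduced here, the thresholds below are an assumption: they follow a commonly used chest CT severity scale (0: no involvement, 1: under 5%, 2: 5-25%, 3: 26-49%, 4: 50-75%, 5: over 75%), which gives the 0 to 25 range stated above. The infected fraction of a lobe is assumed to be the infected area divided by the lobe area, and the function names `lobe_score` and `hrct_score` are hypothetical.

```python
def lobe_score(infected_area, lobe_area):
    """Score one lobe from 0 to 5 based on its infected percentage.

    Thresholds are an assumed, commonly used severity scale, not
    necessarily the ones in Table 3 of the paper.
    """
    if lobe_area <= 0:
        raise ValueError("lobe_area must be positive")
    pct = 100.0 * infected_area / lobe_area
    if pct == 0:
        return 0
    if pct < 5:
        return 1
    if pct <= 25:
        return 2
    if pct < 50:
        return 3
    if pct <= 75:
        return 4
    return 5


def hrct_score(lobes):
    """Sum the lobe scores of the five lung lobes (total 0 to 25).

    lobes is a list of five (infected_area, lobe_area) pairs, e.g. pixel
    counts taken from the infection mask and the lung mask of each lobe.
    """
    if len(lobes) != 5:
        raise ValueError("expected exactly five lobes")
    return sum(lobe_score(infected, area) for infected, area in lobes)
```

In practice, the pixel counts per lobe would come from the two U-Net masks after dividing them into lobes as in Figs. 9 and 10.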

Table 2 Comparison with existing papers

| References | Objective | Dice coefficient |
|---|---|---|
| [10] | Lung infection detection using deep network from CT images | 0.739 |
| [11] | Segmentation and classification from CT images | 0.785 |
| [19] | Lung infection detection using deep network from CT images (limited data) | 0.804 |
| [20] | Segmentation of CT image using supervised network | 0.83 |
| [33] | Lung metastases from CT images | 0.82 |
| [34] | Segmentation from CT images | 0.74 |
| [35] | Lung segmentation from CT images | 0.804 |
| [36] | Lung segmentation from CT images | 0.79 |
| Proposed work | Classification of COVID-19 using HRCT score using CT images | 0.83 |

Conclusion and future scope
In this paper, we developed five models for classifying a CT scan image as COVID-19 positive or negative using a transfer learning approach. We observed that VGG19 shows the best performance among the models. These models have been tested on completely new and randomized datasets. We proposed a voting system among the models to achieve generalization. The proposed method reported better results on the new dataset, with an accuracy of 83.5%. We also proposed a method to find the HRCT score of the CT scan to determine the severity of the infection in the person, which helps in prognosis. For this, we employed the U-Net architecture to segment the lung and infection masks. We achieved a Dice coefficient of 0.83 for lung mask and infection mask segmentation, which is higher than that of other works. Further, the accuracy of classification can be improved by combining more diverse datasets, which will be explored in the future.

Fig. 1 CT scan image for segmentation dataset

Fig. 3 U-Net architecture

The Dice coefficient and intersection over union (IoU) are similar and positively correlated. The IoU ranges on a scale of 0 to 1, where 1 indicates the most similarity between the expected output and the ground truth, and 0 indicates otherwise. After training and testing the model for infection mask segmentation, we achieved a Dice score of 0.88 for the training set and 0.83 for the testing set. Moreover, for lung mask

Fig. 4 Accuracy and loss of Inceptionv3 model

$$\text{Area of lung infection} = \frac{X}{Y} \quad \forall\ \text{lobes} \tag{4}$$

Table 1 Performance analysis