Sciresol

https://indjst.org/author-guidelines

Indian Journal of Science and Technology

0974-5645

10.17485/IJST/v14i2.2137

Enhanced segmentation network with deep learning for Biomedical waste classification

Mythili

mythilitphd@gmail.com 1

Anbarasi

Ph.D. Research Scholar, Department of Computer Science, L. R. G. Government Arts College for Women, Tiruppur, Tamilnadu, India

Assistant Professor, Department of Computer Science, L. R. G. Government Arts College for Women, Tiruppur, Tamilnadu, India

14 2

2021

Abstract

Objective: To maximize the accuracy of classifying medical wastage, an Enhanced Segmentation Network (EnSegNet) with Deep Neural Network-Trash Classification (EnSegNet-DNN-TC) is proposed in this article. Methods: Initially, a core trainable segmentation network called the SegNet framework is adopted, which uses an Encoder-Decoder Network (EDN) and a pixel-wise classification layer for image segmentation. The decoder upsamples its low-resolution input feature maps using the max-pooling indices of the corresponding encoder. SegNet also uses fewer parameters for training. The uncertainty inherent to the EDN is modeled by Bayesian functions to segment the input images. However, SegNet can sample only a limited number of pixels in the images. Hence, an EnSegNet is proposed that uses Content-Sensitive Sampling (CSS) to sample more pixels in data-sparse regions and fewer pixels in data-dense regions. Once the segmentation is completed, the DNN is applied to classify the wastage from the segmented images. Findings: The experimental results show that, for 100 images of different categories of biomedical wastes from the trash image dataset, the EnSegNet-DNN-TC framework achieves 88% accuracy, which is higher than that of the DNN-TC.

Keywords: Biomedical wastage classification; deep learning; image segmentation; ResNext; encoder-decoder network

Introduction

Biomedical wastage is normally generated by human and animal healthcare, medical training and research, biological laboratories and other facilities. Part of the wastage stream is contagious or possibly harmful and should be carefully handled to protect health and sanitation workers. Typically, biomedical wastage is regulated and controlled based on different standards and protocols in various nations. In healthcare facilities, inappropriately managed wastage has a direct health impact on the public, the environment and the healthcare personnel. Biomedical wastage is a dangerous health hazard to the community, hospitals, healthcare units, and the flora and fauna of the region. It should be accumulated in^{1, 2, 3}.

Nearly 75-90% of biomedical wastage is non-dangerous and as harmless as any other wastage. The remaining 10-25% is dangerous and may be harmful to humans, animals or the environment. The Government of India states that biomedical wastage management is a part of hospital hygiene and maintenance activities. The World Health Organization (WHO) has classified biomedical wastage into different types such as common wastage, contagious or hazardous wastage, radioactive, chemical and pathological wastage, pressurized containers and drugs. The WHO has also developed a series of training modules on better practices in biomedical wastage management, covering all aspects of wastage management from detection and classification of wastage to its secure disposal using both incineration and non-incineration policies.

In recent decades, the classification of biomedical wastage has attracted interest as a promising application of computer vision. One of the easiest ways to classify wastage or trash automatically is to use deep learning algorithms that detect and classify the wastage from images ⁴. Many Convolutional Neural Network (CNN) frameworks such as ResNext, ImageNet, VGG, ResNet, MobileNet, DenseNet and RecycleNet ⁵ are available for classifying biomedical wastage from images. Among those, ResNext was the best framework for Transfer Learning (TL) to categorize the trash.

This ResNext framework was used by Vo et al. ⁶ to design a DNN-TC framework that automatically classifies the trash in smart wastage sorter machines. At first, a trash image dataset was collected in Vietnam, comprising many images belonging to various classes: organic, inorganic and medical wastage. Then, a DNN was applied which was an enhancement of ResNext for increasing the classification accuracy. The standard ResNext-101 was modified by adding two Fully Connected (FC) layers to reduce the redundancy. In the data preprocessing step, the brightness of the input images was normalized. After that, horizontal flip and random crop methods were applied to the input images to generate more images for training and testing. During the training process, the pre-trained weights of the original ResNext-101, trained on the ImageNet dataset, were loaded. Then, fine-tuning was performed to learn the features of wastage from the trash dataset, and the model with the best accuracy on the testing dataset was chosen to classify each input image. Here, the confidence for each class was computed by the log softmax function in the last layer. Though it achieves the best accuracy, a segmentation technique is still required for preprocessing the input images and further improving the efficiency of trash or wastage classification.

Therefore, in this article, an EnSegNet-DNN-TC framework is proposed for increasing the performance of wastage classification. Initially, a core trainable segmentation network called the SegNet framework is used for preprocessing the input images. It has an EDN, which is topologically equal to the ResNext-101 architecture, and a pixel-wise classification layer. The decoder upsamples its input feature maps using the max-pooling indices of the corresponding encoder. It also uses a reduced number of parameters for training. Moreover, the uncertainty inherent to the EDN is modeled via Bayesian functions for segmenting the input images. However, it can sample only a limited number of pixels in the images. As a result, an EnSegNet is proposed that uses CSS to sample more pixels in data-sparse regions and fewer pixels in data-dense regions. Thus, this EnSegNet learns to use the sampled pixels for segmenting the image into data-sensitive superpixels. Then, the segmented image is fed to the DNN for efficiently classifying the trash.

The rest of the article is organized as follows: Section 2 reviews the research related to wastage classification. Section 3 describes the functioning of EnSegNet-DNN-TC and Section 4 presents its performance. Section 5 summarizes this research work and suggests future scope.

Literature Survey

Kennedy ⁷ proposed OscarNet, which uses TL for classifying disposable wastage. In this model, a large CNN pre-trained on ImageNet was used. The FC layers were removed and a single hidden dense layer was added for classifying the images of disposable wastage into different types. However, it was not suitable for training the features of multiple large CNNs simultaneously. Also, the decoding time was high due to the high dimensionality of the feature maps.

Chu et al. ⁸ proposed a Multilayer Hybrid deep-learning System (MHS) for automatically sorting the wastage disposed by individuals in the urban regions. First, the wastage images were acquired and fed to CNN for extracting the image features. Also, a Multi-Layer Perceptron (MLP) method was used to consolidate images and other features for classifying the wastage as recyclable or others. But, its efficiency was poor when wastage items lack distinctive image features.

Aral et al. ⁹ analyzed different deep learning models such as DenseNet, InceptionResNet, MobileNet and Xception for classifying the Trashnet dataset. Here, Adam and Adadelta were applied as the optimizers in these network structures. But, the accuracy rate was not adequate for real-time systems because of the comparatively small amount of data and the white background of the images.

Adedeji & Wang ¹⁰ proposed an intelligent wastage classification system based on ResNet. Here, a Support Vector Machine (SVM) was used in place of the FC layer and optimized with the radial basis kernel for classification. But, the accuracy was limited. Sousa et al. ¹¹ suggested a hierarchical Faster Region-based CNN (FR-CNN) for identifying and classifying the wastage in food trays. Also, a novel dataset called labeled wastage in the wild was collected and annotated for classification. However, the mean average precision was low and the complexity was high.

Xue et al. ¹² proposed a CNN for fast analysis of fertilizer by evaluating images of different fertilizing phases. Here, images of various fertilizing ingredients were gathered to construct the dataset, which was classified by the CNN. But, the training became complex as the number of network layers and parameters increased. Mazloumian et al. ¹³ recommended a DNN for classifying food wastage using preprocessing and classification. The preprocessing enhanced the images via scaling, background subtraction and Region-Of-Interest (ROI) cropping. Then, a deep CNN was employed to classify the wastage. But, the accuracy was low.

Toğaçar et al. ¹⁴ designed an auto-encoder with integrated feature selection in CNN for categorizing the wastage. First, the dataset used for the classification of wastage was reconstructed with the auto-encoder network. Then, the feature sets were extracted and fused using CNN. Also, the ridge regression was applied on the fused feature set to reduce the number of features and SVM was used for classification. But, it was not suitable for multi-class datasets.

Nowakowski & Pamula ¹⁵ proposed a new method for classifying and identifying the e-wastage. In this method, CNN was applied for classifying the types of e-wastage whereas FR-CNN was used for identifying the type and size of the wastage equipment in the images. Once the size and types of wastage were automatically classified and identified from the images, a collection plan was prepared by the e-wastage collection organizations via allocating the adequate amount of vehicles and payload capacity for a specific e-wastage project. However, complexity was high while using large-scale datasets.

A multi-level approach ¹⁶ was introduced for segmenting waste objects. First, scene-level segmentation was applied to capture the long-range spatial contexts and create a primary coarse segmentation. Then, a few possible object areas were chosen from the coarse segmentation and an object-level segmentation was performed. Afterwards, the scene-level and object-level outcomes were combined in a pixel-level FC conditional random field to generate a coherent final localization. However, it was less robust when applied to multiple datasets with large object appearance.

Proposed Methodology

In this section, the EnSegNet-DNN-TC framework is explained in detail. Generally, the SegNet framework is inspired by unsupervised feature learning structures. The core training unit is the EDN. The encoder comprises convolution with filters, pixel-wise tanh non-linearity, max-pooling and sub-sampling for obtaining the feature maps. The locations of the highest feature values in the encoder (the pooling indices) are stored and transferred to the decoder, which uses them to upsample its input. Then, the actual image is restored by convolving the upsampled maps.

3.1 Design of SegNet framework

Typically, SegNet comprises the EDN and the pixel-wise classifier. Its major parts are shown in Figure 1. It contains only convolution (conv) layers since no FC layers exist. The decoder upsamples its input using the max-pooling indices of the corresponding encoder to generate sparse feature maps. Afterwards, conv with trainable filters is performed to densify the feature maps. Finally, the resultant decoder feature maps are given to the softmax for segmenting the images in a pixel-wise manner.

The encoder involves 13 conv layers similar to those of VGG16 ¹⁷, so the training process can be initialized from weights already learned on a huge number of images. For retaining high-resolution feature maps and minimizing the number of training parameters, the FC layers are removed.

Figure 1 Architecture of SegNet framework

Each encoder has a compatible decoder so that there are 13 layers in the decoder. The resulting decoder outcome is given to the multi-class softmax classification to create separate class likelihoods for all pixels. The group of feature maps is generated by conv with the filters in the encoder.
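The pixel-wise softmax step described above can be sketched as follows. This is a minimal NumPy illustration rather than the framework's actual implementation: the function names `pixelwise_softmax` and `segment`, the `(num_classes, H, W)` layout and the toy scores are all assumptions made for the example.

```python
import numpy as np

def pixelwise_softmax(feature_maps):
    """Softmax over the class axis of a (num_classes, H, W) array (assumed layout)."""
    # Subtract the per-pixel maximum for numerical stability.
    shifted = feature_maps - feature_maps.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

def segment(feature_maps):
    """Assign every pixel the class with the highest likelihood."""
    return pixelwise_softmax(feature_maps).argmax(axis=0)

# Hypothetical decoder output: 3 classes on a 2 x 2 image.
scores = np.zeros((3, 2, 2))
scores[1, 0, 0] = 5.0   # pixel (0, 0) favours class 1
scores[2, 1, 1] = 5.0   # pixel (1, 1) favours class 2
probabilities = pixelwise_softmax(scores)
labels = segment(scores)
```

Each pixel receives its own probability vector over the classes, so the label map has the same spatial size as the decoder output.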

After that, these are batch normalized and an element-wise Rectified Linear Unit (ReLU), max(0, x), is applied. Next, max-pooling with a non-overlapping window is employed to sub-sample the feature maps. Before sub-sampling, the edge details are captured and stored to reduce the loss of spatial resolution.

If storage is not restricted, all feature maps of the encoder can be stored after sub-sampling. But, this is not practical in real-time applications. So, an efficient method is used that stores only the location of the highest feature value in every pooling window. The corresponding decoder upsamples its input feature maps using the locations stored for the respective encoder feature maps.

The decoding method of SegNet is shown in Figure 2, wherein a, b, c and d are the values in the feature map. Typically, it uses the max-pooling indices for upsampling the feature maps and then convolves them with the decoder filters.
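To make the decoding step concrete, the following NumPy sketch pools a small feature map while recording where each maximum was found, and then places the pooled values back at those positions. It is an illustration only: the helper names `max_pool_with_indices` and `max_unpool`, the 2 x 2 window and the toy values are assumptions, and the trainable decoder convolution that follows upsampling is omitted.

```python
import numpy as np

def max_pool_with_indices(fmap, k=2):
    """Non-overlapping k x k max-pooling that also records where each maximum was."""
    h, w = fmap.shape
    assert h % k == 0 and w % k == 0, "sketch assumes dimensions divisible by k"
    # Rearrange so that the last axis holds the k*k values of one pooling window.
    windows = (fmap.reshape(h // k, k, w // k, k)
                   .transpose(0, 2, 1, 3)
                   .reshape(h // k, w // k, k * k))
    indices = windows.argmax(axis=-1)   # position of the maximum inside each window
    pooled = windows.max(axis=-1)
    return pooled, indices

def max_unpool(pooled, indices, k=2):
    """Put each pooled value back at its recorded position; other entries stay zero."""
    ph, pw = pooled.shape
    windows = np.zeros((ph, pw, k * k), dtype=pooled.dtype)
    rows, cols = np.indices((ph, pw))
    windows[rows, cols, indices] = pooled
    return (windows.reshape(ph, pw, k, k)
                   .transpose(0, 2, 1, 3)
                   .reshape(ph * k, pw * k))

# Toy 4 x 4 encoder feature map (hypothetical values).
fmap = np.array([[1, 2, 0, 1],
                 [3, 0, 5, 2],
                 [0, 1, 2, 2],
                 [4, 0, 1, 7]])
pooled, indices = max_pool_with_indices(fmap)
restored = max_unpool(pooled, indices)
```

The restored map is sparse: only one entry per window is non-zero, which is why the decoder convolves it with filters afterwards to obtain dense feature maps.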

Figure 2 Design of SegNet decoder

In this method, sparse feature maps are generated and convolved with the decoder filters to produce dense feature maps. After that, batch normalization is applied to every map. Here, the decoder corresponding to the first encoder generates a multi-channel feature map, whereas each remaining decoder generates feature maps with the same dimensions and number of channels as its encoder. The output of the final decoder is given to the softmax, which classifies every pixel separately according to its class likelihoods. However, it can sample only a limited number of pixels in the images. As a result, an EnSegNet is proposed that uses CSS to sample more pixels in data-sparse regions and fewer pixels in data-dense regions.

3.2 EnSegNet framework using CSS

A measure of content-sensitiveness, ConSen, is introduced for producing content-sensitive superpixels. It states that the size of a superpixel must be responsive to the variation of the data within it. So, the ConSen of a superpixel S is measured as the ratio of the color deviation in S to its size.

$$\mathrm{ConSen}(S) = \frac{\sum_{i=1}^{K} M(p_i)}{|S|} \tag{1}$$

In Eq. (1), $K$ stands for the number of pixels in $S$, $p_i$ denotes the $i$th pixel in $S$ and $M(p_i)$ denotes the color deviation of $p_i$, which is determined in the horizontal and vertical directions. Here, $S$ is a set of grouped homogeneous pixels in an image. For both directions, 2 positive and 2 negative elements are considered. Consider the pixel $P(x_0, y_0)$ whose color is $c(x_0, y_0)$; the window around $P$ has dimension $(2s+1) \times (2s+1)$. Assume $H_1^-$ and $H_2^-$ are the negative elements in the horizontal direction of $P$, and $H_1^+$ and $H_2^+$ are the positive elements in the horizontal direction of $P$.

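$$H_1^- = \frac{1}{s(s+1)} \sum_{y=y_0-s}^{y_0+1} \sum_{x=x_0-s}^{x_0+1} c(x, y) \tag{2}$$

$$H_1^+ = \frac{1}{s(s+1)} \sum_{y=y_0-s}^{y_0+1} \sum_{x=x_0-1}^{x_0+s} c(x, y) \tag{3}$$

$$H_2^- = \frac{1}{s(s+1)} \sum_{y=y_0-1}^{y_0+s} \sum_{x=x_0-s}^{x_0+1} c(x, y) \tag{4}$$

$$H_2^+ = \frac{1}{s(s+1)} \sum_{y=y_0-1}^{y_0+s} \sum_{x=x_0-1}^{x_0+s} c(x, y) \tag{5}$$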

Here, $c(x, y)$ is the color of the pixel at position $(x, y)$. Likewise, the color deviation in the vertical direction is obtained by denoting $V_1^-$, $V_1^+$, $V_2^-$ and $V_2^+$, accordingly. After that, the color deviation of $P$ is defined as:

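$$M(P) = \Delta H_1^2 + \Delta H_2^2 + \Delta V_1^2 + \Delta V_2^2 \tag{6}$$

where

$$\Delta H_1 = H_1^+ - H_1^- \tag{7}$$

$$\Delta H_2 = H_2^+ - H_2^- \tag{8}$$

$$\Delta V_1 = V_1^+ - V_1^- \tag{9}$$

$$\Delta V_2 = V_2^+ - V_2^- \tag{10}$$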
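The color deviation of Eqs. (2)-(10) and the ConSen measure of Eq. (1) can be sketched in NumPy as shown below. This is a hedged illustration under several assumptions that are not stated in the text: the image is grayscale, border pixels are replicated, the window limits and the $1/(s(s+1))$ factor are used exactly as printed, and, because the vertical terms are only described as analogous, they are approximated here by comparing the same four windows in the vertical direction. The function names are hypothetical.

```python
import numpy as np

def window_sums(img, s):
    """Normalized window sums H1-, H1+, H2-, H2+ of Eqs. (2)-(5) for every pixel."""
    h, w = img.shape
    pad = np.pad(img.astype(float), s, mode="edge")  # ASSUMPTION: replicate border pixels
    neg, pos = (-s, 1), (-1, s)  # inclusive offset ranges as printed in Eqs. (2)-(5)

    def window(y_rng, x_rng):
        total = np.zeros((h, w))
        for dy in range(y_rng[0], y_rng[1] + 1):
            for dx in range(x_rng[0], x_rng[1] + 1):
                total += pad[s + dy : s + dy + h, s + dx : s + dx + w]
        return total / (s * (s + 1))

    return window(neg, neg), window(neg, pos), window(pos, neg), window(pos, pos)

def color_deviation(img, s=2):
    """Per-pixel color deviation M(P) of Eq. (6) for a grayscale image."""
    h1m, h1p, h2m, h2p = window_sums(img, s)
    dh1, dh2 = h1p - h1m, h2p - h2m      # Eqs. (7)-(8)
    # ASSUMPTION: the vertical terms reuse the same four windows, compared along y.
    dv1, dv2 = h2m - h1m, h2p - h1p      # stand-ins for Eqs. (9)-(10)
    return dh1**2 + dh2**2 + dv1**2 + dv2**2

def con_sen(deviation, mask):
    """ConSen(S) of Eq. (1): summed deviation of the pixels in S divided by |S|."""
    return deviation[mask].sum() / mask.sum()

# Toy image: left half dark, right half bright, so only the boundary has deviation.
toy = np.zeros((10, 10))
toy[:, 5:] = 1.0
M = color_deviation(toy, s=2)
flat_region = np.zeros((10, 10), dtype=bool)
flat_region[:, :2] = True   # a hypothetical superpixel far from the boundary
```

On this toy image the deviation is zero inside the uniform regions and positive next to the intensity step, so a superpixel lying entirely in a uniform region has a ConSen of zero.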

To use the density matching property of SegNet-DNN-TC, the CSS is proposed for generating the content-sensitive superpixels. It produces large clusters with many components as the number of training images increases and smaller clusters with few components when fewer images are used. Thus, the major aim of CSS is to sample more pixels in data-sparse areas and fewer pixels in data-dense areas. So, the likelihood of sampling a pixel $p$ is defined as:

$$\mathcal{L}(p) = 1 - \frac{M(p)}{\mathrm{Max}(I)} \tag{11}$$

In Eq. (11), $\mathrm{Max}(I)$ is the highest color deviation over all pixels of the image $I$ and $M(p)$ is the color deviation of $p$. A larger $\mathcal{L}(p)$ signifies a pixel in a data-sparse area, which needs to be sampled.
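A minimal sketch of Eq. (11) and of likelihood-driven sampling is given below. The text does not specify how pixels are drawn from the likelihoods, so drawing without replacement with probability proportional to the likelihood is an assumption of this example, as are the function names, the fixed seed and the toy deviation values.

```python
import numpy as np

def sampling_likelihood(deviation):
    """Eq. (11): likelihood of sampling each pixel from its color deviation M(p)."""
    deviation = np.asarray(deviation, dtype=float)
    max_dev = deviation.max()
    if max_dev == 0:                      # uniform image: every pixel is equally likely
        return np.ones_like(deviation)
    return 1.0 - deviation / max_dev

def sample_pixels(deviation, n, seed=0):
    """Draw n distinct pixel coordinates with probability proportional to Eq. (11).

    ASSUMPTION: proportional sampling without replacement; the paper does not
    describe the exact sampling procedure.
    """
    likelihood = sampling_likelihood(deviation).ravel()
    total = likelihood.sum()
    if total > 0:
        prob = likelihood / total
    else:
        prob = np.full(likelihood.size, 1.0 / likelihood.size)
    rng = np.random.default_rng(seed)
    idx = rng.choice(likelihood.size, size=n, replace=False, p=prob)
    return np.column_stack(np.unravel_index(idx, np.shape(deviation)))

# Hypothetical 2 x 2 deviation map.
M = np.array([[0.0, 1.0],
              [2.0, 4.0]])
likelihood = sampling_likelihood(M)
coords = sample_pixels(M, n=3)
```

In this toy case the pixel with the highest deviation has a likelihood of zero and is never drawn, while the pixel with zero deviation has the highest likelihood.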

Algorithm for EnSegNet-DNN-TC framework
Input: Image set
Output: Classified biomedical wastages classes
	Initialize;
	for( each input image )
		Perform the encoder of EnSegNet;
		Execute the decoder of EnSegNet;
		for( each pixel in image )
			Calculate the color variation of each pixel using Eq. (6);
			Compute the likelihood of each pixel being sampled via Eq. (11);
			Sample pixels in the data-sparse regions;
		end for
		Train the EnSegNet in end-to-end manner;
		Obtain the segmented images;
		Apply DNN classifier;
		Find the category of biomedical wastages;
	end for
	End

So, each input image of the training dataset is segmented and fed to the DNN ⁵ for classifying the biomedical wastage efficiently. Figure 3 depicts the overall flow diagram of the EnSegNet-DNN-TC framework.

Figure 3 Flow diagram of EnSegNet-DNN-TC framework

Experimental Results

In this section, the effectiveness of EnSegNet-DNN-TC is analyzed and compared with the DNN-TC framework by using MATLAB 2017b. In this experiment, a trash image dataset is collected which consists of 200 images of different categories of biomedical wastage: infectious waste, chemical waste, sharp waste, pharmaceutical waste and pathological waste. Infectious wastes include blood-soaked bandages, discarded surgical gloves and masks, cultures, stocks or swabs.

The chemical wastes are various types of chemicals used in the production of biologicals, cleansing, etc. Sharp wastes are needles, syringes, scalpels, blades, glass and so on, which may cause punctures and cuts; their treatment involves autoclaving or microwaving and mutilation or shredding. Similarly, pharmaceutical wastes can come from the site of spills, half-used bottles and IV equipment with residual medicine on it. The pathological wastes include the materials removed from the body in surgery and the fluids as well as solids removed in autopsies, except teeth. From this dataset, 100 images are taken for training and the remaining 100 are used for testing. The comparison is carried out based on precision, recall, f-measure, accuracy, error rate and Root Mean Squared Error (RMSE). Figure 4 portrays samples of the considered trash image dataset.

Figure 4 Example samples of considered trash image dataset

Figure 5 portrays the experimental results of segmented images with their respective input images for the EnSegNet-DNN-TC framework.

Figure 5 Samples of original images and segmented images for EnSegNet-DNN-TC framework

4.1 Precision

It is the fraction of correctly classified biomedical wastage among all items assigned to a class, computed from the True Positive (TP) and False Positive (FP) counts.

$$\mathrm{Precision} = \frac{\text{No. of correctly classified medical wastes}}{\text{No. of correctly classified medical wastes} + \text{No. of wrongly classified medical wastes}} \tag{12}$$

Figure 6 Comparison of precision

Figure 6 depicts the precision of the EnSegNet-DNN-TC and DNN-TC frameworks for different numbers of images. This analysis indicates that the precision of EnSegNet-DNN-TC for 100 images is 3.05% higher than that of the DNN-TC. Thus, it is concluded that the EnSegNet-DNN-TC can increase the precision of classifying biomedical wastes as the number of images in the dataset increases.

4.2 Recall

It is the fraction of biomedical wastes that are correctly classified, computed from the TP and False Negative (FN) counts.

$$\mathrm{Recall} = \frac{\text{No. of correctly classified medical wastes}}{\text{No. of correctly classified medical wastes} + \text{No. of wrongly classified non\_medical wastes}} \tag{13}$$

Figure 7 Comparison of recall

In Figure 7, the recall of the EnSegNet-DNN-TC and DNN-TC frameworks for varying numbers of images is depicted. This analysis shows that the recall of EnSegNet-DNN-TC for 100 images is 5.01% higher than that of the DNN-TC. So, it is concluded that the recall of EnSegNet-DNN-TC increases as the number of input images increases.

4.3 F-measure

It is computed as the harmonic mean of precision and recall.

$$\text{F-measure} = 2 \times \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{14}$$

Figure 8 Comparison of F-measure

Figure 8 portrays the f-measure of the EnSegNet-DNN-TC and DNN-TC frameworks for different numbers of images. This analysis indicates that the f-measure of EnSegNet-DNN-TC for 100 images is 3.92% higher than that of the DNN-TC. Thus, it is concluded that the EnSegNet-DNN-TC can increase the f-measure for classifying the biomedical wastage efficiently.

4.4 Accuracy

It is the fraction of accurate classification of medical wastage over the total number of trials performed.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{15}$$

Here, TN denotes the True Negative count.

In Figure 9, the accuracy (%) of the EnSegNet-DNN-TC and DNN-TC frameworks for varied numbers of images is portrayed. This analysis shows that the accuracy of EnSegNet-DNN-TC for 100 images is 4.76% higher than that of the DNN-TC. So, it is concluded that the EnSegNet-DNN-TC can maximize the accuracy of biomedical waste classification as the number of images increases.

Figure 9 Comparison of accuracy

4.5 Error rate

It is calculated as:

$$\text{Error rate} = \frac{FP + FN}{TP + TN + FP + FN} \tag{16}$$
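The measures of Eqs. (12)-(16) can all be derived from the four confusion counts, as in the short sketch below. The function name `classification_metrics` and the counts in the example are hypothetical and are not results from this experiment.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure, accuracy and error rate from confusion counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)                                   # Eq. (12)
    recall = tp / (tp + fn)                                      # Eq. (13)
    f_measure = 2 * precision * recall / (precision + recall)    # Eq. (14)
    accuracy = (tp + tn) / total                                 # Eq. (15)
    error_rate = (fp + fn) / total                               # Eq. (16)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "error_rate": error_rate,
    }

# Hypothetical counts for 100 test images (illustration only).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
```

Accuracy and error rate always sum to one, so reporting both is a consistency check rather than independent evidence.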

Figure 10 Comparison of error rate

In Figure 10, the error rate of the EnSegNet-DNN-TC and DNN-TC frameworks for varying numbers of images is shown. This analysis indicates that the error rate of EnSegNet-DNN-TC for 100 images is 24.22% lower than that of the DNN-TC. Thus, it is observed that the EnSegNet-DNN-TC can minimize the error rate of classifying the biomedical wastes as the number of images increases.

4.6 RMSE

It is also a measure of the accuracy of segmentation. It is computed by taking the square root of MSE value as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i} \sum_{j} \left(S_{ij} - I_{ij}\right)^2} \tag{17}$$

In Eq. (17), $N$ is the total number of images, $S$ is the segmented image, $I$ is the actual image and $i, j$ index the pixels in the images.
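Eq. (17) can be sketched in NumPy as follows. The sketch assumes that the segmented and actual images are stacked into arrays of shape `(N, H, W)` and that the squared differences are summed over all pixels of all images before dividing by $N$; the function name `rmse` and the toy arrays are illustrative assumptions.

```python
import numpy as np

def rmse(segmented, actual):
    """Eq. (17): root of the summed squared pixel differences divided by N images."""
    segmented = np.asarray(segmented, dtype=float)
    actual = np.asarray(actual, dtype=float)
    n_images = segmented.shape[0]            # N, the number of images in the stack
    squared_error = (segmented - actual) ** 2
    return float(np.sqrt(squared_error.sum() / n_images))

# Two hypothetical 2 x 2 image pairs that differ by 1 at every pixel.
example_rmse = rmse(np.zeros((2, 2, 2)), np.ones((2, 2, 2)))
```

Identical stacks give an RMSE of zero, and the value grows with the number and size of mismatching pixels.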

Figure 11 Comparison of RMSE

Figure 11 depicts the RMSE of the EnSegNet-DNN-TC and DNN-TC frameworks for different numbers of images. This analysis shows that the RMSE of EnSegNet-DNN-TC for 100 images is 7.43% lower than that of the DNN-TC. So, it is concluded that the EnSegNet-DNN-TC can reduce the RMSE of classifying the biomedical wastage when the number of images is high.

Conclusion

In this article, an EnSegNet-DNN-TC framework is proposed to increase the accuracy of wastage classification. At first, a SegNet is designed in which the EDN uses max-pooling indices to upsample the input feature maps. Its uncertainty in segmenting the images is also measured via Bayesian operators. However, it samples only a limited number of pixels. So, an EnSegNet is developed which applies CSS to sample more pixels in data-sparse regions and fewer pixels in data-dense regions. Once the segmentation is completed, the DNN is applied for classifying the wastage. To conclude, the experimental outcomes show that the EnSegNet-DNN-TC achieves 83.76% mean accuracy and a mean error rate of 0.138, outperforming the DNN-TC. Though it extracts features sufficiently, subtle variances between different images and its high complexity lead to misjudgments. Hence, future work includes the fusion of deep features and texture features to prevent the misjudgments of EnSegNet-DNN-TC on images with complex backgrounds.

References

1. Ilyas Sadia, Srivastava Rajiv Ranjan, Kim Hyunjung. Disinfection technology and strategies for COVID-19 hospital and bio-medical waste management. Science of The Total Environment. 2020;749. ISSN 0048-9697. Elsevier BV. https://dx.doi.org/10.1016/j.scitotenv.2020.141652

2. Datta Priya, Mohi Gursimran, Chander Jagdish. Biomedical waste management in India: Critical appraisal. Journal of Laboratory Physicians. 2018;10(01):006-014. ISSN 0974-2727, 0974-7826. Georg Thieme Verlag KG. https://dx.doi.org/10.4103/jlp.jlp_89_17

3. Chauhan, Singh. A hybrid multi-criteria decision making method approach for selecting a sustainable location of healthcare waste disposal facility. Journal of Cleaner Production. 2016;139:1001-1010. https://doi.org/10.1016/j.jclepro.2016.08.098

4. Achuthan, Madangopal V A. A bio medical waste identification and classification algorithm using MLTRP and RVM. Iranian Journal of Public Health. 2016;45(10):1276.

5. Bircanoğlu, Atay, Beşer, Genç, Kızrak M A. Recyclenet: Intelligent waste sorting using deep neural networks. IEEE Innovations in Intelligent Systems and Applications. 2018:1-7. https://doi.org/10.1109/INISTA.2018.8466276

6. A H, M T. A novel framework for trash classification using deep transfer learning. IEEE Access. 2019;7:178631-178639.

7. Kennedy. OscarNet: using transfer learning to classify disposable waste. CS230 Report: Deep Learning. Stanford University. 2018.

8. Chu, Huang, Xie, Tan, Kamal, Xiong. Multilayer hybrid deep-learning method for waste classification and recycling. Computational Intelligence and Neuroscience. 2018. https://doi.org/10.1155/2018/5060857

9. Aral R A, Keskin Ş R, Kaya, Hacıömeroğlu. Classification of trashnet dataset based on deep learning models. IEEE International Conference on Big Data. 2018:2058-2062. https://doi.org/10.1109/BigData.2018.8622212

10. Adedeji Olugboja, Wang Zenghui. Intelligent Waste Classification System Using Deep Learning Convolutional Neural Network. Procedia Manufacturing. 2019;35:607-612. ISSN 2351-9789. Elsevier BV. https://dx.doi.org/10.1016/j.promfg.2019.05.086

11. Sousa, Rebelo, Cardoso J S. Automation of waste sorting with deep learning. IEEE XV Workshop de Visão Computacional. 2019:43-48. https://doi.org/10.1109/WVC.2019.8876924

12. Xue, Wei, Mei, Chen. A fast and easy method for predicting agricultural waste compost maturity by image-based deep learning. Bioresource Technology. 2019;290. https://doi.org/10.1016/j.biortech.2019.121761

13. Mazloumian, Rosenthal, Gelke. Deep learning for classifying food waste. 2020.

14. Toğaçar Mesut, Ergen Burhan, Cömert Zafer. Waste classification using AutoEncoder network with integrated feature selection method in convolutional neural network models. Measurement. 2020;153. ISSN 0263-2241. Elsevier BV. https://dx.doi.org/10.1016/j.measurement.2019.107459

15. Nowakowski Piotr, Pamuła Teresa. Application of deep learning object classifier to improve e-waste collection planning. Waste Management. 2020;109:1-9. ISSN 0956-053X. Elsevier BV. https://dx.doi.org/10.1016/j.wasman.2020.04.041

16. Wang Tao, Cai Yuanzheng, Liang Lingyu, Dongyi. A Multi-Level Approach to Waste Object Segmentation. Sensors. 2020;20(14):3816. ISSN 1424-8220. MDPI AG. https://dx.doi.org/10.3390/s20143816

17. Rezende, Ruppert, Carvalho, Theophilo, Ramos, Geus P De. Malicious software classification using VGG16 deep neural network’s bottleneck features. Information Technology-New Generations. Springer, Cham. 2018:51-59. https://doi.org/10.1007/978-3-319-77028-4_9