Design of Shallow Neural Network Based Plant Disease Detection System

Abstract: In this work, we propose the use of a shallow neural network for plant disease detection. The study focuses on four major diseases that are known to attack some of the most cultivated crops globally: Bacterial Blight, Anthracnose, Cercospora leaf spot and Alternaria Alternata. In developing the disease detection model, the K-means algorithm was used for plant segmentation, while the color co-occurrence method was used for feature analysis. A shallow neural network trained on 145 samples was used as the classifier. Detection accuracies of 98.34%, 98.48%, 98.03% and 98.14% were recorded for Bacterial Blight, Anthracnose, Cercospora leaf spot and Alternaria Alternata, respectively. The overall detection accuracy of the model is 98.25%.


I. INTRODUCTION
Food security is an important problem in most nations of the world. Estimates show that up to 40% of the annual global production of the five most cultivated crops (wheat, rice, maize, potato and soybean) is lost to diseases and pests [1]. Further data in [1] show that about 10 to 28% of wheat, 25 to 41% of rice, 20 to 41% of maize, 8 to 21% of potato and 11 to 32% of soybean cultivated globally is lost to pests and pathogens. In the US alone, about $21 billion is lost annually to crop pathogens [2]. In Africa and most developing regions of the world, the impact is even more severe because the economies are smaller. For instance, in Uganda between 1992 and 1997, a pandemic-scale cassava blight disease led to an annual loss of over $60 million [3]. The effect of this loss was not only economic: it also brought severe famine, since cassava is a major staple food in Uganda. In addition, nearly 500 local cassava genotypes required extensive protection as they were pushed to near extinction by the outbreak. Tomato (Lycopersicon esculentum) production in Nigeria had a similar experience between 2016 and 2017, when Tuta Absoluta, popularly known as the tomato leaf miner, broke out in Northern Nigeria, leading to the loss of over 80% of all tomato plantations and a rise of more than 400% in the cost of tomatoes in the country [4]. Another source of concern is the rapid rate at which plant diseases tend to spread. The spreading rate of the Uganda outbreak was estimated at 20 to 50 kilometers per year, which initiated similar infections in neighboring countries such as Sudan, Kenya, Tanzania and Congo and caused yield losses of up to 95% in some of these countries [3].
Detecting plant disease is no small feat. Most farmers, particularly in sub-Saharan Africa, rely largely on the "eye observation approach". This is, however, a very unreliable way of detecting disease, as the risk of wrong diagnosis remains very high even among trained and experienced plant pathologists. A wrong diagnosis often leads to the wrong treatment, which in turn leads to wasted time, loss of crops and ultimately famine. Agriculture is important globally and even more so in Africa, where a large percentage of the continent's young population are farmers. In fact, the authors in [5] reported that over 60% of the entire population of sub-Saharan Africa are farmers and that farming accounts for roughly 23% of the region's GDP. Globally, the study carried out in [6] shows that about 2 billion people, roughly 26.7% of the total world population, derive their daily livelihood from agriculture.
The vast number of people involved in agriculture, and the far-reaching effect that a colossal loss of agricultural produce and revenue to pests and diseases would have, make this an important topic of study. Tackling the problem begins with identifying the plant disease accurately and in a timely manner, which is the central aim of this study. To achieve this, we developed an artificial neural network based plant disease detection system that is able not only to detect certain diseases but also to provide early warning to the farmer. We focus on the four major diseases affecting the top six most cultivated crops in Africa (rice, maize, wheat, cassava, potato and soybeans). The diseases studied in this research are Bacterial Blight, Anthracnose, Cercospora leaf spot and Alternaria Alternata.
In this work, we specifically demonstrate the capability of a shallow neural network trained with limited data. To develop a system capable of recognizing any of the four plant diseases being studied from only a small amount of data, we first obtained samples of both diseased and healthy plants. The samples were labelled accordingly and processed. The K-means algorithm was then used for segmentation, while the color co-occurrence method was used to characterize the features of the plant. A total of thirteen features were extracted and used to train a classifier. For this purpose, an artificial neural network was used. The network was trained on a total of 145 samples and the results are reported accordingly.

II. RELATED WORKS
Plant disease detection systems have been the focus of several research efforts, since plant disease is an important global problem. The authors in [7], [8] have established that the traditional method of relying on eye observation to detect plant diseases is ineffective and unreliable. A scientific approach was taken in [9] to detect diseases in the leaves and stems of plants. The authors used the k-means clustering algorithm for segmentation and extracted the features of each segment with the Color Co-occurrence Method (CCM), using Spatial Gray-level Dependence Matrices (SGDM). A neural network was trained with the backpropagation algorithm, and an overall classification accuracy of 93% was reported. An improvement on the processing speed of the work in [9] was proposed in [10]. By adding a step that masks the green pixels after segmentation, the authors recorded a 20% improvement in processing speed. The work presented in [11] studied the use of an edge detection segmentation technique to detect and evaluate leaf spot disease in cotton. Classification based on the R, G, B color features of the segmented area was carried out with a neural network classifier. It must be noted that the developed system is restricted to leaf spot disease in cotton. Neural networks were also used in [12] for the classification of diseases in grape leaves. Texture features of the unhealthy regions of plant leaves were used in the disease detection and classification algorithm developed in [13]. Both the minimum distance criterion and a support vector machine were used as classifiers, and a detection accuracy of 94% was reported for training done over a database of about 500 plant leaves. A similar study on detecting unhealthy regions of plants was carried out in [14], except that the authors used a genetic algorithm to group the unlabeled pixel points into clusters. By combining the minimum distance criterion with the genetic algorithm, the authors achieved a system accuracy of 96.63%. In [15], an algorithm was developed to detect diseases in fruits using sum and difference histograms obtained from fruit images.
In the majority of the studies listed, a shallow neural network was used rather than a deep neural network. The implication is that the algorithms tend to perform well only on a single plant or on related plants. For instance, the study carried out in [16] is restricted to pepper, while [17] is exclusive to pomegranate.
More complex plant disease identification and classification algorithms have generally had to resort to deep neural networks, particularly the convolutional neural network (CNN). These networks are able to make more complex generalizations but require extensive training data. The authors in [8] developed a deep learning model for plant disease detection. A convolutional neural network trained on a database of 87,848 images was used, allowing a disease detection rate as high as 99.53%. Similarly, the soybean plant disease identification system proposed in [18] required 12,673 leaf images to achieve a classification accuracy of 99.32%. A much lower success rate of 87% was reported in [19], where a training set of 3663 images of apple and tomato leaves was used with the CNN model. It is, however, important to note that several studies on plant disease detection rely on carefully processed image sets available in open databases. These images are often idealized and do not reflect real farm conditions, where other materials such as soil and insects are likely to appear in the images. The authors in [19] noted this fact and reported that 37.3% of all images used in their CNN training were captured in cultivation conditions.
In this study, however, we propose the use of a shallow neural network trained on a small data set obtained exclusively in a cultivation environment. In addition, we restricted our scope to the four key diseases that are known to affect the top five cultivated crops, particularly in sub-Saharan Africa and parts of East Asia.

III. DESIGN METHODOLOGY
The procedure adopted for the realization of the objectives of this research is highlighted in Fig. 2. The processes adopted include image acquisition, image enhancement, image segmentation, feature analysis and image classification.

A. Image Acquisition and Enhancement
Image acquisition for this study was done using a digital camera. Unlike most studies in image processing, which rely extensively on carefully prepared training data from large databases, the images used to train the model developed in this study were taken on the farm. The images were then cropped as appropriate. Enhancement was also carried out to increase the image contrast. Finally, the images were converted to grayscale in preparation for image segmentation and for computing the gray-level co-occurrence matrix. The color conversion was accomplished using (1).
$$\mathrm{Gray}(R, G, B) = 0.2989\,R + 0.5870\,G + 0.1140\,B \tag{1}$$
where R, G and B are the red, green and blue intensities of a pixel. Thereafter, histogram equalization is applied to the image set to further emphasize the plant diseases: the cumulative distribution function of the image intensities is used to redistribute the intensity values more evenly across the available range.
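As an illustration of this enhancement step, the following is a minimal NumPy sketch of a weighted grayscale conversion followed by histogram equalization. The function names, the 256-level assumption and the rounding choices are ours for illustration and are not taken from the original implementation.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Weighted sum of the R, G, B channels (weights as in the grayscale formula)."""
    weights = np.array([0.2989, 0.5870, 0.1140])
    return rgb[..., :3].astype(np.float64) @ weights

def equalize_histogram(gray, levels=256):
    """Remap intensities through the normalized cumulative distribution.

    Assumes an 8-bit style image (values in [0, levels - 1], levels <= 256).
    """
    img = np.clip(np.rint(gray), 0, levels - 1).astype(np.int64)
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist)
    cdf_min = cdf[np.nonzero(cdf)[0][0]]   # CDF value at the darkest level present
    denom = cdf[-1] - cdf_min
    if denom == 0:                         # constant image: nothing to equalize
        return img.astype(np.uint8)
    lut = np.rint((cdf - cdf_min) / denom * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(np.uint8)
    return lut[img]                        # look up the new value of every pixel
```

A typical call would be `equalize_histogram(rgb_to_gray(image))` for an `image` array of shape (height, width, 3).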

B. Image Segmentation
In segmentation, the image is partitioned into parts that share similar features. For this study, K-means was used for image segmentation. The pixels of each image were assigned to one of four clusters, similar to the work carried out in [10]. To accomplish this, the centers of the K clusters were first picked at random. Second, each pixel in the image was assigned to the cluster whose center was nearest to it. Third, each cluster center was recomputed as the average of all the pixels in the cluster. The second and third steps were repeated until convergence.
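The three steps above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name, the iteration cap, the fixed random seed and the handling of empty clusters are our assumptions, not details reported in this study.

```python
import numpy as np

def kmeans_segment(pixels, k=4, max_iter=100, seed=0):
    """Cluster pixel feature vectors of shape (n_pixels, n_features) into k groups."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=np.float64)
    # Step 1: pick k distinct pixels at random as the initial centers.
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    labels = np.zeros(len(pixels), dtype=np.int64)
    for _ in range(max_iter):
        # Step 2: assign each pixel to its nearest center (Euclidean distance).
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each center as the mean of its assigned pixels.
        new_centers = centers.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:       # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        # Stop once the centers no longer move (convergence).
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

For an image array of shape (height, width, channels), the pixels can be passed as `image.reshape(-1, channels)` and the returned labels reshaped back to (height, width) to obtain the segment map.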

C. Feature Extraction and Analysis
Since the accuracy of image classification depends largely on the choice of feature set, this work uses the Color Co-occurrence Method (CCM). In this method, both the color and texture features of the image are used to parametrize it. Spatial gray-level dependence (SGLD) matrices were used to model the relationships between individual pixels within each region of the image cluster in terms of distance d and angle θ. A more elaborate description of this process is given in [20]. For each SGLD matrix, 13 components were extracted as in [21], including contrast, correlation, variance, inverse difference moment, sum average, sum variance, sum entropy, difference variance, difference entropy, information measure of correlation I, information measure of correlation II and maximal correlation coefficient.
For the color analysis, the RGB image was first converted into the L*a*b* color space. The K-means algorithm was then used to group the pixels according to their L*a*b* components by iteratively using the Euclidean distance. Thereafter, we created the gray-level co-occurrence matrix and extracted the features for labelling.
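To make the role of d and θ concrete, the sketch below builds a normalized, symmetric co-occurrence matrix for one offset and computes three of the listed descriptors. It is a simplified illustration under our own assumptions: the image is taken to be already quantized to integer gray levels in [0, levels), only the four usual angles are supported, and the function names are hypothetical.

```python
import numpy as np

def glcm(gray, d=1, theta_deg=0, levels=8):
    """Normalized, symmetric co-occurrence matrix for the offset (d, theta)."""
    img = np.asarray(gray, dtype=np.int64)
    # (row, column) step for each supported angle; an assumed set of directions.
    offsets = {0: (0, d), 45: (-d, d), 90: (-d, 0), 135: (-d, -d)}
    dr, dc = offsets[theta_deg]
    rows, cols = img.shape
    p = np.zeros((levels, levels), dtype=np.float64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                p[img[r, c], img[r2, c2]] += 1
    p = p + p.T                       # count each pixel pair in both directions
    total = p.sum()
    return p / total if total > 0 else p

def texture_features(p):
    """Contrast, correlation and inverse difference moment of a normalized matrix."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    contrast = (((i - j) ** 2) * p).sum()
    idm = (p / (1.0 + (i - j) ** 2)).sum()
    if sd_i > 0 and sd_j > 0:
        correlation = (((i - mu_i) * (j - mu_j)) * p).sum() / (sd_i * sd_j)
    else:
        correlation = 0.0             # undefined for a flat region; set to 0 here
    return {"contrast": contrast, "correlation": correlation,
            "inverse_difference_moment": idm}
```

The remaining descriptors follow the same pattern, each being a weighted sum over the entries of the normalized matrix.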
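The color space conversion in this step can be illustrated as follows. The sketch assumes sRGB input scaled to [0, 1] and a D65 reference white; the text does not state which RGB encoding or white point was used, so these choices and the function name are assumptions.

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] to CIE L*a*b* (D65 white point assumed)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma encoding to obtain linear light.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear RGB to CIE XYZ for sRGB primaries.
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = linear @ m.T
    white = m.sum(axis=1)             # XYZ of the reference white (R = G = B = 1)
    t = xyz / white
    delta = 6.0 / 29.0
    f = np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)
```

Clustering on the resulting components with Euclidean distance is a common choice because distances in L*a*b* track perceived color differences more closely than distances in RGB.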

D. Neural Network Classification
Classification in this work was accomplished with a neural network. A single-layer neural network with 20 neurons was trained using the scaled conjugate gradient feedforward backpropagation algorithm. In all, 145 samples were used to train the shallow neural network. The data were divided into three parts: 70% was used for training, 15% for testing and the remaining 15% for validation. The neural network was used to form a generalized relationship between the input and output data [22]. It does this by iteratively searching for the weights that minimize the cost function, here the mean squared error [23].
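The data split and the training loop can be sketched as below. This is a simplified stand-in rather than the procedure used in this study: plain gradient descent replaces the scaled conjugate gradient algorithm, and the tanh activation, learning rate, epoch count, weight initialization and function names are all assumptions made for illustration.

```python
import numpy as np

def split_indices(n, seed=0):
    """Shuffle n sample indices into roughly 70% train, 15% validation, 15% test."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(round(0.70 * n))
    n_val = int(round(0.15 * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def train_shallow_net(x, y, hidden=20, lr=0.05, epochs=500, seed=0):
    """One tanh hidden layer and a linear output, trained on mean squared error."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0, 0.1, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0, 0.1, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)               # forward pass
        out = h @ w2 + b2
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the MSE gradient through both layers.
        g_out = 2.0 * err / err.size
        g_w2, g_b2 = h.T @ g_out, g_out.sum(axis=0)
        g_h = (g_out @ w2.T) * (1.0 - h ** 2)
        g_w1, g_b1 = x.T @ g_h, g_h.sum(axis=0)
        # Plain gradient descent step (not scaled conjugate gradient).
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (w1, b1, w2, b2), losses

def predict(params, x):
    """Return the index of the largest output for each input row."""
    w1, b1, w2, b2 = params
    return (np.tanh(x @ w1 + b1) @ w2 + b2).argmax(axis=1)
```

Here `x` would hold one row of extracted features per sample and `y` a one-hot encoding of the class labels; the validation portion is typically used to decide when to stop training and the test portion to report accuracy.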

IV. RESULTS AND DISCUSSION
For ease of use, we developed a graphical user interface for this work, as shown in Fig. 5. We tested the design on the four diseases considered in this study. Four leaf samples each of Bacterial Blight, Anthracnose, Cercospora leaf spot and Alternaria Alternata infection were obtained after being identified by a trained plant pathologist. The leaves used had varying degrees of infection, ranging from as low as 15% to almost 100%. The test for each leaf was repeated multiple times and the average reported.

A. Bacterial Blight Disease Test
The result of the Bacterial Blight disease test is shown in Fig. 6. Table I shows the average results for the different leaf samples used to test the model. The average detection rate is 98.34%. The detection accuracy is also relatively stable even with wide variation in the affected regions of the infected leaves used for testing.

B. Anthracnose Test
Anthracnose was the second disease tested with this model. The detection of the disease by the model can be seen in Fig. 7 and Table II.

Fig. 9 shows the testing for the Alternaria Alternata plant disease. The result presented in Table IV shows that the detection accuracy of the model is fairly consistent irrespective of the percentage of the plant part that is affected by the disease. The average detection rate for testing carried out on this disease is 98.14%.

V. CONCLUSION
Economic losses resulting from plant diseases are an important problem in agriculture. This problem often leads to scarcity and high cost of food, particularly in developing nations. Eye observation has been the most common method of disease identification, but it has been found to be unreliable and ineffective. To provide a more reliable method, we developed a new model that detects four common plant diseases known to pose significant threats to five of the most cultivated plants in sub-Saharan Africa. These diseases are Bacterial Blight, Anthracnose, Cercospora leaf spot and Alternaria Alternata. Digital cameras were used to take real-life images of the diseased plants of interest after identification by a trained pathologist. The K-means algorithm was used for segmentation and CCM for feature extraction. A single-layer shallow neural network trained on 145 samples was used as the classifier. The detection accuracy for Bacterial Blight is 98.34%. For Cercospora leaf spot, the average detection accuracy is 98.03%. For Anthracnose and Alternaria Alternata, the detection accuracy is 98.48% and 98.14%, respectively. To extend the functionality of this work, IoT [24] can be employed, as in [25].

ACKNOWLEDGMENT
Our gratitude goes to Prof. S.A Oyetunji and Mr. Emmanuel for their immense contribution to this work. We also wish to thank the Federal University of Technology for their immense support.