Geometrical-based approach for robust human image detection

In recent years, object detection and classification have been gaining more attention, and several human object detection algorithms are now used to locate and recognize human objects in images. Image processing and analysis based on human shape is a hot research topic because of its wide applicability in real applications. In this paper, we present a new object classification approach. The new approach uses a simple and robust geometrical model to classify a detected object as human or non-human. In the proposed approach, the object is first detected. The detected object can then be accurately classified under different conditions (i.e., human or non-human) by combining the features extracted from the upper portion of the contour with the proposed geometrical model parameters. A software-based simulation using Matlab was performed on the INRIA dataset, and the obtained results are validated by comparison with five state-of-the-art approaches in the literature and with machine learning approaches such as artificial neural networks (ANN), support vector machines (SVM), and random forest (RF). The experimental results show that the proposed object classification approach is efficient and achieves accuracy comparable to that of the machine learning approaches and the other state-of-the-art approaches.


Introduction
Human object detection is one of the most active research topics in computer vision. It can be simply defined as the process of localizing all human objects in an image by detecting human features [9]. For robust detection of human objects in images, computer vision algorithms must be able to extract common features among different objects (i.e., human and non-human objects). This task has become quite challenging for computer vision researchers because different objects tend to have different features, which are usually used for object recognition [17,33,44].
Generally, a human detection algorithm is regarded as a multi-tasking process in which two or more tasks can run concurrently, such as detecting the human, detecting motion and behavior (i.e., normal and abnormal behavior), recognizing a person from his or her face (i.e., face recognition), and counting the number of humans in images [5,18,35,42]. Human detection algorithms may therefore be used in a number of different applications, such as security in highly sensitive areas (e.g., airports, train stations, and supermarkets) and many surveillance systems (e.g., driving assistance) [5]. Despite these benefits, human detection algorithms face several challenges when detecting humans in images and videos, such as dynamic changes in background and illumination and varying camera positions. Appropriate algorithms are therefore needed to manage these challenges and ensure the quality of service of human detection (i.e., classification accuracy) [33,39].
Human detection has been studied by many researchers in computer vision. Most of these studies require a high-resolution frontal face or an entirely visible body (i.e., object) in the images and videos, as well as a huge database for matching-based classification. Generally, there are four major categories of object detection methods: dynamic threshold, flow analysis, background subtraction, and temporal differencing [34]. Moreover, there are four major categories of object classification: shape-based, texture-based, motion-based, and color-based [27].
Machine learning approaches have been widely used to classify a detected object (i.e., human or non-human) [30]. Machine learning is linked to artificial intelligence (AI), and it is realized through different approaches (e.g., Support Vector Machine, Artificial Neural Network, and Random Forest) that allow a computer to learn by finding statistical regularities (i.e., conditional probabilities) in a set of data. The statistical regularities are used to classify the detected object by means of previous experience, analysis, and self-training [46].
In this paper, we present a geometrical-based approach that can emulate machine learning algorithms for classifying a detected object as human or non-human in images. The main idea of this approach is to use the proposed geometrical model parameters to classify the detected object based on shape features extracted from the upper portion of the object, because the upper portion is always visible and is not easily covered or hidden. Furthermore, the proposed approach can potentially reduce classification challenges such as illumination, varying camera positions, and human object pose. This paper is arranged as follows. Section 2 provides an overview of related work on human object detection algorithms. Section 3 describes the proposed object classification approach. Section 4 presents the experimental results of the proposed approach. Section 5 describes the evaluation of the proposed object classification approach. Section 6 discusses the performance analysis of the proposed approach against several machine learning approaches and five state-of-the-art approaches in the literature. Finally, the conclusion is presented in Section 7.

Related works
In this section, we review the literature on algorithms and methods relevant to motion detection and classification, specifically algorithms for detecting a human object. For accurate classification, the human object must first be accurately detected using suitable methods. Many researchers have proposed new approaches for human detection, but these approaches face a number of practical problems, such as dynamic changes in background and illumination, varying camera positions and human object pose, and clothing and texture parameters [15,16,32].
Zhong et al. [48] improved the performance of object detection and tracking (e.g., human, car, and van) using prototype-based deformable template models. To track an object in an image sequence, the proposed method uses a criterion that combines two terms: the frame-to-frame deviations of the object shape and the fidelity of the modeled shape to the input image. This method is fast and achieves better detection and tracking performance than other methods.
Desa et al. [13] proposed a new method for object detection that combines two approaches: background subtraction and temporal differencing. The implementation results show that the combined method detects objects better than either approach implemented separately. Sugandi et al. [38] used a robust approach for moving object detection and tracking based on the temporal difference method on low-resolution images.
Yuhua et al. [47] presented a new approach for detecting human objects in video. In this approach, a set of parameters is built from the object of interest for the object classification process, and a set of sample data consisting of both negative and positive samples is then fed to the machine learning classifier. The results show high object classification accuracy, but the approach requires a huge database for matching-based classification.
Dalal and Triggs [12] employed a set of features to build a machine learning classifier for a robust histogram-based visual object detection and recognition algorithm. The algorithm divides the image window into small spatial parts (i.e., cells) and computes the histogram of edge orientations over all the pixels of each cell. The conducted experiments showed high object classification accuracy, but the proposed algorithm introduces a more challenging dataset for the machine learning classifier.
Al-Nawashi et al. [5] proposed a novel framework for an intelligent surveillance system based on abnormal human activity detection in academic environments. The conducted experiment showed an excellent surveillance system that can simultaneously perform tracking, semantic scene learning, and abnormality detection in an academic environment with no human intervention. Xia Lu et al. [43] presented a highly accurate approach that recognizes an object automatically, based on an efficient and robust algorithm that identifies temporal patterns among actions and uses the identified patterns to represent activities. Liu Ye et al. [26] proposed a tracker sampling approach for generic human motion tracking using both low- and high-dimensional trackers.
Cui Jinshi et al. [10] proposed a new framework for human motion tracking based on a fusion formulation that integrates low- and high-dimensional tracking approaches into one framework.
Pablo et al. [6] proposed a new algorithm based on contour detection and image segmentation. The contour is detected using spectral clustering, which combines multiple local cues into a globalization framework, and the output of any contour detector is then transformed into a hierarchical region tree. Malik Jitendra et al. [28] improved the performance of object detection using a general algorithm for partitioning grayscale images into disjoint regions of coherent brightness and texture. The conducted experiments showed that the algorithm is efficient and highly accurate.
Jacques et al. [21] proposed a new approach for gender recognition based on the contour of the upper portion. In their approach, the Partial Least Squares (PLS) method is employed to extract features such as gradient, texture, and orientation information from the upper portion (i.e., the head), and a linear Support Vector Machine (SVM) is then used for classification. Modi et al. [31] developed a new algorithm for human motion recognition using a stationary camera. An Artificial Neural Network (ANN) is used in the proposed algorithm for moving object detection and recognition. The results show that human motion can be correctly classified.

Proposed approach
The main contribution of this paper is a new human object classification approach (i.e., human or non-human) that enhances object detection services (i.e., classification accuracy). The new approach uses a simple technique and a robust geometrical model that can emulate some of the machine learning approaches. The flow diagram of the proposed approach is shown in Fig. 1.
To make it clear, the proposed approach is based on a set of parallel and sequential steps, which are partially automated. Steps of the proposed approach:
Step 1: Background subtraction using histogram-based techniques with a global threshold.
Step 2: Object edge detection using the Canny edge detection approach.
Step 3: Extract the boundary edge using the boundary function.
The final step classifies the object as human or non-human.
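As an illustration of Step 1, a histogram-based global threshold can be sketched as follows. The paper does not state which rule selects the threshold, so this sketch assumes Otsu's method as one common choice; it also assumes that the object is brighter than the background. The function names are illustrative and are not part of the original Matlab implementation.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes the between-class variance.

    Assumption: Otsu's method stands in for the unspecified
    histogram-based global threshold of Step 1.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # pixels at or below the candidate threshold
    sum_bg = 0  # sum of their gray levels
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def foreground_mask(image, threshold):
    """Binary mask: 1 where the pixel is brighter than the threshold."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Hypothetical 2 x 3 grayscale image, used only to exercise the sketch.
image = [[10, 10, 200], [10, 200, 200]]
t = otsu_threshold([p for row in image for p in row])
mask = foreground_mask(image, t)
```

The resulting binary mask is the kind of input that the later edge detection and boundary extraction steps operate on.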
Multimed Tools Appl

Object detection and extraction
The first task of the human detection method is to find the region of interest (i.e., the object) in the images [3,4]. The region of interest (ROI) refers to the borders of objects [1,4]. Generally, there are five major categories of object detection methods, i.e.

Edge detection
Edge detection is an image processing technique for locating the edges within images (i.e., object boundaries). An edge can be found from discontinuities in the brightness of the image (i.e., sharp changes in intensity) [24]. There are several edge detection approaches, e.g., subpixel, threshold, Canny, and differential [7]. In the proposed approach, we applied the Canny edge detection approach to satisfy the edge detection requirements of good detection, good localization, and minimal response (i.e., only one response to a given edge), as shown in Fig. 3.

Boundary edge extraction
After edge detection using the Canny edge detection approach, the whole set of object edges (i.e., internal and external edges) is obtained. To make it clear, the output k of the boundary function is a column vector of point indices representing the sequence of points around the boundary, which is a polygon.
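To illustrate what a vector of boundary point indices looks like, the following sketch returns the indices of the points on the convex hull of a point set. This is a simplification: Matlab's boundary function can also return a tighter, non-convex boundary depending on its shrink factor, so the convex hull is used here only as an assumed stand-in. The function name boundary_indices is hypothetical.

```python
def boundary_indices(x, y):
    """Return indices k of the convex hull points, in counterclockwise order.

    Assumption: the convex hull (Andrew's monotone chain) replaces the
    tighter boundary that Matlab's boundary(x, y) may return.
    """
    order = sorted(range(len(x)), key=lambda i: (x[i], y[i]))
    if len(order) < 3:
        return order

    def cross(o, a, b):
        # z-component of the cross product (a - o) x (b - o)
        return (x[a] - x[o]) * (y[b] - y[o]) - (y[a] - y[o]) * (x[b] - x[o])

    lower = []
    for i in order:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], i) <= 0:
            lower.pop()
        lower.append(i)

    upper = []
    for i in reversed(order):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], i) <= 0:
            upper.pop()
        upper.append(i)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical edge points: four corners of a square plus one interior point.
x = [0, 2, 2, 0, 1]
y = [0, 0, 2, 2, 1]
k = boundary_indices(x, y)
boundary_points = [(x[i], y[i]) for i in k]
```

As in the text, the boundary points are recovered by indexing the coordinates with k; the interior point is discarded.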

Extract the object upper portion
In this paper, we have focused our study on the upper portion of the object. The upper portion is very important for human object classification because it contains several human features, e.g., head, neck, and shoulders [21,22,23,25]. The upper portion is extracted as shown in Fig. 5 based on a set of parallel and sequential steps, which are partially automated:
1. Obtain the row and column projections from the binary image of the detected contour.
2. Smooth the projection curves.
3. Scan the smoothed row projection to perform the following:
A. Find the first non-zero pixel to specify the top of the head.
B. Find the minimum value after the top of the head to specify the neck width.
4. Scan the smoothed column projection to perform the following:
A. Find the height of the neck, which corresponds to the first minimum from the top of the head.
B. Find the head width, which is the maximum value found when scanning back from the minimum value, and the corresponding height from the top of the head.
5. Determine the shoulder width as 2.5-3 times the head width.
Finally, we have observed a unique behavior in the flow of the [X] values (i.e., the X coordinates) that does not apply to non-human objects. This unique flow can be obtained from the head, neck, and shoulder region of the contour. A histogram is used to present how the flow of the X values changes for differently shaped objects (i.e., human and non-human). To avoid small angles (i.e., odd values) in the histogram, a mathematical smoothing function has been used, as in [11,12,40].
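The scanning steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the input is a filled binary silhouette, so the row projection gives the silhouette width at each row; smoothing is a simple moving average; and only the row projection is scanned. The function names are illustrative, not the authors' implementation.

```python
def row_projection(mask):
    """Number of object pixels in each row of a binary mask."""
    return [sum(row) for row in mask]


def moving_average(values, window=3):
    """Smooth a projection curve with a centered moving average."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def head_and_neck(widths):
    """Scan row widths from the top of the image.

    Returns (head_top, head_width, neck_row, neck_width).
    Raises StopIteration if the projection is empty of object pixels.
    """
    head_top = next(i for i, w in enumerate(widths) if w > 0)
    i = head_top
    # Climb to the widest row of the head.
    while i + 1 < len(widths) and widths[i + 1] >= widths[i]:
        i += 1
    head_width = widths[i]
    # Descend to the first minimum after the head: the neck.
    while i + 1 < len(widths) and widths[i + 1] <= widths[i]:
        i += 1
    return head_top, head_width, i, widths[i]


# Hypothetical silhouette whose rows have these widths (top to bottom).
row_widths = [0, 2, 4, 4, 2, 2, 6, 8]
mask = [[1] * w + [0] * (8 - w) for w in row_widths]
head_top, head_width, neck_row, neck_width = head_and_neck(row_projection(mask))
# Step 5: shoulder width estimated as 2.5-3 times the head width.
shoulder_range = (2.5 * head_width, 3 * head_width)
```

In practice the projection would be passed through moving_average before scanning; the toy example skips smoothing so that the widths stay easy to follow by hand.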

Human object Upper and lower peaks
The experiments were carried out with our proposed algorithm, and the obtained results are shown in Fig. 6.
As mentioned earlier, an intensive study and analysis of the [X, Y] coordinate values was carried out, and we observed a unique behavior in the flow of the [X] values. Therefore, we designed our object classifier for human objects based on four parameters. Observing and analyzing the histogram of the [X] coordinate values shows that the number of upper peak points equals the number of lower peak points, and both equal two, which is not found for non-human objects. This is considered the first parameter, as shown in Fig. 7.
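The first parameter can be checked by counting the upper and lower peaks of the smoothed X values. The sketch below assumes that a peak is a strict local maximum or minimum of the sequence; the paper does not give its exact peak definition, and the function name find_peaks is illustrative.

```python
def find_peaks(values):
    """Return (upper, lower): indices of strict local maxima and minima."""
    upper, lower = [], []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            upper.append(i)
        elif values[i - 1] > values[i] < values[i + 1]:
            lower.append(i)
    return upper, lower


# Hypothetical smoothed X values, used only to exercise the sketch.
smoothed_x = [1, 3, 2, 5, 1, 4, 6]
upper, lower = find_peaks(smoothed_x)
# Parameter 1: exactly two upper peaks and two lower peaks.
parameter_1 = len(upper) == 2 and len(lower) == 2
```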
The second parameter is built on the calculations and measurements between the coordinates of the lower and upper peak points, as shown in Fig. 8: the distance between the first upper peak point and the second lower peak point (D1) is less than the distance between the second upper peak point and the first lower peak point (D2). The third parameter is likewise based on the measurements between the coordinates of the lower and upper peak points shown in Fig. 8: the distance between the two upper peak points (A1, B2), denoted by D4, is equal to the distance between the two lower peak points (B1, A2), denoted by D3. For more accuracy, a small tolerance (ts1) is allowed on the difference between the two distances, which means that D4 is equal to D3 ± ts1; in mathematical terms, the absolute value of the difference between D3 and D4 does not exceed ts1.
For the last parameter, we found that the distance between the first lower peak point and the second upper peak point (B1, B2), denoted by D2, is more than one third and less than two thirds of the distance between the start point and the end point of the shoulder (C1, C2), denoted by D5, which is consistent with anatomical facts [22,23,25].
To make it clear, the proposed object classification approach is based on a set of parallel and sequential parameters, which are partially automated.

Constraints of the proposed object classification:
Parameter 1: The number of upper peak points = 2 and the number of lower peak points = 2.
Parameter 2: The distance D1 < the distance D2.
Parameter 3: The distance D3 = the distance D4 ± ts1, i.e., |D3 - D4| ≤ ts1, where ts1 is a threshold.
Parameter 4: The distance D2 > 1/3 of the distance D5 and the distance D2 < 2/3 of the distance D5, i.e., (1/3) D5 < D2 < (2/3) D5.
These four parameters hold simultaneously for human objects only, so our geometrical parametric model classifies the detected object as human if all four parameters are true; otherwise, it classifies the object as non-human (see Appendix 1 for the details of the proposed approach).

The features of the machine learning approach (Table 2) are described as follows:
• The shape of the detected object.
• The average intensity values of the pixels.
• The deviation or the variance between the pixels in the input image.
• The local contrast of the image in terms of gray levels.
• The uniformity of the texture.
• The asymmetry of the probability distribution of a real-valued random variable (i.e., positive or negative).
• The peakedness of the probability distribution of a real-valued random variable.
• Entropy: the randomness of the gray-level distribution.
• The closeness of the element distribution in the GLCM to the GLCM diagonal.
• The gray-level linear dependence between the relative pixels.
• Diameter (D = 2r): the diameter of the edge pixels.
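The four parameters can be combined into a single decision rule, sketched below. The sketch assumes Euclidean distances between (x, y) points, takes the upper peaks in the order (A1, B2) and the lower peaks in the order (B1, A2) as in the text, and treats ts1 as an upper bound on |D3 - D4|. The function name is_human and the sample coordinates are hypothetical.

```python
import math


def is_human(upper_peaks, lower_peaks, shoulder, ts1):
    """Apply the four geometric parameters to peak and shoulder points.

    upper_peaks: [A1, B2], lower_peaks: [B1, A2], shoulder: [C1, C2];
    every point is an (x, y) tuple. ts1 is the tolerance on |D3 - D4|.
    """
    # Parameter 1: exactly two upper and two lower peak points.
    if len(upper_peaks) != 2 or len(lower_peaks) != 2:
        return False
    a1, b2 = upper_peaks
    b1, a2 = lower_peaks
    c1, c2 = shoulder

    d1 = math.dist(a1, a2)  # first upper peak to second lower peak
    d2 = math.dist(b2, b1)  # second upper peak to first lower peak
    d3 = math.dist(b1, a2)  # between the two lower peaks
    d4 = math.dist(a1, b2)  # between the two upper peaks
    d5 = math.dist(c1, c2)  # shoulder start to shoulder end

    parameter_2 = d1 < d2
    parameter_3 = abs(d3 - d4) <= ts1
    parameter_4 = d5 / 3 < d2 < 2 * d5 / 3
    return parameter_2 and parameter_3 and parameter_4


# Made-up coordinates chosen so that all four parameters hold.
result = is_human(
    upper_peaks=[(0, 0), (4, 0)],
    lower_peaks=[(-3, 3), (1, 3)],
    shoulder=[(0, 0), (15, 0)],
    ts1=0.5,
)
```

Because the rule is a conjunction, a single failed parameter is enough to label the object as non-human, which matches the all-true requirement stated above.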

Experimental results
In general, this section concentrates on the performance of the proposed approach and aims to determine its classification accuracy for human and non-human objects in different poses. The proposed approach achieved a high accuracy rate in terms of object classification. The experiments were implemented in Matlab on a 1.6 GHz Core i5 (IV) machine with 8 GB of memory and a 750 GB hard disk. The INRIA dataset (i.e., 358 digital images) was used in our experiments. Table 1 shows a few results of our automatic testing system based on the proposed approach with a set of color images.

Evaluation
To evaluate the classification accuracy of the proposed approach, the performance evaluation was conducted in two phases: a comparison between the proposed approach and several machine learning approaches (i.e., Artificial Neural Networks (ANN), Support Vector Machine (SVM), and Random Forest), and a comparison between the proposed approach and five state-of-the-art approaches in the literature. Machine learning was chosen as the domain for evaluation because of the success of data-driven artificial intelligence and the high classification accuracy it achieves (i.e., human or non-human) [30].
Below, we first recall the definition of the dataset (i.e., data acquisition) and the main features of the machine learning approach. Then, we discuss the detailed implementation of the proposed approach and the machine learning approaches. For completeness, comparisons of the proposed geometrical-based approach with the machine learning approaches and with five state-of-the-art approaches in the literature are also explained.


Data acquisition
In this paper, a public database for benchmarking human detection from digital images is used.
The INRIA dataset contains images of humans cropped from a varied set of personal photos [20]. The people are usually standing but appear in any orientation and against a wide variety of backgrounds, including crowds. INRIA contains human images cropped to 64 × 128 pixels and non-human images of various sizes (from 214 × 320 to 648 × 486 pixels) [41]. In this paper, human and non-human images are collected from the INRIA dataset and then processed with image enhancement techniques to obtain a homogeneous dataset cropped to 64 × 128 pixels.

Machine learning approach
In this section, we present the machine learning approaches used to evaluate the new human object classification approach (i.e., human or non-human), which enhances object detection services (i.e., classification accuracy). Machine learning-based image classification commonly needs an image dataset (i.e., human and non-human images) that is split into two categories. Then, the 11 optimal features shown in Table 2 are extracted from each category for training and testing the classifier using a machine learning toolkit called WEKA [2,8,19], as shown in Fig. 9.
The digital images used in this paper are drawn from the INRIA dataset. A software-based simulation using Matlab was performed to extract the features of the human (HU) and non-human (NH) images. Table 3 gives an example of the extracted feature values based on our software simulation.
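Several of the features, such as the average intensity, the deviation, the asymmetry and peakedness of the distribution, the entropy, and the uniformity, are statistics of the gray-level distribution. The sketch below shows one way to compute them. It assumes population moments and a base-2 entropy; the exact definitions used in the Matlab simulation are not stated in the paper, and the function and key names are illustrative.

```python
import math


def gray_level_features(pixels, levels=256):
    """First-order statistics of a flat list of integer gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)

    # Third and fourth standardized moments (0.0 for a constant image).
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4) if std else 0.0

    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    probs = [count / n for count in hist if count]

    entropy = -sum(q * math.log2(q) for q in probs)  # randomness of gray levels
    uniformity = sum(q * q for q in probs)           # texture uniformity

    return {
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
        "uniformity": uniformity,
    }


# Hypothetical four-pixel image: half black, half white.
features = gray_level_features([0, 0, 255, 255])
```

The shape, GLCM-based, and diameter features require the contour or a co-occurrence matrix and are outside the scope of this sketch.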
The Excel sheet, covering 358 images in total, was converted to a comma-delimited file (i.e., CSV) so that it could be read by the machine learning toolkit (i.e., WEKA). In this paper, 11 features (i.e., the optimal features) were chosen for the classification process. The distributions of instances over the features are shown in Fig. 10. The performance analysis (i.e., classification accuracy and computational cost) of the proposed object classification approach was conducted in two phases: a comparison with several machine learning approaches and a comparison with five state-of-the-art approaches in the literature.

Comparisons with machine learning approaches
In this section, we provide experimental results for analyzing the performance (i.e., classification accuracy) of the proposed object classification approach and several machine learning approaches (i.e., Artificial Neural Networks (ANN), Support Vector Machine (SVM), and Random Forest). The performance analysis for SVM, ANN, Random Forest, and the proposed object classification approach was conducted with two tests: the testing confusion matrix and the accuracy matrix for all classes.
The confusion matrices for SVM, ANN, and Random Forest on the test set are reported in Tables 4, 5 and 6, respectively. The accuracy of a confusion matrix is given by:
$$\text{Accuracy} = \frac{\sum \text{diagonal samples of the confusion matrix}}{\text{total samples}}$$
In the accuracy matrices for SVM, ANN, and Random Forest for all classes, the TP rate, FP rate, Precision, Recall, F-Measure, MCC, ROC Area, and PRC Area were calculated and are listed in Tables 7, 8, and 9, respectively; the corresponding class distributions are shown in Figs. 11, 12 and 13, respectively (see Appendix 2 for more details).
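The accuracy formula can be evaluated directly from a confusion matrix, as in the following minimal sketch. The counts in the example are made up for illustration and are not taken from Tables 4 to 6.

```python
def accuracy(confusion):
    """Sum of the diagonal of a square confusion matrix over all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Illustrative counts only: rows are true classes, columns are predictions.
example = [
    [50, 10],  # human:     50 correct, 10 misclassified
    [5, 35],   # non-human:  5 misclassified, 35 correct
]
example_accuracy = accuracy(example)
```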
The machine learning approaches (i.e., SVM, ANN, and Random Forest) were introduced and implemented to evaluate the proposed human object classification approach. Since classification accuracy is the goal of this research, it should be kept as high as possible. Therefore, the classification accuracy of our proposed approach is compared with that of the machine learning approaches, as shown in Table 10; the corresponding flow chart is shown in Fig. 14.
Entries of Table 11 for the state-of-the-art approaches:
• Requires a pre-defined template or classifier for the human area that must be obtained through training.
• Requires significant processing time to detect the human object.
• Performance degradation can occur if the intensity of the object is similar to the background.
[45] Human detection using learned part alphabet and pose dictionary.
[29] Occlusion handling via random subspace classifiers for human detection.
[11] Histograms of oriented gradients for human detection.
[14] Training deformable object models for human detection based on alignment and clustering.
Proposed approach: Geometrical-based approach for robust human image detection
• Applicable to various environmental conditions and detects human objects at various scales.
• Simple technique and a robust geometrical model that can emulate some of the machine learning approaches.
• Does not require a training procedure to obtain the classifier for human detection.
• Applicable to detecting humans in still images.

Comparisons with state of the art approaches
In this section, we compare the performance of the proposed object classification approach with several state-of-the-art approaches in the literature. Five state-of-the-art approaches were chosen for the comparison because of their high citation rates. The comparisons were implemented and evaluated in the same environment and under the same conditions. From the experimental results, we observed the advantages and disadvantages of each approach, as shown in Table 11.
Several measures are commonly used to assess the performance (i.e., accuracy) of human detection approaches, such as the positive predictive value (PPV) and the computational cost (time) [11]. The PPV is calculated as:
$$\text{PPV} = \frac{\#TP}{\#TP + \#FP}$$
where #TP is the number of true positives and #FP is the number of false positives. The PPV and the computational cost of each approach are calculated and listed in Table 12.
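As a small worked example, the PPV follows directly from the true-positive and false-positive counts. The counts below are hypothetical and are not taken from Table 12.

```python
def positive_predictive_value(tp, fp):
    """PPV = #TP / (#TP + #FP)."""
    if tp + fp == 0:
        raise ValueError("PPV is undefined when there are no positive detections")
    return tp / (tp + fp)


# Hypothetical counts: 90 correct detections and 10 false alarms.
ppv = positive_predictive_value(tp=90, fp=10)
```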
In general, a higher PPV means a higher accuracy of human detection. The obtained results demonstrate that the proposed object detection and classification approach is efficient and achieves accuracy and computational cost comparable to those of other state-of-the-art approaches.

Conclusion
Human object detection and classification in images is challenging work. In this paper, we propose a new object classification approach to improve object classification accuracy. The new approach uses a simple and robust geometrical model to classify a detected object as human or non-human. In the proposed approach, the object is detected, and the detected object can then be accurately classified under different conditions (i.e., human or non-human) by combining the features extracted from the upper portion of the contour with the geometrical model parameters. The performance evaluation (i.e., classification accuracy) of the proposed approach was conducted in two phases: a comparison between the proposed approach and several machine learning approaches (i.e., Artificial Neural Networks (ANN), Support Vector Machine (SVM), and Random Forest), and a comparison between the proposed approach and five state-of-the-art approaches in the literature. The experimental results show the following classification accuracies: 80.7263% for SVM, 89.6648% for ANN, 89.9441% for Random Forest, and 92.7374% for the proposed approach. Furthermore, the obtained results demonstrate that the proposed object detection and classification approach is efficient and achieves accuracy and computational cost comparable to those of other state-of-the-art approaches. This indicates that the proposed approach is efficient, achieves higher classification accuracy than the tested machine learning approaches, and remains comparable to other state-of-the-art approaches.

Step 4: Obtain the [X, Y] coordinates of the boundary contour.
Step 5: Extract the upper portion of the contour (i.e., the selected object).
Step 6: Obtain the [X, Y] coordinates of the upper portion of the contour.
Step 7: Represent the obtained X coordinate values in a histogram.
Step 8: Smooth the histogram using mathematical smoothing functions.
Step 9: Find the upper and lower peaks.
Step 10: Evaluate the geometric model parameters.
Step 11: Classify the object (i.e., human or non-human).

Fig. 1 Fig. 2
Fig. 1 Flow diagram of the proposed approach architecture
The edge detection step yields both the internal and the external edges of the object. To extract the boundary edges only (i.e., external edges), we applied the boundary function k = boundary(x,y) in the proposed approach. The boundary function returns a vector of point indices representing a single conforming 2-D boundary around the points (x,y); the boundary points are then obtained as (x(k), y(k)), as shown in Fig. 4.

Fig. 3
Fig. 3 Edge detection using the Canny edge detection approach

Fig. 7 Fig. 8
Fig. 7 Upper and lower peaks for human object

Fig. 9
Fig. 9 Flow diagram of the proposed approach architecture: features extraction

Fig. 10
Fig. 10 Extracted feature values from human and non-human images

Fig. 13
Fig. 12 Classes distribution for ANN

Fig. 14
Fig. 14 Flow chart of classification accuracy

Table 1
Experimental results using the proposed approach

Table 2
Features of the machine learning approach

Table 3
Extracted feature values from human and non-human images

Table 4
SVM confusion matrix

Table 7
SVM accuracy matrix for all classes (columns: TP Rate, FP Rate, Precision, Recall, F-Measure, MCC, ROC Area, PRC Area, Class)

Table 9
Random Forest accuracy matrix for all classes (columns: TP Rate, FP Rate, Precision, Recall, F-Measure, MCC, ROC Area, PRC Area, Class)

Table 11
Comparison between the proposed approach and some state-of-the-art approaches in the literature

Table 12
Positive predictive value and computational cost for each approach