<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JSIP</journal-id><journal-title-group><journal-title>Journal of Signal and Information Processing</journal-title></journal-title-group><issn pub-type="epub">2159-4465</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jsip.2017.82007</article-id><article-id pub-id-type="publisher-id">JSIP-76264</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  Real-Time Face Detection and Recognition in Complex Background
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zhang</surname><given-names>Xin</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Gonnot</surname><given-names>Thomas</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Saniie</surname><given-names>Jafar</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, Illinois, USA</addr-line></aff><author-notes><corresp id="cor1">* E-mail: <email>tgonnot@hawk.iit.edu</email> (TG)</corresp></author-notes><pub-date pub-type="epub"><day>05</day><month>05</month><year>2017</year></pub-date><volume>08</volume><issue>02</issue><fpage>99</fpage><lpage>112</lpage><history><date date-type="received"><day>25</day><month>03</month><year>2017</year></date><date date-type="rev-recd"><day>16</day><month>05</month><year>2017</year></date><date date-type="accepted"><day>19</day><month>05</month><year>2017</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  This paper provides efficient and robust algorithms for real-time face detection and recognition in complex backgrounds. The algorithms are implemented using a series of signal processing methods including AdaBoost, cascade classifiers, Local Binary Pattern (LBP) and Haar-like features, facial image pre-processing and Principal Component Analysis (PCA). The AdaBoost algorithm is implemented in a cascade classifier to train the face and eye detectors with robust detection accuracy. The LBP descriptor is utilized to extract facial features for fast face detection. The eye detection algorithm reduces the false face detection rate. The detected facial image is then processed to correct the orientation and increase the contrast, thereby maintaining high facial recognition accuracy. Finally, the PCA algorithm is used to recognize faces efficiently. Large databases of face and non-face images are used to train and validate the face detection and facial recognition algorithms. The algorithms achieve an overall accuracy of 98.8% for face detection and a correct recognition rate of 99.2% for facial recognition.
 
</p></abstract><kwd-group><kwd>Face Detection</kwd><kwd> Facial Recognition</kwd><kwd> AdaBoost Algorithm</kwd><kwd> Cascade Classifier</kwd><kwd> Local Binary Pattern</kwd><kwd> Haar-Like Features</kwd><kwd> Principal Component Analysis</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Real-time face detection and facial recognition play an important role in applications such as robot intelligence, smart cameras, security monitoring and criminal identification. Conventional algorithms for face detection and facial recognition are designed for still-face images or color images. In color images, the colors increase data complexity by mapping pixels onto a high-dimensional space, which greatly reduces the processing speed and accuracy of face detection and recognition [<xref ref-type="bibr" rid="scirp.76264-ref1">1</xref>].</p><p>There are several approaches to the facial recognition problem. Since faces are usually round or oval and of roughly uniform color, one of the simplest approaches is to detect faces by color segmentation. However, color segmentation cannot adapt to a changing environment, such as varying lighting conditions. More adaptive and robust methods may not be able to operate in real time since they require more computational power. Moreover, adaptive algorithms usually employ statistical concepts to various degrees, such as template matching [<xref ref-type="bibr" rid="scirp.76264-ref2">2</xref>], Support Vector Machine (SVM) [<xref ref-type="bibr" rid="scirp.76264-ref3">3</xref>], color segmentation [<xref ref-type="bibr" rid="scirp.76264-ref4">4</xref>] or neural networks [<xref ref-type="bibr" rid="scirp.76264-ref5">5</xref>]. 
More reliable descriptors such as Histogram of Oriented Gradient (HOG) [<xref ref-type="bibr" rid="scirp.76264-ref6">6</xref>], Scale-Invariant Feature Transform (SIFT) [<xref ref-type="bibr" rid="scirp.76264-ref7">7</xref>], Local Binary Pattern (LBP) [<xref ref-type="bibr" rid="scirp.76264-ref8">8</xref>], or Haar-like features [<xref ref-type="bibr" rid="scirp.76264-ref9">9</xref>] are used to determine facial features for face detection. Facial recognition is typically based on Principal Component Analysis (PCA) [<xref ref-type="bibr" rid="scirp.76264-ref10">10</xref>], Linear Discriminant Analysis (LDA) [<xref ref-type="bibr" rid="scirp.76264-ref11">11</xref>], holistic matching methods [<xref ref-type="bibr" rid="scirp.76264-ref12">12</xref>] and feature-based methods [<xref ref-type="bibr" rid="scirp.76264-ref7">7</xref>]. For practical applications, faces need to be detected and recognized in real time and often in complex backgrounds.</p><p>The algorithms proposed in this paper process gray-scale images to detect and recognize faces in real time with high accuracy. The combination of the AdaBoost algorithm and the cascade classifier [<xref ref-type="bibr" rid="scirp.76264-ref13">13</xref>] improves the detection accuracy. The face detection algorithm uses a cascade classifier based on the <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x2.png" xlink:type="simple"/></inline-formula> descriptor [<xref ref-type="bibr" rid="scirp.76264-ref8">8</xref>], which provides a higher processing speed. The eye detection also uses a cascade classifier, but one based on the Haar-like descriptor, to ensure a low false-positive face detection rate. The result of facial recognition training can be improved significantly through efficient pre-processing of the training data. After training, the PCA algorithm is used for facial recognition. 
The flowchart for real-time face detection and recognition is shown in <xref ref-type="fig" rid="fig1">Figure 1</xref>.</p><p>The implemented algorithm can be divided into three stages: 1) face and eye detection; 2) facial image normalization and enhancement; and 3) facial recognition and face sample collection. In stage 1, two different cascade classifiers are used to detect the faces and the eyes respectively. Both classifiers are trained with the AdaBoost algorithm. In stage 2, the faces detected in the previous stage are normalized to a fixed size and orientation. In this stage, the backgrounds are discarded, and the contrast and lighting are enhanced. In stage 3, the algorithm tracks the differences between faces in the detection windows. When the difference is significant, the algorithm recognizes the face using PCA and collects it to train the recognition algorithm further. With the help of the pre-processing and eye detection modules, the method proposed in this paper can operate more accurately regardless of the background.</p></sec><sec id="s2"><title>2. Descriptors for Real-Time Detection</title><sec id="s2_1"><title>2.1. LBP Descriptor</title><p>The <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x3.png" xlink:type="simple"/></inline-formula> descriptor is used to extract facial features for face detection. LBP stands for Local Binary Pattern: every pattern of the facial image is encoded and counted to construct a spatially enhanced histogram representing the local primitives. 
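To make the "encoded and counted" step concrete, the following is a minimal sketch of a basic LBP encoder and its histogram. It is an illustration under simplifying assumptions rather than the implementation used in this paper: it uses the 8 immediate neighbours of each pixel (radius 1, no interpolation), keeps all 256 codes in separate bins, and the names <monospace>lbp_code</monospace> and <monospace>lbp_histogram</monospace> are hypothetical.
<preformat preformat-type="code"><![CDATA[
```python
from typing import List

# Offsets (row, col) of the 8 neighbours, clockwise from the top-left corner.
# Assumption: radius 1 and no interpolation, chosen only for readability.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image: List[List[int]], row: int, col: int) -> int:
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre contributes a 1 bit.
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image: List[List[int]]) -> List[int]:
    """Count how often each of the 256 codes occurs over the interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```
]]></preformat>
In a spatially enhanced histogram, such a histogram would be computed per image region and the regional histograms concatenated. 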
The subscript of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x4.png" xlink:type="simple"/></inline-formula> indicates that the LBP descriptor uses 8 sampling points within a radius of 2 pixels. The <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x6.png" xlink:type="simple"/></inline-formula> superscript indicates that the descriptor uses uniform patterns, that is, patterns with at most two transitions from 0 to 1 or from 1 to 0 in the 8-bit binary number. The descriptor uses 58 bins for the 58 uniform patterns and 1 bin for the 198 non-uniform patterns. Uniform patterns account for almost 90% of the local primitives [<xref ref-type="bibr" rid="scirp.76264-ref14">14</xref>]. Because the histogram is shorter (59 bins instead of 256), the calculation is greatly simplified by using the <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x7.png" xlink:type="simple"/></inline-formula> descriptor. Each sample histogram is compared with the template histogram to find the threshold for each region. The encoding process of the <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x8.png" xlink:type="simple"/></inline-formula> descriptor is shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>.</p><fig id="fig1"  position="float"><label><xref ref-type="fig" rid="fig1">Figure 1</xref></label><caption><title> Flowchart for real-time face detection and recognition</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x5.png"/></fig></sec><sec id="s2_2"><title>2.2. Haar-Like Descriptor</title><p>The Haar-like descriptor is utilized to extract eye features. Each Haar-like feature is composed of multiple neighboring rectangular regions, which are shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>. 
The values of the pixels in the black rectangular regions are subtracted from the values of the pixels in the white rectangular regions, and the total represents the value of the Haar-like feature. As a Haar-like feature slides across the detection window, the area with the minimum value is the best match for this feature.</p><fig id="fig2"  position="float"><label><xref ref-type="fig" rid="fig2">Figure 2</xref></label><caption><title> <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x10.png" xlink:type="simple"/></inline-formula> descriptor encoding</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x9.png"/></fig><fig id="fig3"  position="float"><label><xref ref-type="fig" rid="fig3">Figure 3</xref></label><caption><title> Haar-like features [<xref ref-type="bibr" rid="scirp.76264-ref9">9</xref>] </title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x11.png"/></fig></sec></sec><sec id="s3"><title>3. Face Detection Algorithms</title><sec id="s3_1"><title>3.1. Face Detection Classifier</title><p>The AdaBoost algorithm [<xref ref-type="bibr" rid="scirp.76264-ref15">15</xref>] is used to select the best features for detecting faces. 
The best features are chosen as weak classifiers and then combined, as a weighted sum, into a strong classifier, as shown in the following equation:</p><disp-formula id="scirp.76264-formula16"><label>(1)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x12.png"  xlink:type="simple"/></disp-formula><p>In Equation (1), <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x13.png" xlink:type="simple"/></inline-formula> are the n weak classifiers used to construct the strong classifier <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x14.png" xlink:type="simple"/></inline-formula>. The parameters <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x15.png" xlink:type="simple"/></inline-formula> are the weights associated with the n weak classifiers. The strong classifier can be used to detect faces with the following equation:</p><disp-formula id="scirp.76264-formula17"><label>(2)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x16.png"  xlink:type="simple"/></disp-formula><p>In Equation (2), <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x17.png" xlink:type="simple"/></inline-formula> is the threshold used by the strong classifier to detect a face. “1” indicates that a face is present while “0” indicates that no face is detected. In this paper, the trained strong classifier <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x18.png" xlink:type="simple"/></inline-formula> detects faces with a high detection accuracy of 98.8%.</p><p>The cascade classifiers are trained using the AdaBoost algorithm. A cascade classifier consists of a series of tests on the input features, as shown in <xref ref-type="fig" rid="fig4">Figure 4</xref>. 
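The weighted vote of Equation (1), the thresholding of Equation (2) and the series of tests in a cascade can be sketched as follows. This is a minimal illustration, not the trained detector: the weak classifiers are assumed to return 0 or 1, the comparison with the threshold is assumed to be "greater than or equal", and all names (<monospace>strong_score</monospace>, <monospace>strong_classify</monospace>, <monospace>cascade_classify</monospace>) are hypothetical.
<preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, List, Sequence, Tuple

# A weak classifier maps a feature vector to 0 (no face) or 1 (face).
WeakClassifier = Callable[[Sequence[float]], int]
# One cascade stage: its weak classifiers, their weights, and its threshold.
Stage = Tuple[List[WeakClassifier], List[float], float]


def strong_score(x: Sequence[float], weak: List[WeakClassifier],
                 alphas: List[float]) -> float:
    """Weighted sum of the weak classifier outputs."""
    return sum(a * h(x) for h, a in zip(weak, alphas))


def strong_classify(x: Sequence[float], weak: List[WeakClassifier],
                    alphas: List[float], threshold: float) -> int:
    """Return 1 if the weighted vote reaches the threshold, else 0."""
    return 1 if strong_score(x, weak, alphas) >= threshold else 0


def cascade_classify(x: Sequence[float], stages: List[Stage]) -> int:
    """Run the stages in order and reject at the first stage that fails."""
    for weak, alphas, threshold in stages:
        if strong_classify(x, weak, alphas, threshold) == 0:
            return 0  # rejected early, later stages are never evaluated
    return 1
```
]]></preformat>
Because most windows fail one of the first stages, the later and more expensive stages are evaluated only on a small fraction of the windows. 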
The selected features are separated into several stages, and each stage is trained to be a strong classifier built from the best weak classifiers. The tested implementation uses 120 LBP and 32 Haar features as weak classifiers. Each stage is responsible for deciding whether the detection window might contain a face or not. The window is discarded immediately once it fails at any stage. As a result of this cascading, areas without faces are discarded within the early stages and are therefore processed faster. The number of stages is defined during learning and is chosen to achieve a predetermined detection accuracy.</p><p>The Chi-Squared difference [<xref ref-type="bibr" rid="scirp.76264-ref16">16</xref>] is used by the face detection classifier. The Chi-Squared difference is calculated between the LBP encoded histogram of a face detection region and the LBP encoded histogram of a predefined template image, which is obtained by averaging 2400 facial images. The images used contain faces of various skin colors, sexes and ages, and are all picked from the MIT CBCL face database. The difference is then compared with a predefined threshold for classification. 
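As a minimal sketch of this comparison, the snippet below computes a Chi-Squared difference between two histograms and compares it with a threshold. It assumes the commonly used form, the sum over bins of (x - y)^2 / (x + y), which may differ in detail from the equation used in this paper, and it assumes that a smaller difference indicates a face; the function names are hypothetical.
<preformat preformat-type="code"><![CDATA[
```python
from typing import Sequence


def chi_squared_difference(sample: Sequence[float],
                           template: Sequence[float]) -> float:
    """Sum of (x - y)^2 / (x + y) over all bins, skipping empty bin pairs."""
    total = 0.0
    for x, y in zip(sample, template):
        if x + y > 0:  # avoid dividing by zero when both bins are empty
            total += (x - y) ** 2 / (x + y)
    return total


def is_face(sample: Sequence[float], template: Sequence[float],
            threshold: float) -> bool:
    """Assumed decision rule: a small difference to the template means a face."""
    return chi_squared_difference(sample, template) < threshold
```
]]></preformat>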
The Chi-Squared difference equation is:</p><disp-formula id="scirp.76264-formula18"><label>(3)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x19.png"  xlink:type="simple"/></disp-formula><fig id="fig4"  position="float"><label><xref ref-type="fig" rid="fig4">Figure 4</xref></label><caption><title> Flowchart of a cascading classifier</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x20.png"/></fig><p>In Equation (3), <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x21.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x22.png" xlink:type="simple"/></inline-formula> are the numbers of features in the i-th bin of the LBP encoded histogram of the detection region and of the template image respectively. If the Chi-Squared difference is smaller than the threshold, the detection window contains a face. Results of the face detection in various conditions are shown in <xref ref-type="fig" rid="fig5">Figure 5</xref>.</p></sec><sec id="s3_2"><title>3.2. Eye Detection</title><p>The Haar-like descriptor is used to detect both eyes of the face in order to enhance the face detection accuracy. The origin of the coordinate system for the facial image is chosen to be the top-left point. Two rectangular eye-search regions with the same size are extracted from each facial image at four predefined positions. For the left eye, the region extends on the x axis from 10% to 38% of the image width, and on the y axis from 15% to 40% of the image height. Since the right-eye search region is symmetric to the left-eye search region, the same proportions are used from the other side of the image. 
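The proportions above translate directly into pixel rectangles. The following sketch computes both search regions for a facial image of a given size; the function name <monospace>eye_search_regions</monospace> and the rectangle layout (x0, y0, x1, y1) are assumptions made for illustration, and bounds are rounded down to whole pixels.
<preformat preformat-type="code"><![CDATA[
```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), origin at the top-left corner

# Bounds of the left-eye search region, in percent of the face image size.
LEFT_X0, LEFT_X1 = 10, 38
TOP_Y0, TOP_Y1 = 15, 40


def eye_search_regions(width: int, height: int) -> Tuple[Rect, Rect]:
    """Return the (left, right) eye-search rectangles for a face image."""
    x0 = width * LEFT_X0 // 100
    x1 = width * LEFT_X1 // 100
    y0 = height * TOP_Y0 // 100
    y1 = height * TOP_Y1 // 100
    left = (x0, y0, x1, y1)
    # Mirror the left region about the vertical centre line of the image.
    right = (width - x1, y0, width - x0, y1)
    return left, right
```
]]></preformat>
For a 100 by 100 pixel face, this gives a left region spanning x from 10 to 38 and a right region spanning x from 62 to 90, both with y from 15 to 40. 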
<xref ref-type="fig" rid="fig6">Figure 6</xref> shows the result of the eye detection algorithm.</p><fig-group id="fig5"><label><xref ref-type="fig" rid="fig5">Figure 5</xref></label><caption><title> Face Detection under Various Conditions. Upper left, occlusion at bottom; upper middle, occlusion on top; upper right, face in shadow; lower left, object near face; lower middle, direct light on face; lower right, poor light condition.</title></caption><fig id ="fig5_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x25.png"/></fig><fig id ="fig5_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x24.png"/></fig><fig id ="fig5_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x23.png"/></fig><fig id ="fig5_4"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x28.png"/></fig><fig id ="fig5_5"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x27.png"/></fig><fig id ="fig5_6"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x26.png"/></fig></fig-group><fig id="fig6"  position="float"><label><xref ref-type="fig" rid="fig6">Figure 6</xref></label><caption><title> Eye detection in eye-search regions</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x29.png"/></fig></sec></sec><sec id="s4"><title>4. Facial Recognition</title><sec id="s4_1"><title>4.1. 
Affine Transformation</title><p>An affine transformation [<xref ref-type="bibr" rid="scirp.76264-ref17">17</xref>] is used to rectify the orientation and scale of the detected facial images to improve the accuracy of recognition. An affine matrix is adopted to scale the detected facial image to the desired size, and to rotate it so that the two eyes lie on a horizontal line.</p><disp-formula id="scirp.76264-formula19"><label>(4)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x30.png"  xlink:type="simple"/></disp-formula><p>In Equation (4), <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x31.png" xlink:type="simple"/></inline-formula> is the affine matrix and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x32.png" xlink:type="simple"/></inline-formula> are the scaling ratios in the x and y directions. <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x33.png" xlink:type="simple"/></inline-formula> are the translation factors in the x and y directions. θ is the rotation angle of the image. The position of each pixel of the original facial image is multiplied by the affine matrix to constitute the corrected image, with a resolution of 70 &#215; 70 pixels.</p><p><xref ref-type="fig" rid="fig7">Figure 7</xref> shows a facial image after correction. The two eyes are now horizontal and the image is resized to a standard dimension. The image is cropped to show only the facial features and discard the background.</p></sec><sec id="s4_2"><title>4.2. 
Histogram Equalization</title><p>The facial images of the same person can change drastically under different lighting conditions. A histogram equalization algorithm [<xref ref-type="bibr" rid="scirp.76264-ref18">18</xref>] is used to enhance the contrast of the detected facial images. The algorithm replaces the pixel values using a function designed to spread out the distribution of the histogram. The function is given by Equation (5).</p><disp-formula id="scirp.76264-formula20"><label>(5)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x34.png"  xlink:type="simple"/></disp-formula><p>In this equation, CDF(v) is the cumulative distribution function of the pixels with value v, used to calculate the equalized value H(v). M and N are the numbers of rows and columns of the facial image respectively. L is 256 and represents the gray-scale range.</p><p><xref ref-type="fig" rid="fig8">Figure 8</xref> shows the enhancement of the facial image using the histogram equalization algorithm. In strong lighting conditions, however, one side of the face can be more exposed to the light than the other side, resulting in a significant lighting difference between the two sides. 
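The whole-face equalization step can be sketched in a few lines. The snippet assumes the widely used form H(v) = round((CDF(v) - CDF_min) / (M N - CDF_min) (L - 1)), where CDF_min is the smallest non-zero value of the cumulative count; the exact form of Equation (5) may differ, for instance in whether CDF_min is subtracted. The function name is hypothetical.
<preformat preformat-type="code"><![CDATA[
```python
from typing import List


def equalize_histogram(image: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Spread the gray levels of an image using its cumulative histogram."""
    rows, cols = len(image), len(image[0])
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1

    # Cumulative count of pixels with a value less than or equal to v.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = rows * cols
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image, nothing to spread

    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((cdf[v] - cdf_min) * scale)) for v in range(levels)]
    return [[lut[v] for v in row] for row in image]
```
]]></preformat>
The darkest gray level present in the image is mapped to 0 and the brightest to L - 1, so a low-contrast face occupies the full gray-scale range after equalization. 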
<xref ref-type="fig" rid="fig9">Figure 9</xref> shows an alternative processing, applying the histogram equalization separately on both sides of the face.</p><fig-group id="fig7"><label><xref ref-type="fig" rid="fig7">Figure 7</xref></label><caption><title> Affine transformation of a facial image.</title></caption><fig id ="fig7_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x35.png"/></fig><fig id ="fig7_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x36.png"/></fig><fig id ="fig7_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x37.png"/></fig></fig-group><fig-group id="fig8"><label><xref ref-type="fig" rid="fig8">Figure 8</xref></label><caption><title> Histogram equalization in weak lighting condition.</title></caption><fig id ="fig8_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x38.png"/></fig><fig id ="fig8_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x39.png"/></fig><fig id ="fig8_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x40.png"/></fig></fig-group><fig-group id="fig9"><label><xref ref-type="fig" rid="fig9">Figure 9</xref></label><caption><title> Separated histogram equalization in strong lighting condition.</title></caption><fig id ="fig9_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x41.png"/></fig><fig id ="fig9_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x42.png"/></fig><fig id ="fig9_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x43.png"/></fig></fig-group><fig-group id="fig10"><label><xref ref-type="fig" rid="fig10">Figure 10</xref></label><caption><title> Improved histogram equalization in strong lighting condition.</title></caption><fig id ="fig10_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x44.png"/></fig><fig id ="fig10_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x45.png"/></fig><fig id ="fig10_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x46.png"/></fig></fig-group>
<p>In <xref ref-type="fig" rid="fig9">Figure 9</xref>, there is still a large lighting difference between the two sides of the face, which might affect the recognition accuracy. In <xref ref-type="fig" rid="fig10">Figure 10</xref>, we propose an improved histogram equalization that decreases this lighting difference by gradually mixing the separated histogram equalization with the whole-face histogram equalization from the left or right edge towards the center. The far left and far right regions therefore use the separated histogram equalization, while the central region smoothly mixes the left or right equalized values with the whole-face equalized values.</p></sec><sec id="s4_3"><title>4.3. Gaussian Filter</title><p>A Gaussian filter [<xref ref-type="bibr" rid="scirp.76264-ref19">19</xref>] is used to remove noise from the pre-processed facial images in order to keep the facial recognition accuracy high. A convolution matrix produced by a Gaussian function is used to smooth the facial images. 
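As a minimal sketch of this smoothing step, the snippet below samples a 2-D Gaussian on a small grid, normalizes it so that its coefficients sum to 1, and applies it without padding. The default standard deviation of 1.0 is a placeholder and not the value used in this paper, and the function names are hypothetical.
<preformat preformat-type="code"><![CDATA[
```python
import math
from typing import List

Matrix = List[List[float]]


def gaussian_kernel(size: int = 3, sigma: float = 1.0) -> Matrix:
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) on a size x size grid and normalize."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def convolve_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Apply the kernel without padding, so the output shrinks by size - 1.

    The kernel is symmetric, so sliding it without flipping gives the same
    result as a true convolution.
    """
    k = len(kernel)
    out = []
    for r in range(len(image) - k + 1):
        out_row = []
        for c in range(len(image[0]) - k + 1):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out
```
]]></preformat>
With a 3 by 3 kernel and no padding, an image loses one pixel on every border, so each dimension shrinks by 2. 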
The 2-D Gaussian function is given in Equation (6).</p><disp-formula id="scirp.76264-formula21"><label>(6)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x47.png"  xlink:type="simple"/></disp-formula><p>The 3 &#215; 3 normalized convolution matrix with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x48.png" xlink:type="simple"/></inline-formula>, shown in Equation (7), is adopted for smoothing while preserving edges.</p><disp-formula id="scirp.76264-formula22"><label>(7)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x49.png"  xlink:type="simple"/></disp-formula><p>The convolution process is defined by Equation (8). For each pixel of the output image I’, the pixels of the original image I around this position are multiplied by the coefficients of the matrix H, and then summed up. Because the borders are not padded, the resulting image is smaller, with a size of 68 &#215; 68.</p><disp-formula id="scirp.76264-formula23"><label>(8)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x50.png"  xlink:type="simple"/></disp-formula><p><xref ref-type="fig" rid="fig11">Figure 11</xref> shows that the Gaussian filter removes the high-frequency noise in the pre-processed facial image.</p></sec><sec id="s4_4"><title>4.4. Principal Component Analysis</title><p>The desired facial images are first collected as samples for training the new coordinate system. Every pixel of the image is represented by a variable in one dimension for describing facial features. The features of each desired facial image can therefore be represented by a column vector with 70 &#215; 70 = 4900 dimensions. The PCA algorithm is used to recognize high-dimensional facial images with few principal components. 
The new base vectors <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x51.png" xlink:type="simple"/></inline-formula> are obtained by maximizing the sample variance and minimizing the mean squared error.</p><disp-formula id="scirp.76264-formula24"><label>(9)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x52.png"  xlink:type="simple"/></disp-formula><disp-formula id="scirp.76264-formula25"><label>(10)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x53.png"  xlink:type="simple"/></disp-formula><p>In Equation (9), the collected facial sample in the original coordinate system is represented as x. The collected facial sample reconstructed from the principal components is represented as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x54.png" xlink:type="simple"/></inline-formula>. Equation (10) states that the base vectors are mutually orthogonal. The Lagrange multiplier method is used to find the local minima of the function. 
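The Lagrange multiplier step can be outlined as follows, using notation that is assumed here for illustration and is not necessarily that of the equations in this paper: S denotes the sample covariance matrix, e a base vector of unit length, and the multiplier is written as lambda. Maximizing the variance of the projection under the unit-length constraint leads to an eigenvalue problem:
<disp-formula><tex-math><![CDATA[
\begin{aligned}
\mathcal{L}(e,\lambda) &= e^{T} S e - \lambda\,\bigl(e^{T} e - 1\bigr),\\
\frac{\partial \mathcal{L}}{\partial e} &= 2 S e - 2 \lambda e = 0
\;\Longrightarrow\; S e = \lambda e,\\
e^{T} S e &= \lambda\, e^{T} e = \lambda .
\end{aligned}
]]></tex-math></disp-formula>
Every stationary point is therefore an eigenvector of S, and the variance captured along it equals its eigenvalue, so the most informative base vectors are the eigenvectors with the largest eigenvalues. 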
The solution is shown in Equations (11) and (12).</p><disp-formula id="scirp.76264-formula26"><label>(11)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x55.png"  xlink:type="simple"/></disp-formula><fig-group id="fig11"><label><xref ref-type="fig" rid="fig11">Figure 11</xref></label><caption><title> Gaussian smoothing.</title></caption><fig id ="fig11_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x56.png"/></fig><fig id ="fig11_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x57.png"/></fig><fig id ="fig11_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x58.png"/></fig></fig-group><disp-formula id="scirp.76264-formula27"><label>(12)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/5-3400497x59.png"  xlink:type="simple"/></disp-formula><p><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x60.png" xlink:type="simple"/></inline-formula> is the covariance matrix of the sample vectors, from which the common features are removed by subtracting the average vector of the data. <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x61.png" xlink:type="simple"/></inline-formula> is the average vector. 
<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x62.png" xlink:type="simple"/></inline-formula> are the eigenvalues of the covariance matrix. N indicates the dimension of each sample vector. P is the number of collected samples. The best base vector <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x63.png" xlink:type="simple"/></inline-formula> is the eigenvector of the covariance matrix having the largest eigenvalue <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/5-3400497x64.png" xlink:type="simple"/></inline-formula>. The flowchart of the PCA algorithm for facial recognition is shown in <xref ref-type="fig" rid="fig12">Figure 12</xref>. A value of D = 100 is selected as the number of principal components to represent the collected samples. 
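The steps described here (subtracting the average vector, forming the covariance matrix, taking the eigenvectors with the largest eigenvalues and projecting a sample onto them) can be sketched as follows. The sketch is an illustration on small data only: it finds the leading eigenvectors by power iteration with deflation, which is one simple technique assumed here and not necessarily the solver used in this paper, and all function names are hypothetical. For 4900-dimensional vectors a linear algebra library would be used instead.
<preformat preformat-type="code"><![CDATA[
```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mean_vector(samples: List[Vector]) -> Vector:
    """Average of the sample vectors, component by component."""
    p = len(samples)
    return [sum(col) / p for col in zip(*samples)]


def covariance_matrix(samples: List[Vector], mean: Vector) -> Matrix:
    """Covariance of the mean-subtracted samples (divided by the sample count)."""
    n, p = len(mean), len(samples)
    cov = [[0.0] * n for _ in range(n)]
    for s in samples:
        d = [a - b for a, b in zip(s, mean)]
        for i in range(n):
            for j in range(n):
                cov[i][j] += d[i] * d[j] / p
    return cov


def top_components(cov: Matrix, count: int, iterations: int = 500) -> List[Vector]:
    """Leading eigenvectors of a symmetric matrix by power iteration with deflation."""
    n = len(cov)
    work = [row[:] for row in cov]
    components = []
    for k in range(count):
        v = [1.0 / (i + 1 + k) for i in range(n)]  # arbitrary non-zero start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left in the deflated matrix
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * sum(work[i][j] * v[j] for j in range(n))
                         for i in range(n))
        components.append(v)
        # Remove the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return components


def project(sample: Vector, mean: Vector, components: List[Vector]) -> Vector:
    """Coordinates of a sample in the principal component basis."""
    d = [a - b for a, b in zip(sample, mean)]
    return [sum(e_i * d_i for e_i, d_i in zip(e, d)) for e in components]
```
]]></preformat>
Calling <monospace>project</monospace> with D components reduces each facial image to a D-dimensional vector. 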
A new face can therefore be described with only 100 dimensions, since the 100 principal components of the new coordinate system capture most of its features. The projections of each collected facial image onto the 100 principal components form a 100-dimensional column vector that represents the training sample. If the difference between the reconstructed face and the new face exceeds the threshold T = 0.4, the new face is considered not to have been recorded and is displayed as an “unknown face”. Otherwise, the new face is identified as the sample face with the closest match.</p><fig id="fig12"  position="float"><label>Figure 12</label><caption><title>Flowchart of facial recognition with the PCA algorithm.</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x65.png"/></fig></sec></sec><sec id="s5"><title>5. Results</title><p>The sample images used to train the face detector come from the MIT CBCL Face Database [<xref ref-type="bibr" rid="scirp.76264-ref20">20</xref>]. It includes 2492 faces with different identities, skin colors, and head poses, as well as 4548 non-face images. The eye samples used to train the eye detector were extracted from the detected facial images. The algorithms were run on a single thread of a computer with a 2.50 GHz Intel Core i7-3537U CPU, at VGA resolution. The processing time is 11.4 ms per detected face and 15.3 ms per pair of eyes detected within a facial region. With the help of the cascade classifier, the system eliminates most non-facial features with little computational work. 
The resulting system is about 2.5 times faster than the Joint Cascade detector [<xref ref-type="bibr" rid="scirp.76264-ref21">21</xref>], which takes 28.6 ms for face detection on a 2.93 GHz CPU at the same resolution, and about 3000 times faster than the detector of Zhu et al. [<xref ref-type="bibr" rid="scirp.76264-ref1">1</xref>], which takes 33.8 s per face, also at VGA resolution. To test the face detection accuracy, 2836 faces and 3121 non-faces were randomly selected from the MIT CBCL Face Database [<xref ref-type="bibr" rid="scirp.76264-ref20">20</xref>] and the NIST Mugshot Identification Database [<xref ref-type="bibr" rid="scirp.76264-ref22">22</xref>] for cross-validation. By combining face normalization and eye detection, the algorithm achieves a detection accuracy of 98.8%, compared with 73.68% for Color Based Segmentation [<xref ref-type="bibr" rid="scirp.76264-ref4">4</xref>] and 97.14% for Head Hunter [<xref ref-type="bibr" rid="scirp.76264-ref23">23</xref>]. Note that these two methods were tested on different databases, although ones with similar properties. <xref ref-type="table" rid="table1">Table 1</xref> shows the test outcome for face detection: a sensitivity of 99.2%, a specificity of 98.4%, and an overall accuracy of 98.8%. The Facial Recognition Technology Database [<xref ref-type="bibr" rid="scirp.76264-ref24">24</xref>], which contains 3682 face samples of 526 subjects under various viewing conditions, is used to train the facial recognition algorithm and validate its results, yielding a positive recognition rate of 99.2%.</p><p><xref ref-type="fig" rid="fig13">Figure 13</xref> shows that faces can be recognized in different real-world conditions, such as when picking up a cell phone or with occlusions on the hair. 
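</p><p>The rates quoted for Table 1 follow directly from its counts. As a brief worked check, assume that the totals in the row labels (2836 face images and 2400 non-face images) are the denominators, and let TP = 2813 and TN = 2361 (labels introduced here) denote the correctly classified face and non-face images:</p><disp-formula><tex-math><![CDATA[
\begin{aligned}
\text{sensitivity} &= \frac{TP}{2836} = \frac{2813}{2836} \approx 99.2\% \\
\text{specificity} &= \frac{TN}{2400} = \frac{2361}{2400} \approx 98.4\% \\
\text{accuracy} &= \frac{TP + TN}{2836 + 2400} = \frac{5174}{5236} \approx 98.8\%
\end{aligned}
]]></tex-math></disp-formula><p>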
<xref ref-type="fig" rid="fig14">Figure 14</xref> shows that faces can be recognized against various backgrounds. <xref ref-type="fig" rid="fig15">Figure 15</xref> shows that multiple faces can be recognized in real time.</p><table-wrap id="table1" ><label>Table 1</label><caption><title>Test outcome.</title></caption><table><thead><tr><th align="center" valign="middle"  rowspan="2"  >Input</th><th align="center" valign="middle"  colspan="2"  >Outcome</th></tr><tr><th align="center" valign="middle" >Face detected</th><th align="center" valign="middle" >No face detected</th></tr></thead><tbody><tr><td align="center" valign="middle" >Face images (2836)</td><td align="center" valign="middle" >2813</td><td align="center" valign="middle" >23</td></tr><tr><td align="center" valign="middle" >Non-face images (2400)</td><td align="center" valign="middle" >39</td><td align="center" valign="middle" >2361</td></tr></tbody></table></table-wrap><fig-group id="fig13"><label>Figure 13</label><caption><title>Real-time facial recognition in various conditions.</title></caption><fig id ="fig13_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x67.png"/></fig><fig id ="fig13_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x66.png"/></fig></fig-group><fig-group id="fig14"><label>Figure 14</label><caption><title>Real-time facial recognition in complex backgrounds.</title></caption><fig id ="fig14_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x69.png"/></fig><fig id ="fig14_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x68.png"/></fig><fig id ="fig14_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x71.png"/></fig><fig id ="fig14_4"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x70.png"/></fig></fig-group><fig-group id="fig15"><label>Figure 15</label><caption><title>Real-time multi-person facial recognition.</title></caption><fig id ="fig15_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x73.png"/></fig><fig id ="fig15_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/5-3400497x72.png"/></fig></fig-group></sec><sec id="s6"><title>6. Conclusions</title><p>Our algorithms detect and recognize faces with high accuracy in real time, with a faster detection speed than the other detection methods compared. Eye detection is used to increase the face detection accuracy. Facial recognition performance is also greatly improved by facial component alignment, contrast enhancement, and image smoothing. Images of faces are collected as training samples in real time and recognized under various conditions, including among other faces.</p><p>Future work involves training new classifiers that extend facial recognition to a wider range of facial orientations. The head rotation can be estimated so that the algorithm can further correct the facial image and maintain accurate recognition.</p></sec><sec id="s7"><title>Cite this paper</title><p>Zhang, X., Gonnot, T. and Saniie, J. (2017) Real-Time Face Detection and Recognition in Complex Background. Journal of Signal and Information Processing, 8, 99-112. 
https://doi.org/10.4236/jsip.2017.82007</p></sec></body><back><ref-list><title>References</title><ref id="scirp.76264-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Zhu, X. and Ramanan, D. (2012) Face Detection, Pose Estimation and Landmark Localization in the Wild. IEEE Conference on Computer Vision and Pattern Recognition, Providence, 16-21 June 2012, 2879-2886.</mixed-citation></ref><ref id="scirp.76264-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Tsitsoulis, A. and Bourbakis, N.G. (2015) A Methodology for Extracting Standing Human Bodies From Single Images. IEEE Transactions on Human-Machine Systems, 45, 327-338. https://doi.org/10.1109/THMS.2015.2398582</mixed-citation></ref><ref id="scirp.76264-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Yanhun, Z. and Chongqing, L. (2003) Face Recognition Based on Support Vector Machine and Nearest Neighbor Classifier. Journal of Systems Engineering and Electronics, 14, 73-76.</mixed-citation></ref><ref id="scirp.76264-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Tayal, Y., Lamba, R. and Padhee, S. (2012) Automatic Face Detection Using Color Based Segmentation. International Journal of Scientific and Research Publications, 2, 1-7.</mixed-citation></ref><ref id="scirp.76264-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Tang, J., Deng, C., Huang, G.B. and Zhao, B. (2015) Compressed-Domain Ship Detection on Spaceborne Optical Image Using Deep Neural Network and Extreme Learning Machine. IEEE Transactions on Geoscience and Remote Sensing, 53, 1174-1185. https://doi.org/10.1109/TGRS.2014.2335751</mixed-citation></ref><ref id="scirp.76264-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Su, C.Y. and Yang, J.F. (2014) Histogram of Gradient Phases: A New Local Descriptor for Face Recognition. 
IET Computer Vision, 8, 556-567. 
https://doi.org/10.1049/iet-cvi.2013.0208</mixed-citation></ref><ref id="scirp.76264-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Pavithra, R., Usha Ruby, A. and Chellin Chandran, J.G. (2014) Scale Invariant Feature Transform Based Face Recognition from a Single Sample per Person. International Journal of Computational Engineering Research, 4, 41-47.</mixed-citation></ref><ref id="scirp.76264-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Ahonen, T., Hadid, A. and Pietikainen, M. (2006) Face Description with Local Binary Patterns: Application to Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 2037-2041.  
https://doi.org/10.1109/TPAMI.2006.244</mixed-citation></ref><ref id="scirp.76264-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Lienhart, R. and Maydt, J. (2002) An Extended Set of Haar-Like Features for Rapid Object Detection. 2002 International Conference on Image Processing, Vol. 1, Rochester, 22-25 September 2002, I-900-I-903.  
https://doi.org/10.1109/icip.2002.1038171</mixed-citation></ref><ref id="scirp.76264-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Georgescu, D. (2011) A Real-Time Face Recognition System Using Eigenfaces. Journal of Mobile, Embedded and Distributed Systems, 3, 193-204.</mixed-citation></ref><ref id="scirp.76264-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Li, Z., Lin, D. and Tang, X. (2009) Nonparametric Discriminant Analysis for Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 755-761. https://doi.org/10.1109/TPAMI.2008.174</mixed-citation></ref><ref id="scirp.76264-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Ding, C., Xu, C. and Tao, D. (2015) Multi-Task Pose-Invariant Face Recognition. IEEE Transactions on Image Processing, 24, 980-993.  
https://doi.org/10.1109/TIP.2015.2390959</mixed-citation></ref><ref id="scirp.76264-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Shen, C., Paisitkriangkrai, S. and Zhang, J. (2011) Efficiently Learning a Detection Cascade with Sparse Eigen-Vectors. IEEE Transactions on Image Processing, 20, 22-35. https://doi.org/10.1109/TIP.2010.2055880</mixed-citation></ref><ref id="scirp.76264-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Maturana, D., Mery, D. and Soto, A. (2009) Face Recognition with Local Binary Patterns, Spatial Pyramid Histograms and Naive Bayes nearest Neighbor Classification. 2009 International Conference of the Chilean Computer Science Society, Santiago, 10-12 November 2009, 125-132. https://doi.org/10.1109/SCCC.2009.21</mixed-citation></ref><ref id="scirp.76264-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Mehmood, K. and Ahmad, B. (2013) Implementation of Face Detection System Using Adaptive Boosting Algorithm. International Journal of Computer Applications, 76, 51-57. https://doi.org/10.5120/13223-0639</mixed-citation></ref><ref id="scirp.76264-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Noh, S. (2012) χ2 Metric Learning for nearest Neighbor Classification and Its Analysis. 2012 21st International Conference on Pattern Recognition, Tsukuba, 11-15 November 2012, 991-995.</mixed-citation></ref><ref id="scirp.76264-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Pei, S.C. and Hsiao, Y.Z. (2015) Spatial Affine Transformations of Images by Using Fractional Shift Fourier Transform. 2015 IEEE International Symposium on Circuits and Systems, Lisbon, 24-27 May 2015, 1586-1589.  
https://doi.org/10.1109/ISCAS.2015.7168951</mixed-citation></ref><ref id="scirp.76264-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Peddigari, V.R., Srinivasa, P. and Kumar, R. (2015) Enhanced ICA Based Face Recognition Using Histogram Equalization and Mirror Image Superposition. 2015 IEEE International Conference on Consumer Electronics, Las Vegas, 9-12 January 2015, 625-628. https://doi.org/10.1109/ICCE.2015.7066555</mixed-citation></ref><ref id="scirp.76264-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Reisert, M. and Burkhardt, H. (2008) Complex Derivative Filters. IEEE Transactions on Image Processing, 17, 2265-2274. https://doi.org/10.1109/TIP.2008.2006601</mixed-citation></ref><ref id="scirp.76264-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Center for Biological and Computational Learning at MIT. MIT CBCL Face Database. 
http://cbcl.mit.edu/projects/cbcl/software-datasets/FaceData1Readme.html</mixed-citation></ref><ref id="scirp.76264-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Chen, D., Ren, S., Wei, Y., Cao, X. and Sun, J. (2014) Joint Cascade Face Detection and Alignment. Computer Vision - ECCV 2014, Zurich, 6-12 September 2014, 109-122.</mixed-citation></ref><ref id="scirp.76264-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">National Institute of Standards and Technology. NIST Mugshot Identification Database. http://www.nist.gov/srd/nistsd18.cfm</mixed-citation></ref><ref id="scirp.76264-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Mathias, M., Benenson, R., Pedersoli, M. and Van Gool, L. (2014) Face Detection without Bells and Whistles. Computer Vision - ECCV 2014, Zurich, 6-12 September 2014, Vol. 8692, 720-735.</mixed-citation></ref><ref id="scirp.76264-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">FERET Program. The Facial Recognition Technology Database. 
http://www.itl.nist.gov/iad/humanid/feret/</mixed-citation></ref></ref-list></back></article>