A High Performance Dual Stage Face Detection Algorithm Implementation using FPGA Chip and DSP Processor
Subject Areas: Signal Processing
M V Ganeswara Rao 1 *, P Ravi Kumar 2, T Balaji 3
1 - Department of Electronics and Communication Engineering, Shri Vishnu Engineering College for Women, Bhimavaram, AP, India.
2 - Department of Electronics and Communication Engineering, Shri Vishnu Engineering College for Women, Bhimavaram, AP, India.
3 - Department of Electronics and Communication Engineering, PVP Siddhartha Institute of Technology, Vijayawada, AP, India.
Keywords: Face detection, Heterogeneous System, FPGA, DSP
Abstract :
This paper presents a dual-stage system architecture for face detection based on skin tone detection and the Viola and Jones face detection structure. The proposed architecture is able to locate human faces in an image with high accuracy within a time constraint. A non-linear transformation technique is introduced in the first stage to reduce false alarms in the second stage. Moreover, a pipelining technique is used in the second stage to improve the overall throughput of the system. The proposed system design is based on Xilinx's Virtex FPGA chip and a Texas Instruments DSP processor. The dual-port BRAM in the FPGA chip and the EMIF (External Memory Interface) of the DSP processor are used as the interface between the FPGA and the DSP processor. The proposed system exploits the advantages of both computational elements (FPGA and DSP) and system-level pipelining to achieve real-time performance. The present implementation focuses on highly accurate, high-speed face detection and is evaluated using the standard BAO image database, which includes images with different poses, orientations, occlusions and illumination. The proposed system attained a frame rate of 16.53 FPS for an input image spatial resolution of 640X480, which is 23.4 times faster than the MATLAB implementation, 12.14 times faster than the DSP implementation and 2.1 times faster than the FPGA implementation.
[1] Y. Lei, Z. Gang, R. Si-Heon, Lee Choon-Young, Lee Sang-Ryong and K. -M. Bae, "The Platform of Image Acquisition and Processing System Based on DSP and FPGA," 2008 International Conference on Smart Manufacturing Application, 2008, pp. 470-473, doi: 10.1109/ICSMA.2008.4505567.
[2] C. Kotropoulos and I. Pitas, "Rule-based face detection in frontal views," 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997, pp. 2537-2540 vol.4, doi: 10.1109/ICASSP.1997.595305.
[3] D. Nguyen, D. Halupka, P. Aarabi and A. Sheikholeslami, "Real-time face detection and lip feature extraction using field-programmable gate arrays," in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 36, no. 4, pp. 902-912, Aug. 2006, doi: 10.1109/TSMCB.2005.862728.
[4] D. N. Arya, K. L. V. Sivanji, R. Reddy, S. Sivanantham, and K. Sivasankaran, “A face detection system implemented on FPGA based on RCT colour segmentation,” Proc. 2016 Online Int. Conf. Green Eng. Technol. IC-GET 2016, 2017, doi: 10.1109/GET.2016.7916781.
[5] H. Ben Fekih, A. Elhossini, and B. Juurlink, “An Efficient and Flexible FPGA Implementation of a Face Detection System,” pp. 243–254, 2015, doi: 10.1007/978-3-319-16214-0.
[6] H.-Y. Leung, L.-M. Cheng, and X. Y. Li, “A FPGA implementation of facial feature extraction,” J. Real-Time Image Process., vol. 10, no. 1, pp. 135–149, 2015, doi: 10.1007/s11554-012-0263-8.
[7] A. S. Kamewar, "Processing geospatial images using GPU," 2017 International Conference on Emerging Trends & Innovation in ICT (ICEI), 2017, pp. 27-32, doi: 10.1109/ETIICT.2017.7977005.
[8] J. Batlle, “A New FPGA/DSP-Based Parallel Architecture for Real-Time Image Processing,” Real-Time Imaging, vol. 8, no. 5, pp. 345–356, 2002, doi: 10.1006/rtim.2001.0273.
[9] K. L. Y. Li et al., “A new parallel particle filter face tracking method based on heterogeneous system,” J. Real-Time Image Process., vol. 7, no. 3, pp. 153–163, 2012, doi: 10.1007/s11554-011-0225-6.
[10] L. Guo, “An embedded multimedia communication terminal based on DSP+FPGA,” Multimed. Tools Appl., vol. 76, no. 16, pp. 16949–16961, 2017, doi: 10.1007/s11042-016-3597-6.
[11] Z. Ding, F. Zhao, T. Wang, W. Shu, and M.-Y. Wu, “Hecto-Scale Frame Rate Face Detection System for SVGA Source on FPGA Board,” 2011 IEEE 19th Annu. Int. Symp. Field-Programmable Cust. Comput. Mach., pp. 37–40, 2011, doi: 10.1109/FCCM.2011.16.
[12] Rein-Lien Hsu, M. Abdel-Mottaleb and A. K. Jain, "Face detection in color images," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 696-706, May 2002, doi: 10.1109/34.1000242.
[13] P. Viola and M. Jones, “Robust real-time face detection,” Int. J. Comput. Vis., vol. 57, no. 2, pp. 137–154, 2004, doi: 10.1023/B:VISI.0000013087.49260.fb.
[14] F. Zhao, L. Yang, Y. Zhu, and P. Liao, “Enhancing the implementation of Adaboost algorithm on a DSP-based platform,” Int. Conf. Scalable Comput. Commun. - 8th Int. Conf. Embed. Comput. ScalCom-EmbeddedCom 2009, pp. 393–395, 2009, doi: 10.1109/EmbeddedCom-ScalCom.2009.77.
[15] Ganeswara Rao M.V., Panakala R.K., Mallikarjuna Prasad A. (2018) A New VLSI Architecture for Skin Tone Detection in an Uncontrolled Background. In: Anguera J., Satapathy S., Bhateja V., Sunitha K. (eds) Microelectronics, Electromagnetics and Telecommunications. Lecture Notes in Electrical Engineering, vol 471. Springer, Singapore.
[16] Fekih H.B., Elhossini A., Juurlink B. (2015) An Efficient and Flexible FPGA Implementation of a Face Detection System. In: Sano K., Soudris D., Hübner M., Diniz P. (eds) Applied Reconfigurable Computing. ARC 2015. Lecture Notes in Computer Science, vol 9040. Springer, Cham.
[17] Dong Zhang, S. Z. Li and D. Gatica-Perez, "Real-time face detection using boosting in hierarchical feature spaces," Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., Cambridge, 2004, pp. 411-414 Vol.2.
[18] Y. N. Chae, T. Han, Y.-H. Seo, and H. S. Yang, “An efficient face detection based on color-filtering and its application to smart devices,” Multimed. Tools Appl., vol. 75, no. 9, pp. 4867–4886, 2016, doi: 10.1007/s11042-013-1786-0.
[19] C. Kumar and M. S. Azam, “A multi-processing architecture for accelerating Haar-based face detection on FPGA,” 9th Int. Conf. Ind. Inf. Syst. ICIIS 2014, 2015, doi: 10.1109/ICIINFS.2014.7036525.
[20] S. Liao, A. K. Jain, and S. Z. Li, “A Fast and Accurate Unconstrained Face Detector,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 2, pp. 211–223, 2016, doi: 10.1109/TPAMI.2015.2448075.
[21] A. N. Rajagopalan, K. S. Kumar, J. Karlekar, R. M. M. M. Patil, U. B. Desai, and P. G. P. S. Chaudhuri, “Finding Faces in Photographs,” IEEE Int. Conf. Comput. Vis., no. 1, pp. 640–645, 1998, doi: 10.1109/ICCV.1998.710785.
[22] M. S. Lew, "Information theoretic view-based and modular face detection," Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, Killington, VT, USA, 1996, pp. 198-203.
[23] A. J. Colmenarez and T. S. Huang, “Face Detection With Information-Based Maximum Discrimination,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 782–787, 1997, doi: 10.1109/CVPR.1997.609415.
[24] K. S. Park, R. H. Park, and Y. G. Kim, “Face detection using the 3x3 block rank patterns of gradient magnitude images and a geometrical face model,” Dig. Tech. Pap. - IEEE Int. Conf. Consum. Electron., no. c, pp. 793–794, 2011, doi: 10.1109/ICCE.2011.5722867.
[25] P. P. Paul and M. Gavrilova, “PCA based geometric modeling for automatic face detection,” Proc. - 2011 Int. Conf. Comput. Sci. Its Appl. ICCSA 2011, pp. 33–38, 2011.
[26] A. Majumder, L. Behera, and V. K. Subramanian, “Automatic and robust detection of facial features in frontal face images,” Proc. - 2011 UKSim 13th Int. Conf. Model. Simulation, UKSim 2011, pp. 331–336, 2011, doi: 10.1109/UKSIM.2011.69.
[27] J. Guo, C. Lin, M. Wu, C. Chang and H. Lee, "Complexity Reduced Face Detection Using Probability-Based Face Mask Prefiltering and Pixel-Based Hierarchical-Feature Adaboosting," in IEEE Signal Processing Letters, vol. 18, no. 8, pp. 447-450, Aug. 2011.
[28] Katkoori Arun Kumar and Ravi Boda, “A Threshold-based Brain Tumour Segmentation from MR Images using Multi-Objective Particle Swarm Optimization,” Journal of Information Systems and Telecommunication, Vol. 9, No. 4, 2021, pp. 218–225.
[29] Hamed Agahi and Kimia Rezaei, “An Automatic Thresholding Approach to Gravitation-Based Edge Detection in Grey-Scale Images,” Journal of Information Systems and Telecommunication, Vol. 9, No. 4, 2021, pp. 285–296.
[30] K. Li, Y. Tian, B. Wang, Z. Qi, and Q. Wang, "Bi-Directional Pyramid Network for Edge Detection," Electronics, vol. 10, no. 3, 2021, p. 329-333.
[31] D. Wang, J. Yin, C. Tang, X. Cheng, and B. Ge, "Color edge detection using the normalization anisotropic Gaussian kernel and multichannel fusion," IEEE Access, vol. 8, 2020, pp. 228277-228288.
[32] Azamossadat Nourbakhsh, Mohammad-Shahram Moin and Arash Sharifi, “Facial Images Quality Assessment based on ISO/ICAO Standard Compliance Estimation by HMAX Model,” Journal of Information Systems and Telecommunication, Vol. 7, No. 27, 2009, pp. 225–237.
[33] Azar Mahmoodzadeh, “Human Activity Recognition based on Deep Belief Network Classifier and Combination of Local and Global Features,” Journal of Information Systems and Telecommunication, Vol. 9, No. 36, 2021, pp. 45–52.
http://jist.acecr.org ISSN 2322-1437 / EISSN: 2345-2773
Journal of Information Systems and Telecommunication
Received: 12 Oct 2021 / Revised: 01 Jan 2022 / Accepted: 01 Feb 2022
1- Introduction
Image processing algorithms with real-time performance and high accuracy are widely used in diverse fields such as surveillance, surface quality inspection, robotic vision and assistive technology [1]. Human face detection is one of the popular research areas in image processing. Face detection algorithms are used in many applications, such as facial recognition in security systems and human-computer interaction (HCI). In the past, many researchers proposed face detection algorithms that are computationally simple but not efficient. More recently, researchers have proposed highly efficient methods that demand high computational power. Developing a computational system that meets this demand is a challenging task.
Recent years have seen revolutionary advances in computational platforms with the emergence of high-performance Digital Signal Processors (DSPs), Graphics Processing Units (GPUs), Application Specific Instruction set Processors (ASIPs) and Field Programmable Gate Array (FPGA) chips. Each computing element has its own advantages, which make it suitable for particular applications. Many researchers have developed hardware platforms based on a single computing element to locate human faces in images.
Yang, et al. [2] proposed a face detection system based on a DSP processor. Patrick, et al. [1] developed DSP-based hardware to compress video frames for wireless transmission. Nguyen, et al. [3] implemented an optimized algorithm for video segmentation on a DSP platform. Arya, et al. [4] proposed a face detection system (using the RCT colour model) based on a Quartus II FPGA. Fekih, et al. [5] presented a new hardware architecture for face detection based on the Zynq-7000 SoC, which uses an ARM CPU and an FPGA as computational elements. Leung, et al. [6] proposed an FPGA platform to extract facial features from images for facial recognition; their results showed that the detection rate and performance are significantly higher than those of a desktop implementation. Karnewar, et al. [7] proposed a GPU-based image processing platform to process geospatial images (a CUDA application).
State-of-the-art face detection algorithms demand very high computing performance, which is hard to achieve with a single, homogeneous computing element. One recent approach to meeting this demand is to use heterogeneous systems, formed by interconnecting a number of heterogeneous computing elements into a large computational platform. Such platforms are widely used in performance-intensive applications such as image processing and medical instrumentation. A simple heterogeneous system architecture is presented in Fig. 1.
Fig. 1 Simple Heterogeneous structure (example)
In this paper, a new custom heterogeneous face detection platform based on an FPGA and a DSP is proposed. The advantage of this system lies in its two-stage system-level pipelining, which is used to attain practical performance. The paper is organized as follows: Section 2 introduces various heterogeneous platforms proposed by other researchers. Section 3 describes the architecture of the proposed heterogeneous platform. Section 4 presents an overview of the face detection algorithm. Section 5 deals with the implementation of the face detection algorithm on the FPGA and the DSP processor. Section 6 provides experimental results and a comparison with previous work. Section 7 presents concluding remarks.
Fig. 2 Block diagram of the proposed face detection system
2- Allied Work
Considerable research has been carried out over the past decade and a half on the design and development of heterogeneous systems for real-time image processing. Batlle, et al. [8] proposed a high-performance image processing system architecture based on an FPGA chip and DSP processors. This system consists of an array of DSP processors and an FPGA, which is used to interconnect the processors.
Liu, et al. [9] proposed a heterogeneous platform based on multi-core GPPs and one GPU (Graphics Processing Unit) to track human faces in images. Face tracking cannot be stable when it relies on a single type of face information, because of occlusion and illumination problems. Three dissimilar types of information (wavelet features, colour histogram and edge orientation histogram) are combined to significantly improve face tracking performance [28][29].
Guo, et al. [10] implemented a video image correlation algorithm, along with multimedia processing algorithms, on a platform based on a multimedia DSP and an FPGA chip. Video and audio acquisition are implemented on the DSP, while the FPGA chip is responsible for the VGA display, control logic and so on.
Wei, et al. [11] designed and developed an image processing platform that relies on three DSPs and one FPGA to attain real-time performance. The EMIF of the DSPs is used to interface the DSPs with the FPGA to build the heterogeneous platform. In this architecture, the DSPs perform the core multimedia processing and the FPGA controls data flow in the system. The system exhibits high processing power at the cost of complexity.
3- System Architecture and Overview
This section deals with the architecture of the proposed system, the External Memory Interface (EMIF) of the DSP processor and the interface design.
3-1- Hardware Architecture
The proposed system architecture for face detection, implemented using a Xilinx Virtex 4VSX35 FPGA and a Texas Instruments TMS320DM642 DSP, is shown in Fig. 2. In the first stage, the Xilinx Virtex 4VSX35 FPGA is used to implement the hardware architecture of the skin tone detector and two BRAMs that store the image data before and after skin tone detection. The Virtex 4VSX35 FPGA provides a large Block RAM resource (3,456 Kb), which is adequate to implement these BRAMs. This FPGA also offers 18 x 18 two's complement signed multipliers and automatic programmable FIFO logic, along with other advanced features [Xilinx 4VSX35 Data Sheet, 2010, p.3].
In the second stage, the Texas Instruments (TI) TMS320DM642 is used as the computing element to implement the Viola and Jones face detection algorithm and to set up the other modules. This processor is a fixed-point, high-performance digital media processor that delivers up to 5760 MIPS at a clock frequency of 720 MHz. The TMS320DM642 provides an EMIF (External Memory Interface), which allows external memory controllers to connect to the processor [TMS320DM642 Technical Overview, 2017, p. 30][14]. The EMIF of the TMS320DM642 is used to provide the interface between the FPGA platform and the DSP platform. The MTYPE (Memory Type) field of the CE3 space control register of the TMS320DM642 is configured for an asynchronous RAM interface with 32-bit image data.
3-2- EMIF Overview
The External Memory Interface (EMIF) of the TMS320DM642 supports interfaces to many external devices, such as asynchronous devices (for example, SRAM and FIFOs) and synchronous DRAM (SDRAM). EMIFA supports a data bus width of 64 or 32 bits, and EMIFB supports 16 bits. The signals of EMIFA are presented in Fig. 3.
Fig. 3 Signal description of EMIFA
To configure the EMIF as required, certain registers must be set to the appropriate values. The EMIF configuration registers are mapped into the TI TMS320DM642 memory address space. The base addresses of EMIFA and EMIFB are 0x0180_0000 and 0x01A8_0000 respectively. The four CE space control registers (CECTL) correspond to the four memory spaces of the EMIF. MTYPE is the key field in the CECTL register; it defines the type of memory interfaced to the corresponding memory space.
3-3- Interface Design
There are two ways to interface the FPGA to the DSP: one is to use a dual-port BRAM, and the other is to use a FIFO. In this design, a dual-port synchronous BRAM is used to interface with the ports of the EMIF.
This interface utilizes the Virtex-4 IOBs, which are configured as simple input and tri-state buffers. This memory-based interface between the FPGA and the DSP significantly reduces the interface logic, which helps obtain maximum image data throughput. The EMIF of the TMS320DM642 uses the memory configured in the FPGA as a memory system with a 32-bit data word length and 31kb memory depth for an image size of 640X480. The Virtex-4 device offers a large number of Block RAMs, each 18Kb in size, and these blocks can be interconnected to build wider and deeper memory systems. Each 18Kb BRAM is a dual-port RAM whose two ports, A and B, are completely independent. Data can be written and read on both ports simultaneously, and each port has its own data lines, address lines and control lines (see Fig. 4).
This interface uses the BRAM as memory and the FPGA I/O pins as the physical connection between the EMIF of the DSP and the FPGA. To implement the 32kb X 32 bit dual-port BRAM (see Fig. 4), a set of 8 BRAM blocks is configured as a 32-bit wide and 32KB deep true dual-port RAM. Port A is used as the access port for the EMIF of the DSP, and Port B is configured as the access port for the FPGA.
Fig. 4 18Kb BRAM in Virtex-4 device
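The hand-off through the dual-port memory can be illustrated with a small behavioural model. This is only a sketch in Python, not the HDL used in this work: the class name `DualPortRam`, its methods and the depth used in the example are hypothetical, and timing, clock domains and write-collision behaviour are deliberately ignored.

```python
class DualPortRam:
    """Behavioural model of a true dual-port RAM (hypothetical helper).

    Both ports see the same storage. Port "A" stands for the DSP/EMIF side
    and port "B" for the FPGA side, as in the interface described above.
    """

    PORTS = ("A", "B")

    def __init__(self, depth, width_bits=32):
        self.depth = depth
        self.mask = (1 << width_bits) - 1  # keep only width_bits per word
        self.mem = [0] * depth

    def _check(self, port, addr):
        if port not in self.PORTS:
            raise ValueError("unknown port: %r" % (port,))
        if not 0 <= addr < self.depth:
            raise IndexError("address out of range: %d" % addr)

    def write(self, port, addr, value):
        self._check(port, addr)
        self.mem[addr] = value & self.mask

    def read(self, port, addr):
        self._check(port, addr)
        return self.mem[addr]


# FPGA side stores a processed 32-bit word through port B ...
ram = DualPortRam(depth=16)
ram.write("B", 3, 0x00FF00FF)
# ... and the DSP side later fetches it through port A.
word = ram.read("A", 3)
```

Because both ports address the same array, no copy step is needed between the two sides, which is the property the memory-based interface relies on.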
4- Face Detection Theory & Algorithm
A hybrid face detection algorithm based on skin tone detection and the Viola and Jones face detection structure is implemented on the heterogeneous platform discussed in the previous section.
Face Detection Overview
Human face recognition now plays a critical role in the automation of various processes. The fundamental step in a human face recognition algorithm is to detect whether or not an image contains a face. If a face is detected, the region of the face is estimated (see Fig. 5). The difficulty of face detection is largely related to pose, the presence of structural components, facial expressions, image orientation, occlusion and so on.
Fig. 5 Face Detection System
Still-image face detection algorithms are broadly divided into four distinct categories.
a. Knowledge-based methods: human knowledge about what a typical face consists of is used to define rules that capture the relations between facial features (Yang, et al. 1994).
b. Feature invariant methods: structural features such as colour, shape, texture and other local features, which remain invariant under changes in illumination, pose and viewpoint, are used to detect faces in images (Leung, et al. 1995; Dai and Nakano, 1996; McKenna, et al. 1998; Kjeldsen and Kender, 1996).
c. Template matching methods: unlike the other methods, several standard patterns of the human face are collected and stored in a template database. These templates are correlated with the input image to detect faces (Craw, et al. 1992; Lanitis, et al. 1995).
d. Appearance-based methods: these algorithms rely on a large image database comprising a wide range of human faces with numerous variations. Support Vector Machines (SVM) (Osuna and Girosi, 1997) and Neural Networks (Rowley, et al. 1998) are the most commonly used techniques in this category.
4-1- Face detection Algorithm
The proposed two-stage face detection algorithm, based on skin tone detection and the Viola and Jones face detection structure, is shown in Fig. 6. In the first stage, the input image is converted from the RGB colour model into the YCrCb model and skin patches are segmented. In the second stage, facial features are extracted from the skin-segmented image and faces are detected using the Viola and Jones face detection algorithm.
Fig. 6 Face Detection Algorithms
4-1-1 RGB to YCrCb Colour Model Conversion
A colour model is a mathematical representation of colour in terms of three or four components. Different colour models are used depending on the application, such as processing of digital image data, display, transmission and TV broadcasting. Several colour models have been proposed; some of the most popular are RGB, HSI, HSV, HSL, YIQ, YCrCb and YUV.
The RGB (Red Green Blue) colour model is the most commonly used model for representing digital images. In this model, any colour is represented by the proportions of the three primary colours red, green and blue. Skin tone detection based on the RGB colour model is not preferred because of the high correlation between the chrominance and illumination values (Jones and Rehg, 2002) [12]. The normalized RGB values can be obtained from eq. 1-3.
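$$r = \frac{R}{R + G + B} \quad (1)$$
$$g = \frac{G}{R + G + B} \quad (2)$$
$$b = \frac{B}{R + G + B} \quad (3)$$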
In the YCrCb colour model, the Y component represents the luminance information, and Cr and Cb represent the chrominance information of the image pixels. This colour model is widely used because the luminance and chrominance components are largely independent. YCrCb values can be obtained from the RGB model according to eq. 4.
(4)
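As an illustration of this conversion, the following Python sketch uses the full-range ITU-R BT.601 coefficients as adopted by JPEG/JFIF. This choice of coefficients is an assumption made for the example; the source does not state which variant of the conversion the hardware uses, and the function name `rgb_to_ycrcb` is hypothetical.

```python
def _clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb).

    Assumption: full-range BT.601 (JPEG/JFIF) coefficients with the chroma
    components offset by 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return _clamp8(y), _clamp8(cr), _clamp8(cb)


# Grey-level pixels carry no chroma: Cr and Cb sit at the 128 offset.
white = rgb_to_ycrcb(255, 255, 255)
black = rgb_to_ycrcb(0, 0, 0)
```

A fixed-point hardware version would typically replace the floating-point coefficients with scaled integers and shifts; that step is not shown here.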
4-1-2 Skin Tone Detection
The efficiency of skin tone detection largely depends on the choice of an appropriate colour model and on how skin pixels cluster in that colour model. We selected the YCrCb colour model since it is a perceptually uniform space and is adopted in popular compression standards such as MPEG and JPEG. Many researchers have reported that the Cr and Cb values of a skin pixel are uncorrelated with the Y value of the pixel. In practice, however, skin tone depends nonlinearly on the luma component. Some researchers have demonstrated that detecting skin tone in the CrCb and Cb/Y-Cr/Y subspaces results in many false positives and false negatives respectively. Therefore, Hsu et al. (2002) [12] proposed a nonlinear transformation of YCrCb that makes the skin colour space independent of the luma component (eq. 5-10).
(5)
(6)
(7)
(8)
(9)
(10)
The transformed chroma space [TransCr, TransCb], in which the skin cluster is represented by an elliptical model, is described by eq. 11-12.
(11)
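A minimal Python sketch of an elliptical skin test in the transformed chroma plane is given below. It assumes the usual form of such a model: the chroma pair is shifted to a cluster centre, rotated by an angle, and checked against an ellipse. The parameter values are those commonly quoted for the model of Hsu et al. [12] and are used here as assumptions only; they are not taken from this paper and should be checked against [12]. The names `ELLIPSE` and `is_skin` are hypothetical.

```python
import math

# Assumed example parameters (commonly quoted for the model in [12]):
# cluster centre (cx, cy), rotation theta in radians, ellipse centre
# (ecx, ecy) and semi-axes a, b. Not values reported in this paper.
ELLIPSE = {
    "cx": 109.38, "cy": 152.02, "theta": 2.53,
    "ecx": 1.60, "ecy": 2.41, "a": 25.39, "b": 14.03,
}


def is_skin(trans_cb, trans_cr, p=ELLIPSE):
    """Return True if the transformed chroma pair lies inside the ellipse."""
    c, s = math.cos(p["theta"]), math.sin(p["theta"])
    dx, dy = trans_cb - p["cx"], trans_cr - p["cy"]
    # Rotate the shifted chroma pair into the ellipse's own axes.
    x = c * dx + s * dy
    y = -s * dx + c * dy
    # Inside-or-on test for an axis-aligned ellipse centred at (ecx, ecy).
    value = ((x - p["ecx"]) ** 2) / p["a"] ** 2 + ((y - p["ecy"]) ** 2) / p["b"] ** 2
    return value <= 1.0
```

In hardware, the cosine and sine are constants, so the test reduces to a few multiplications, additions and one comparison per pixel.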
4-1-3 Face Detection using Adaboost
The basic structure proposed by Viola and Jones [13] is used to solve the face detection problem. In this framework, features are used instead of raw pixels, since feature-based operations are much faster than pixel-based operations. Many researchers have reported detailed versions of this algorithm, so only a brief description is given here. The approach is based on Haar-like features. Given a set of features and a training set of face and non-face images, the AdaBoost algorithm selects, in each round, the single Haar-like feature that best separates face and non-face images as a weak classifier; strong classifiers are formed by combining weak classifiers. These strong classifiers are cascaded to detect faces in images. The number of stages required for face detection depends on the accuracy and speed requirements. In the learning stage, after each round, the weights of the images that were misclassified by the preceding weak classifier are increased relative to those that were classified correctly.
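The re-weighting step of the learning stage can be sketched as follows. This is a minimal Python illustration of one AdaBoost round in the form used by Viola and Jones [13], where correctly classified examples are scaled down by beta = err / (1 - err) and the weights are then renormalized. The function name `update_weights` and the toy data are hypothetical, and the training code used in this work is not shown in the source.

```python
def update_weights(weights, misclassified):
    """Apply one AdaBoost re-weighting round.

    weights       : current example weights, assumed to sum to 1
    misclassified : booleans, True where the chosen weak classifier was wrong
    Returns (new_weights, beta).
    """
    err = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    beta = err / (1.0 - err)
    # Correct examples shrink by beta (< 1); misclassified ones are kept.
    scaled = [w if wrong else w * beta for w, wrong in zip(weights, misclassified)]
    total = sum(scaled)
    # After normalization the misclassified examples carry a larger share.
    return [w / total for w in scaled], beta


# Four equally weighted examples, the last one misclassified.
new_weights, beta = update_weights([0.25, 0.25, 0.25, 0.25],
                                   [False, False, False, True])
```

After this round the misclassified example holds half of the total weight, so the next weak classifier is pushed to handle it.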
5- Implementation
The proposed heterogeneous system consists of an FPGA and a DSP processor. The skin tone detection algorithm is implemented on the FPGA, and face detection on the skin-segmented image is implemented on the DSP processor. The hardware connections of the proposed system are shown in Fig. 7.
Fig. 7 Proposed System Architecture
5-1- Implementation of Skin Tone Detection
The proposed skin tone detection algorithm has been implemented on the Xilinx Virtex 4VSX35 FPGA. The input image is loaded into the Block RAM of the FPGA, and a Block RAM controller is realized to read the Block RAMs.
The transformed chroma values TransCr and TransCb are used to compute the skin score of each pixel, as shown in Fig. 8. After processing, the skin-segmented image is loaded into the dual-port RAM, from which it is later read by the Texas Instruments (TI) TMS320DM642 DSP processor over the EMIF to detect human faces in the skin-segmented image.
Fig. 8 Skin Tone Detection Hardware flow
5-2- DSP Implementation of Face Detection
We chose the TMS320DM642 DSP platform to implement the face detection algorithm because it offers a low-cost solution for high-performance requirements, with speeds of up to 4800 MIPS (Million Instructions per Second) at a clock frequency of 600 MHz.
In the heterogeneous system, the TMS320DM642 DSP processor reads the skin-segmented image from the dual-port BRAM configured in the FPGA using its EMIF. The face detection algorithm based on the Viola and Jones face detection structure was successfully implemented on the target TMS320DM642 DSP processor. The algorithm was implemented in MATLAB R2015a and converted to C code using MATLAB Coder, and a CCS (Code Composer Studio) project was built from the generated code.
6- Experiment Results
The proposed heterogeneous face detection system architecture consists of two modules. The first module reads the RGB images from Block RAM and generates skin-segmented images. The second module reads the skin-segmented image from the first module and localizes faces in the skin regions of the image.
In the feature classification stage, the sum of pixels in various rectangular areas must be computed for Haar feature classification. When computed directly, rectangles of different sizes require different amounts of time. An integral image is used to reduce this computation time: with an integral image, the sum over a rectangle of any size is computed in constant time with two adders and one subtractor. State-of-the-art implementations convert the entire image (640X420) into an integral image, which requires 270MB of RAM. In the proposed system, however, the integral image is generated only for a sub-image (24X24), which requires 580 bytes of RAM. This in turn reduces the memory requirement of the overall system.
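The constant-time rectangle sum can be illustrated with a short Python sketch. It builds an integral image with one extra row and column of zeros and evaluates a rectangle as (A + D) - (B + C), which uses two additions and one subtraction. The function names and the 3X3 test image are hypothetical; the DSP code used in this work is not reproduced in the source.

```python
def integral_image(img):
    """Return the integral image of a 2-D list, padded with a zero row/column.

    ii[y][x] holds the sum of all pixels above and to the left of (x, y),
    i.e. of img[0:y][0:x].
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    a = ii[y][x]          # corner above-left of the rectangle
    b = ii[y][x + w]      # above-right
    c = ii[y + h][x]      # below-left
    d = ii[y + h][x + w]  # below-right
    return (a + d) - (b + c)


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 3, 3)        # whole image
lower_right = rect_sum(ii, 1, 1, 2, 2)  # pixels 5, 6, 8, 9
```

The cost of `rect_sum` does not depend on the rectangle size, which is why Haar features of any scale can be evaluated at the same speed once the integral image of the sub-window is available.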
The performance of the complete system is tested using the BAO database [Image Data Base, 2017], which includes images with single and multiple faces with different poses, orientations, occlusions and illumination.
The database images are resized to 640X480 with a pixel size of 24 bits (8 bits for each RGB component). We have implemented the proposed skin tone detector architecture on the Xilinx Virtex 4VSX35 FPGA, and the results are presented in Table 1.
Table 1 Synthesis results of the proposed architecture implemented on the Xilinx Virtex 4VSX35 FPGA
Parameter | Value |
No of clock cycles | 264354 |
Execution time at 5 MHz clock | 4.8 ms |
Execution time at 120 MHz clock | 1.8 ms |
Hardware utilization | 982 logic elements |
Core power dissipation | 90.2 mW |
The detection performance of the proposed system is presented in Table 2. The proposed dual-stage face detection system achieved a high detection rate of 94.5% and a performance of 13.1 FPS for an image resolution of 640X480. A performance comparison of our system with some existing systems is reported in Table 3.
Table 2 Face detection rates of the proposed system
Type of image (with variations) | No of images | No of faces in the images | Accuracy: No of detected faces | Accuracy: Detection rate |
Single | 150 | 150 | 139 | 92.66% |
Group | 220 | 1230 | 1173 | 95.36% |
Table 3 Performance comparison of the proposed architecture
Platform | Detection time (ms) | Frames per second (FPS) |
FPGA+DSP (Proposed system) | 112 | 13.0 |
FPGA (Fekih, et al., 2015) [5] | 160 | 6.18 |
DSP Processor (Zhao, et al., 2009) [14] | 940 | 1.07 |
PC (MATLAB) | 1438 | 0.69 |
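Speed-up factors relative to the other platforms follow from the ratio of frame rates. The short Python sketch below computes these ratios from the FPS column of Table 3 exactly as reported; the dictionary keys and the function name `speedups` are hypothetical labels chosen for the example.

```python
# Frame rates copied from the FPS column of Table 3.
FPS_REPORTED = {
    "FPGA+DSP (proposed)": 13.0,
    "FPGA [5]": 6.18,
    "DSP [14]": 1.07,
    "PC (MATLAB)": 0.69,
}


def speedups(fps, reference):
    """Ratio of the reference platform's frame rate to every other platform."""
    ref = fps[reference]
    return {name: ref / value for name, value in fps.items() if name != reference}


ratios = speedups(FPS_REPORTED, "FPGA+DSP (proposed)")
for name, ratio in ratios.items():
    print("%s: %.2fx" % (name, ratio))
```

The same function can be reused with a different reference platform or with frame rates measured at another resolution.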
7- Conclusion
In this paper, a new hardware architecture for real-time face detection is presented. This implementation allows the user to choose the image resolution and speed according to the resources available in the FPGA. The current implementation is based on the Xilinx Virtex-4 SX35 Starter Kit and the XEVM642 Development Kit, which is powered by a TI TMS320DM642 DSP processor. The proposed system achieved an average frame rate of 13.7 FPS when tested with images at a spatial resolution of 640X480. This system exhibits a performance improvement of 2.12 times compared with an equivalent FPGA implementation, 12.3 times compared with a DSP implementation and 18.98 times compared with a PC implementation, addressing the real-time performance problem. The proposed hardware architecture achieved an average detection accuracy of 94.5%, which is lower than that of the PC implementation (97%), since lower-precision pixel data are used in the FPGA hardware architecture.
Future Scope
In the future, this hybrid architecture can be extended to design a high-performance facial recognition system by modifying the second stage of the proposed system.
Acknowledgments
Supported by the VLSI Lab, Dept. of ECE, Shri Vishnu Engineering College for Women.
References
[1] Y. Lei, Z. Gang, R. Si-Heon, Lee Choon-Young, Lee Sang-Ryong and K. -M. Bae, "The Platform of Image Acquisition and Processing System Based on DSP and FPGA," 2008 International Conference on Smart Manufacturing Application, 2008, pp. 470-473, doi: 10.1109/ICSMA.2008.4505567.
[2] C. Kotropoulos and I. Pitas, "Rule-based face detection in frontal views," 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997, pp. 2537-2540 vol.4, doi: 10.1109/ICASSP.1997.595305.
[3] D. Nguyen, D. Halupka, P. Aarabi and A. Sheikholeslami, "Real-time face detection and lip feature extraction using field-programmable gate arrays," in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 36, no. 4, pp. 902-912, Aug. 2006, doi: 10.1109/TSMCB.2005.862728.
[4] D. N. Arya, K. L. V. Sivanji, R. Reddy, S. Sivanantham, and K. Sivasankaran, “A face detection system implemented on FPGA based on RCT colour segmentation,” Proc. 2016 Online Int. Conf. Green Eng. Technol. IC-GET 2016, 2017, doi: 10.1109/GET.2016.7916781.
[5] H. Ben Fekih, A. Elhossini, and B. Juurlink, “An Efficient and Flexible FPGA Implementation of a Face Detection System,” pp. 243–254, 2015, doi: 10.1007/978-3-319-16214-0.
[6] H.-Y. Leung, L.-M. Cheng, and X. Y. Li, “A FPGA implementation of facial feature extraction,” J. Real-Time Image Process., vol. 10, no. 1, pp. 135–149, 2015, doi: 10.1007/s11554-012-0263-8.
[7] A. S. Kamewar, "Processing geospatial images using GPU," 2017 International Conference on Emerging Trends & Innovation in ICT (ICEI), 2017, pp. 27-32, doi: 10.1109/ETIICT.2017.7977005.
[8] J. Batlle, “A New FPGA/DSP-Based Parallel Architecture for Real-Time Image Processing,” Real-Time Imaging, vol. 8, no. 5, pp. 345–356, 2002, doi: 10.1006/rtim.2001.0273.
[9] K. L. Y. Li et al., “A new parallel particle filter face tracking method based on heterogeneous system,” J. Real-Time Image Process., vol. 7, no. 3, pp. 153–163, 2012, doi: 10.1007/s11554-011-0225-6.
[10] L. Guo, “An embedded multimedia communication terminal based on DSP+FPGA,” Multimed. Tools Appl., vol. 76, no. 16, pp. 16949–16961, 2017, doi: 10.1007/s11042-016-3597-6.
[11] Z. Ding, F. Zhao, T. Wang, W. Shu, and M.-Y. Wu, “Hecto-Scale Frame Rate Face Detection System for SVGA Source on FPGA Board,” 2011 IEEE 19th Annu. Int. Symp. Field-Programmable Cust. Comput. Mach., pp. 37–40, 2011, doi: 10.1109/FCCM.2011.16.
[12] Rein-Lien Hsu, M. Abdel-Mottaleb and A. K. Jain, "Face detection in color images," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 696-706, May 2002, doi: 10.1109/34.1000242.
[13] P. Viola and M. Jones, “Robust real-time face detection,” Int. J. Comput. Vis., vol. 57, no. 2, pp. 137–154, 2004, doi: 10.1023/B:VISI.0000013087.49260.fb.
[14] F. Zhao, L. Yang, Y. Zhu, and P. Liao, “Enhancing the implementation of Adaboost algorithm on a DSP-based platform,” Int. Conf. Scalable Comput. Commun. - 8th Int. Conf. Embed. Comput. ScalCom-EmbeddedCom 2009, pp. 393–395, 2009, doi: 10.1109/EmbeddedCom-ScalCom.2009.77.
[15] Ganeswara Rao M.V., Panakala R.K., Mallikarjuna Prasad A. (2018) A New VLSI Architecture for Skin Tone Detection in an Uncontrolled Background. In: Anguera J., Satapathy S., Bhateja V., Sunitha K. (eds) Microelectronics, Electromagnetics and Telecommunications. Lecture Notes in Electrical Engineering, vol 471. Springer, Singapore
[16] Fekih H.B., Elhossini A., Juurlink B. (2015) An Efficient and Flexible FPGA Implementation of a Face Detection System. In: Sano K., Soudris D., Hübner M., Diniz P. (eds) Applied Reconfigurable Computing. ARC 2015. Lecture Notes in Computer Science, vol 9040. Springer, Cham
[17] Dong Zhang, S. Z. Li and D. Gatica-Perez, "Real-time face detection using boosting in hierarchical feature spaces," Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., Cambridge, 2004, pp. 411-414 Vol.2.
[18] Y. N. Chae, T. Han, Y.-H. Seo, and H. S. Yang, “An efficient face detection based on color-filtering and its application to smart devices,” Multimed. Tools Appl., vol. 75, no. 9, pp. 4867–4886, 2016, doi: 10.1007/s11042-013-1786-0.
[19] C. Kumar and M. S. Azam, “A multi-processing architecture for accelerating Haar-based face detection on FPGA,” 9th Int. Conf. Ind. Inf. Syst. ICIIS 2014, 2015, doi: 10.1109/ICIINFS.2014.7036525.
[20] S. Liao, A. K. Jain, and S. Z. Li, “A Fast and Accurate Unconstrained Face Detector,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 2, pp. 211–223, 2016, doi: 10.1109/TPAMI.2015.2448075.
[21] A. N. Rajagopalan, K. S. Kumar, J. Karlekar, R. M. M. M. Patil, U. B. Desai, and P. G. P. S. Chaudhuri, “Finding Faces in Photographs,” IEEE Int. Conf. Comput. Vis., no. 1, pp. 640–645, 1998, doi: 10.1109/ICCV.1998.710785.
[22] M. S. Lew, "Information theoretic view-based and modular face detection," Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, Killington, VT, USA, 1996, pp. 198-203.
[23] A. J. Colmenarez and T. S. Huang, “Face Detection With Information-Based Maximum Discrimination,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 782–787, 1997, doi: 10.1109/CVPR.1997.609415.
[24] K. S. Park, R. H. Park, and Y. G. Kim, “Face detection using the 3x3 block rank patterns of gradient magnitude images and a geometrical face model,” Dig. Tech. Pap. - IEEE Int. Conf. Consum. Electron., no. c, pp. 793–794, 2011, doi: 10.1109/ICCE.2011.5722867.
[25] P. P. Paul and M. Gavrilova, “PCA based geometric modeling for automatic face detection,” Proc. - 2011 Int. Conf. Comput. Sci. Its Appl. ICCSA 2011, pp. 33–38, 2011.
[26] A. Majumder, L. Behera, and V. K. Subramanian, “Automatic and robust detection of facial features in frontal face images,” Proc. - 2011 UKSim 13th Int. Conf. Model. Simulation, UKSim 2011, pp. 331–336, 2011, doi: 10.1109/UKSIM.2011.69.
[27] J. Guo, C. Lin, M. Wu, C. Chang and H. Lee, "Complexity Reduced Face Detection Using Probability-Based Face Mask Prefiltering and Pixel-Based Hierarchical-Feature Adaboosting," in IEEE Signal Processing Letters, vol. 18, no. 8, pp. 447-450, Aug. 2011.
[28] Katkoori Arun Kumar and Ravi Boda, “A Threshold-based Brain Tumour Segmentation from MR Images using Multi-Objective Particle Swarm Optimization,” Journal of Information Systems and Telecommunication, Vol. 9, No. 4, 2021, pp. 218–225.
[29] Hamed Agahi and Kimia Rezaei, “An Automatic Thresholding Approach to Gravitation-Based Edge Detection in Grey-Scale Images,” Journal of Information Systems and Telecommunication, Vol. 9, No. 4, 2021, pp. 285–296.
[30] K. Li, Y. Tian, B. Wang, Z. Qi, and Q. Wang, "Bi-Directional Pyramid Network for Edge Detection," Electronics, vol. 10, no. 3, 2021, p. 329-333.
[31] D. Wang, J. Yin, C. Tang, X. Cheng, and B. Ge, "Color edge detection using the normalization anisotropic Gaussian kernel and multichannel fusion," IEEE Access, vol. 8, 2020, pp. 228277-228288.
[32] Azamossadat Nourbakhsh, Mohammad-Shahram Moin and Arash Sharifi, “Facial Images Quality Assessment based on ISO/ICAO Standard Compliance Estimation by HMAX Model,” Journal of Information Systems and Telecommunication, Vol. 7, No. 27, 2009, pp. 225–237.
[33] Azar Mahmoodzadeh, “Human Activity Recognition based on Deep Belief Network Classifier and Combination of Local and Global Features,” Journal of Information Systems and Telecommunication, Vol. 9, No. 36, 2021, pp. 45–52.
* MV Ganeswara Rao
ganesh.mgr@gmail.com