In this interview, Dr. Thirimachos Bourlai, an expert at the forefront of face recognition technology, shares his opinions regarding the challenges of night time facial recognition systems. He also sheds light on the advantages and disadvantages of using visible or infrared sensors for practical facial recognition applications and scenarios. Finally, he briefly outlines the available modalities and developments in long-range human identification technology that can be used in the future to deal with problems of recognizing unfamiliar faces in still image and video biometrics.
Q: Could you give some details on the challenges in identifying humans at night with facial recognition?
A: Most face recognition systems rely on face images captured in the visible range of the electromagnetic spectrum, i.e. 380-750 nm. However, in real-world scenarios (military and law enforcement) we have to deal with harsh environmental conditions characterized by unfavorable lighting and pronounced shadows. One such example is a night-time environment, where human recognition based solely on visible spectral images may not be feasible.
To deal with such difficult face recognition (FR) scenarios, multi-spectral camera sensors are very useful because they can image both day and night. Thus, recognition of faces across the infrared spectrum has become an area of growing interest [2-16]. Here is a realistic scenario in which it is very challenging to identify humans at night with facial recognition. Consider an individual walking towards the entrance of a military facility. The surveillance cameras covering the facility need to capture a face image that can be used for identification. The main challenges are: (i) Data management: the raw video footage that is relevant (i.e., when the human is within the field of view), which can amount to several Mbytes or TBytes per day, needs to be narrowed down to the information pertinent to the human and his/her face. At this point, applying efficient face tracking and eye detection techniques is very important [14, 15]. (ii) Data quality: improving the quality of the available face images. (iii) Face matching: applying state-of-the-art face matching techniques to perform identification [8, 16]. Another major challenge is face spoofing, but that can be covered in another discussion.
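To give a feel for the data management challenge, here is a minimal back-of-the-envelope sketch of how much uncompressed footage a single camera could produce per day. The frame size (640x512), bit depth (2 bytes per pixel) and frame rate (30 fps) are illustrative assumptions, not figures from the interview, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def raw_bytes_per_day(width, height, bytes_per_pixel, fps):
    """Uncompressed video volume (in bytes) from one camera over 24 hours."""
    return width * height * bytes_per_pixel * fps * SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed sensor: 640x512 pixels, 16-bit samples, 30 frames per second.
    total = raw_bytes_per_day(640, 512, 2, 30)
    print(f"{total / 1e12:.2f} TB per camera per day")
```

Under these assumptions a single camera already lands in the terabyte-per-day range, which is why the footage has to be narrowed down to the frames that actually contain a face before any matching is attempted.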
Q: Could you frame the main challenges that are person-related, device-related and facial recognition software-based? Any other challenges?
A: There are various challenges with regard to night-time facial recognition technologies. I would narrow them down to four main categories:
1. Person-related: variations in pose and expression, as well as in illumination, which depends on the operational environment. For example, in certain night-time scenarios there may not be sufficient ambient light to capture good-quality photos. That is why the selection of sensors (e.g. infrared or other combinations) plays a very important role.
2. Device-related: the use of different camera sensors, e.g. (i) cameras operating in different spectral bands, or (ii) very expensive, high-end cameras vs. low-cost surveillance cameras.
3. Related to the FR matching software used: (i) commercial software packages, whose internal algorithmic functionalities (e.g. image restoration algorithms applied to raw data) the operator cannot access, inspect and/or change, vs. (ii) academic FR software packages, where operators have access to the code and can therefore change, upgrade and improve the software's face matching capabilities.
4. Related to other factors, such as image quality (e.g., image resolution, compression, blur), time span (facial aging), occlusion, and demographic information (e.g., gender, race/ethnicity, or age). For example, a face recognition system will behave differently when it is trained and tested using a certain cohort (such as a race group) or when using different cohorts.
Of course, the biggest challenge of all is the combination of the above challenges!
Q: What are the advantages and disadvantages of using visible or infrared sensors?
A: This is a very challenging question, mainly because it is very general, but I am happy to provide some insight into it. For example, someone could focus only on sensor technology (mainly hardware, with sensors ranging from visible through Near Infrared up to Long Wave Infrared), while someone else could focus on spectral imaging and the information acquired from different sensors, including the software developed to process this information.
Regarding FR, both visible and IR sensors are important. Visible sensors have the advantage that they are low cost and that their spatial resolution can be much higher than that of certain infrared sensors, especially short wave IR or cooled infrared (thermal) ones. The infrared (IR) spectrum is divided into different spectral bands based on the response of various detectors, i.e. the active IR band and the thermal (passive) IR band. The active IR band (0.7-2.5 µm) is divided into the NIR (near infrared) and the SWIR (short wave IR) spectrum. NIR has the advantage that we can see at night, but its limitation is that it requires an illuminator that can be spotted (i.e., it cannot covertly illuminate the scene). SWIR covers longer wavelengths than NIR and is more tolerant to low levels of obscurants like fog and smoke. Differences in appearance between images sensed in the visible and the active IR band are due to the properties of the object being imaged. The benefits of SWIR are discussed in . SWIR may pick up facial features that are not observed in the visible spectrum and can be combined with visible-light imagery to generate a more complete image of the human face. The SWIR range has only recently become practical for FR, particularly since the development of indium gallium arsenide sensors, which are designed to work well in night-time conditions. Another advantage is that the external light source that may be required for regions in the SWIR band can covertly illuminate the scene, since it emits light invisible to the human eye.
The passive IR band is further divided into the Mid-Wave Infrared (MWIR) and the Long-Wave Infrared (LWIR) band. MWIR spans 3-5 µm, while LWIR spans 7-14 µm. Both MWIR and LWIR cameras can sense temperature variations across the face at a distance and produce thermograms in the form of 2D images. The difference between them is that MWIR has both reflective and emissive properties, whereas LWIR consists primarily of emitted radiation. The benefit is that both are almost completely impervious to external illumination. Another advantage is that they reveal different image characteristics of the facial skin. However, they are sensitive to variations in the temperature of the surrounding environment and to variations in the heat patterns of the face, which can be affected by various factors, e.g. stress, changes in ambient temperature, physical activity, etc. The importance of MWIR in FR technology was first proposed, recently, by MILab.
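The band limits mentioned above can be collected into a small lookup, sketched here in Python. The visible, active IR, MWIR and LWIR limits are the ones quoted in this interview; the 1.0 µm split between NIR and SWIR is an assumption (a commonly used convention, not stated here), and the names `BANDS` and `bands_for` are purely illustrative.

```python
from typing import List

# (name, lower edge, upper edge) in micrometres, treated as [lower, upper).
BANDS = [
    ("visible", 0.38, 0.75),  # 380-750 nm, as quoted in the interview
    ("NIR", 0.7, 1.0),        # active IR starts at 0.7 um; 1.0 um split is assumed
    ("SWIR", 1.0, 2.5),       # active IR ends at 2.5 um
    ("MWIR", 3.0, 5.0),       # thermal (passive) IR
    ("LWIR", 7.0, 14.0),      # thermal (passive) IR
]


def bands_for(wavelength_um: float) -> List[str]:
    """Return every band whose range contains the given wavelength.

    A list is returned because the quoted visible and active IR ranges
    overlap slightly (0.70-0.75 um), and some wavelengths fall in no band.
    """
    return [name for name, lo, hi in BANDS if lo <= wavelength_um < hi]


if __name__ == "__main__":
    for w in (0.55, 0.72, 1.55, 4.0, 6.0, 10.0):
        print(w, bands_for(w))
```

For example, 1.55 µm (the 1550 nm SWIR wavelength used in MILab's long-range study) maps to SWIR, while 6.0 µm falls in the gap between the MWIR and LWIR ranges quoted above and returns an empty list.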
I would also like to point to some very interesting and important articles [18-22] and web-links [23-24]:
(i) Regarding spectral imaging, one of the most interesting and informative articles I have come across is that of Shaw and Burke on spectral imaging for remote sensing. Also, an interesting tutorial on infrared imaging can be found at , while another interesting link on night vision systems (low light imaging, near infrared and thermal imaging) can be found at .
(ii) Regarding face recognition and the comparison of the visible band against infrared bands, the original work of Wilder et al. stands out, while some very interesting articles that followed several years later can be found in [18, 19, 21]. Of course, MILab at WVU has recently published many articles in this area (details can be found in MILab's publications [1-16]).
Q: Could you provide us with a review of the available systems and developments in long-range identification? What are MILab's contributions to long-range FR?
A: There are many camera systems available for long-range biometrics and surveillance applications. Some are commercially available products (including software) from companies such as L3 and FLIR. Others (hardware and software) were designed and developed under a specific research project (e.g. the TINDERS project). Both types of system have been used by researchers to perform specific biometrics-related experiments. At MILab, for example, we used two different types of infrared systems for long-range night-time FR, i.e. a NIR-based one and a SWIR-based one.
In the first case, we used a NIR sensor designed to acquire images at mid-range stand-off distances at night. We then determined the maximum stand-off distance, over ranges from 30 to approximately 300 ft, at which FR techniques can be used to efficiently recognize individuals at night. The focus of the study was on establishing the maximum capabilities of the mid-range sensor to acquire the good-quality face images necessary for recognition. For the purpose of that study, a database of more than 100 subjects in the visible (baseline) and NIR spectrum was assembled and used to illustrate the challenges associated with the problem. To perform matching studies, we used multiple FR techniques and demonstrated that certain techniques are more robust in terms of recognition performance when using face images acquired at different distances. In  you can find the details about the camera system (hardware) and the challenging FR experiments performed.
In the second case, we used a SWIR camera from Sensors Unlimited, either stand-alone (bulk) or within a developed optical system that can perform long-range imaging day and night. In that study, we investigated the problem of cross-spectral FR in heterogeneous environments. Specifically, we investigated the advantages and limitations of matching SWIR (at 1550 nm) probe face images to visible (gallery) images acquired under variable scenarios: visible images were collected under controlled and semi-controlled conditions (full frontal faces, facial expressions, indoors and outdoors, short range, fixed standoff distance of 7 feet or 2 meters), while SWIR images were captured under (i) fully controlled indoor conditions; (ii) semi-controlled conditions (full frontal faces, indoors, long ranges, i.e., up to 348 feet or 106 meters); and (iii) uncontrolled conditions (variable poses, facial expressions, occlusion, outdoors, night and day, variable range, i.e., up to 1312 feet or 400 meters). Three different matching/encoding algorithms were used, namely Local Binary Patterns (LBP), Local Ternary Patterns (LTP), and a commercial face matcher. Our experimental results indicate that our proposed methodology (i.e., using a cross-photometric score level fusion scheme) outperforms the baseline cross-spectral FR performance (single matchers before photometric normalization) in the most challenging (uncontrolled) scenario described above. In  you can find the details about the camera system (hardware) and the challenging FR experiments performed.
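For readers unfamiliar with the building blocks named above, the following is a minimal Python sketch of a basic 3x3 LBP histogram, a chi-square distance between two such histograms, and a simple score-level fusion. It is not MILab's cross-photometric fusion scheme: the min-max normalization with mean rule is a generic, swapped-in choice, all function names are hypothetical, and images are assumed to be plain 2D lists of grey levels.

```python
from typing import List, Sequence

# Neighbour offsets (row, col), clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(img: Sequence[Sequence[int]]) -> List[float]:
    """Normalised 256-bin histogram of basic 3x3 LBP codes (borders skipped)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            centre = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256


def chi_square(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Chi-square distance between two histograms (0 means identical)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale one matcher's scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def fuse(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Mean-rule fusion: normalise each matcher, then average per comparison."""
    normalised = [min_max(s) for s in score_lists]
    return [sum(col) / len(col) for col in zip(*normalised)]


if __name__ == "__main__":
    probe = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
    gallery = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
    d = chi_square(lbp_histogram(probe), lbp_histogram(gallery))
    print("LBP chi-square distance:", d)
    # Two hypothetical matchers scoring the same three probe-gallery pairs.
    print("fused:", fuse([[0.2, 0.9, 0.4], [12.0, 30.0, 18.0]]))
```

Practical LBP matchers usually compute histograms per image block and concatenate them, and the scores being fused must all be either distances or similarities; this sketch leaves both points out for brevity.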