Biometric security systems have developed rapidly in recent years. Every biometric system must uniquely establish a person's identity from their physiological or behavioural features. Compared with biometric modalities such as fingerprint, iris, and voice recognition, face recognition is more convenient and effective for users.

Face recognition is a low-cost and widely used biometric system because it requires only simple hardware, such as an optical camera, and a computationally light algorithm.

Liveness detection in face recognition is performed based on facial movements.


In this work, a sparse optical flow-based liveness detection model is proposed. The proposed method analyses the differing characteristics of real faces versus photographs or 2D masks by detecting facial landmarks for each person and calculating the velocity of the optical flow field using an efficient velocity estimation approach.

The proposed face liveness detection model using SOFT_VEA consists of three phases. In the first phase, facial landmark points are marked on the input video frames. Next, the points of interest, such as the eye, nose, and mouth points, are tracked using a sparse optical flow algorithm to find the motion between consecutive frames. Finally, an efficient velocity estimation approach is used to measure the impact of sudden and gradual changes in the velocity of the landmark points; using a threshold factor on this measure, fake faces are identified against real ones. The proposed block diagram is shown in

For the trajectory operation, the input video is converted into a sequence of images and 68 facial landmarks are detected in every frame. To find the facial landmarks, the system first locates the human face in the input and bounds it with a rectangle. Using the rectangle coordinates, the face landmark points are detected.

Optical flow uses the time-domain change and correlation of pixel intensity in image sequences to determine the 'movement' of each pixel, that is, to study the relationship between the intensity change over time and the structure and movement of objects in the scene. Optical flow can be computed in two ways: sparse optical flow and dense optical flow. Sparse optical flow selects a set of feature points or pixels, such as edges and corners, and tracks their motion, while dense optical flow tracks the motion of every pixel in each frame.

Optical flow for the face is irregular owing to head motion and facial expressions. The relative motion between the camera and the object is of four types: rotation, translation, moving forward and backward, and swing. All other motions are combinations of these four, and the representation of each motion varies with the action performed. The optical flow for rotation, translation, and moving forward and backward is similar for both real faces and 2D masks, but the optical flow for swing (i.e. shaking, lowering, and raising the head) differs between real faces and 2D faces.

Let us assume that pixel intensities are constant for the optical flow,

I(x, y, t) = I(x + dx, y + dy, t + dt)    (1)

A Taylor series is used to expand the RHS of equation (1). By eliminating the common terms, the optical flow equation is obtained as

I_x a + I_y b + I_t = 0

where a = dx/dt and b = dy/dt are the unknown variables, and I_x, I_y, and I_t are the image gradients along x, y, and time.

One equation with two unknown variables cannot be solved directly, so the Lucas-Kanade method is introduced to solve this problem.

The Lucas-Kanade method is a kind of sparse optical flow that normally uses a corner detection algorithm to select corner pixels. Here, instead of corner pixels, the landmark values are given to the Lucas-Kanade algorithm to find the flow of the human facial landmarks. The selected landmark values are then passed to the optical flow functions. Assume that all pixels neighbouring a landmark have similar motion. The LK method takes a 3x3 patch around the point, containing 9 pixel values, so 9 equations must be solved to determine the two unknown variables a and b.
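The 3x3-patch least-squares step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the gradient values are synthetic, chosen so that the true flow (a, b) is known in advance and can be recovered.

```python
import numpy as np

# Sketch of the Lucas-Kanade least-squares step on a single 3x3 patch.
# Ix, Iy are spatial gradients and It the temporal gradient at the 9 pixels;
# the values here are illustrative assumptions, constructed so the brightness
# constancy equation Ix*a + Iy*b + It = 0 holds exactly for a known flow.
true_a, true_b = 1.0, 0.5              # assumed ground-truth flow (pixels/frame)

rng = np.random.default_rng(0)
Ix = rng.uniform(-1.0, 1.0, size=9)    # dI/dx at the 9 patch pixels
Iy = rng.uniform(-1.0, 1.0, size=9)    # dI/dy at the 9 patch pixels
It = -(Ix * true_a + Iy * true_b)      # dI/dt implied by brightness constancy

# Stack the 9 equations Ix*a + Iy*b = -It and solve them by least squares.
A = np.column_stack([Ix, Iy])          # 9x2 coefficient matrix
v, *_ = np.linalg.lstsq(A, -It, rcond=None)
a, b = v
print(a, b)
```

With consistent gradients the least-squares solution recovers the true flow exactly; with real, noisy image gradients it returns the best-fit flow for the patch.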

To solve these 9 equations, least-squares fitting is applied. After applying the least-squares fitting algorithm, the system reduces to two equations with two unknowns.

The above equation gives the movement along x and y over time t. Finally, the unknown variables a and b are calculated from it. The resulting values from the Lucas-Kanade algorithm are used to compute the optical flow using equation (1).

Tracking the image sequence yields the velocity. Velocity measures the distance travelled over the change in time, so for the input video the motion of the human face is calculated from the first frame to the last using sparse landmark points, where points from the eye, nose, and mouth regions are selected. Euclidean distance is used to measure the distance travelled by the human face between frames. These specific points are effective in detecting 2D imposter faces and fake photo faces. Thus, the velocity of the eye, nose, and mouth regions between consecutive frames is calculated using the Euclidean distance, as shown in equation 6.
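The per-frame velocity measure described above can be sketched as follows. The landmark coordinates are synthetic examples, and frames are assumed to be one time unit apart; neither assumption comes from the paper.

```python
import numpy as np

# Sketch: the Euclidean distance each tracked landmark moves between two
# consecutive frames, averaged over the landmarks, gives the frame velocity.
def frame_velocity(prev_pts, next_pts, dt=1.0):
    """Mean Euclidean displacement of landmarks between two frames, per dt."""
    prev_pts = np.asarray(prev_pts, dtype=float)
    next_pts = np.asarray(next_pts, dtype=float)
    dists = np.linalg.norm(next_pts - prev_pts, axis=1)  # per-landmark distance
    return dists.mean() / dt

# Example: three landmarks, each shifted by (3, 4), i.e. distance 5 each.
p0 = [(10, 10), (20, 15), (30, 25)]
p1 = [(13, 14), (23, 19), (33, 29)]
print(frame_velocity(p0, p1))  # 5.0
```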

The average velocity of the selected landmark points in each frame is calculated and plotted as a graph. This graph shows the sudden and gradual changes that occur due to the 'shaking' action. For each video sample, the Critical Velocity (CV) value is calculated using the proposed formula CV = temp + (temp / 2), where temp = mean(max(avg_velocity)).

After finding the CV value, the number of peaks in the graph above the CV value is counted to distinguish real faces from masked faces. If the number of peaks is greater than the threshold (th), the input video is a real-face video; if the number of peaks is less than or equal to the threshold, it is a masked-face video. Here, the threshold is set to 2.

1. avg_velocity[n] = 0;
2. Get the window size;
3. for each pair of consecutive frames (p0 → old frame, p1 → new frame):
       x, y ← landmark points;
       a, b ← optical flow vector values;
       for i = 0 to len_p1:
           p1.x = p0.x + a;
           p1.y = p0.y + b;
           avg_velocity[frame] += EuclideanDistance(p0[i], p1[i]);
       avg_velocity[frame] = avg_velocity[frame] / len_p1;
4. plot avg_velocity; /* graph generation */
5. temp = mean(max(avg_velocity));
6. cv = temp + (temp / 2); /* critical velocity calculation */
7. count the peak points greater than cv in the graph;
8. if count > th then
       return “Real face”;
   else
       return “Fake/Masked face”;
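The decision stage above (computing the critical velocity and counting peaks) can be sketched in Python. The velocity series is synthetic, and interpreting mean(max(avg_velocity)) as the mean of the local maxima of the series is an assumption on our part; only the threshold th = 2 comes directly from the text.

```python
# Sketch of the classification steps, under the assumptions stated above.
def classify(avg_velocity, th=2):
    # Local maxima: samples strictly greater than both neighbours.
    peaks = [avg_velocity[i]
             for i in range(1, len(avg_velocity) - 1)
             if avg_velocity[i] > avg_velocity[i - 1]
             and avg_velocity[i] > avg_velocity[i + 1]]
    if not peaks:
        return "Fake/Masked face"
    temp = sum(peaks) / len(peaks)      # assumed reading of mean(max(...))
    cv = temp + temp / 2                # critical velocity
    count = sum(1 for p in peaks if p > cv)
    return "Real face" if count > th else "Fake/Masked face"

# A series with several sharp peaks, mimicking the shaking of a real face.
series = [0.1, 0.5, 0.1, 0.5, 0.1, 0.5, 0.1, 0.5, 0.1, 0.5,
          0.1, 5.0, 0.1, 5.0, 0.1, 5.0, 0.1]
print(classify(series))  # Real face
```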

The ROSE-YOUTU database is used in this work. It contains around 150 videos for each of 20 different persons, covering the categories with glasses, without glasses, photo imposters or 2D masks, photo imposters or 2D masks with eye and mouth openings, and video imposters. Each video clip is about 10 to 15 seconds long at 30 frames per second.

The proposed work was implemented in Python 3.6 with the Anaconda distribution. The OpenCV library (version 4.2.0) is imported for handling video and image files.

The experiment is conducted on 60 sample videos covering three cases. Case 1 consists of real faces of 20 different persons. Case 2 consists of 20 videos of persons with photo imposters or 2D masks. Case 3 consists of 20 videos of persons using photo imposters or 2D masks with eye and mouth openings.

For this work, out of the 68 landmark points, only 41 points, i.e. the eye, nose, and mouth landmark points, are selected.
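The 41-point selection can be sketched from the standard 68-point landmark scheme. The index ranges below (nose 27-35, eyes 36-47, mouth 48-67) are the common iBUG 300-W convention used by dlib's shape predictor, assumed here rather than stated in the paper; they do account for exactly 41 points.

```python
# Sketch: selecting the eye, nose, and mouth points from the 68-point scheme.
NOSE = list(range(27, 36))     # 9 points (bridge + lower nose)
EYES = list(range(36, 48))     # 12 points (6 per eye)
MOUTH = list(range(48, 68))    # 20 points (outer + inner lips)
SELECTED = NOSE + EYES + MOUTH

print(len(SELECTED))  # 41
```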

For real human faces, the micro facial movements produce low intensity values, but for 2D masked faces the facial movements are made deliberately by a person, which causes high intensity values. Velocity increases as intensity increases, so the velocity graph is used to differentiate real faces from 2D masked or fake faces.

File_name | Critical velocity (cv) | No. of peaks | Inference
Sample2_R | 0.17 | 5 | Real
Sample3_R | 3.61 | 2 | Fake
Sample4_R | 3.64 | 3 | Real
Sample5_R | 3.05 | 3 | Real
Sample6_R | 3.88 | 5 | Real
Sample7_R | 1.38 | 6 | Real
Sample9_R | 0.93 | 9 | Real

File_name | Critical velocity (cv) | No. of peaks | Inference
Sample2_IP | 1.42 | 2 | Fake
Sample3_IP | 0.19 | 5 | Real
Sample4_IP | 2.77 | 4 | Real
Sample5_IP | 3.02 | 2 | Fake
Sample6_IP | 4.06 | 1 | Fake
Sample7_IP | 1.09 | 2 | Fake
Sample9_IP | 3.33 | 1 | Fake

File_name | Critical velocity (cv) | No. of peaks | Inference
Sample2_IPC | 3.92 | 1 | Fake
Sample3_IPC | 1.09 | 4 | Real
Sample4_IPC | 2.43 | 2 | Fake
Sample5_IPC | 2.96 | 1 | Fake
Sample6_IPC | 1.21 | 1 | Fake
Sample7_IPC | 12.6 | 1 | Fake
Sample9_IPC | 2.33 | 2 | Fake


HTER (Half Total Error Rate) is the average of the false acceptance rate (FAR) and the false rejection rate (FRR), i.e. HTER = (FAR + FRR) / 2. From this equation, the HTER value for the proposed optical flow algorithm using facial landmark detection is calculated as 2.45.
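The HTER computation is a one-liner; a sketch follows. The FAR and FRR values below are illustrative assumptions chosen only to reproduce the reported HTER of 2.45, as the paper does not list the individual rates.

```python
# Sketch of the Half Total Error Rate metric (rates in percent).
def hter(far, frr):
    """HTER = average of false acceptance and false rejection rates."""
    return (far + frr) / 2

# Hypothetical FAR/FRR pair consistent with the reported HTER of 2.45.
print(hter(2.9, 2.0))  # 2.45
```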

Methods | Accuracy (%) | HTER
DLTP | 80 | 6.1
Optical flow fields | 83 | 4
Proposed SOFT_VEA | 88 | 2.45

DLTP (Dynamic Local Ternary Pattern) is used to analyse skin textures to find masked faces among real faces.

This work proposed a sparse optical flow method with the Lucas-Kanade algorithm using facial landmark detection, and determined real and fake faces using the velocity estimation approach. The proposed model was evaluated on three kinds of input videos: persons with real faces, photo imposters, and photo imposters with eye and mouth openings. The performance of the proposed work is evaluated using accuracy values. The proposed model gives lower accuracy for inputs of persons with photo imposters or 2D masks, because these resemble real faces, whereas photo imposters with eye and mouth openings show clear variation in the trajectory values between real and masked regions. From the experimental results, it is inferred that the proposed model gives better accuracy, 88%, for real faces and photo imposters with eye and mouth openings than other state-of-the-art optical flow methods, and a lower error rate of 2.45 than existing systems. Future work will continue to optimize the model to further enhance its security performance.