
Thatcher Effect on Facial Recognition Algorithms 

Link to Repo:

What is the Thatcher Effect? 

The Thatcher effect is a phenomenon in which changes to facial features are difficult to identify when the image of a face is inverted, even though the same changes are easy to spot when the face is upright.

[Figure: Thatcherized face, shown inverted]

This phenomenon supposedly occurs in humans because the perceived difference between the two upright images is larger than the perceived difference between the two inverted images.

[Figure: the difference is computed between the upright pair of images and between the inverted pair of images]

By calculating the difference within each pair of images, the Thatcher effect can be tested on facial recognition models. I hypothesize that this effect will be present in the facial embedding phase of facial recognition models, because the facial embedding phase is feature dependent.

Methods:

The FaceNet model was pretrained on the MS-Celeb-1M database (100,000 unique identities, 8,200,000 images), which includes Bill Gates. Since the model has already seen Bill Gates, this allows a comparison between identities seen and unseen by the model. The VGGFace2 models were pretrained on 8,631 identities and 3,300,000 images, none of which include Bill Gates, Jeff Bezos, or Elon Musk. I included three versions of the VGGFace2 model: ResNet50, SeNet50, and VGG16.


The first step in any facial recognition model is face detection. For face detection I used a Multi-Task Cascaded Convolutional Network (MTCNN) on 15 distinct images of Jeff Bezos, Elon Musk, and Bill Gates, plus their feature-flipped versions. The MTCNN model was unable to detect faces in some of the feature-flipped images, in which case I manually entered the face-box location from the corresponding normal image. After the MTCNN model detected the faces, I flipped the images upside down to produce their inverted versions. Each of the 15 distinct images thus yielded a set of three more images: one inverted, one upright with flipped features, and one inverted with flipped features (the classic Thatcher image, in which the flipped features appear locally upright). These 60 images were then fed to the pretrained models to extract their facial embeddings.
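
For reference, here is a minimal sketch of the detection step using the open-source `mtcnn` Python package; the file name is a hypothetical placeholder.

```python
# Minimal face-detection sketch with the `mtcnn` package.
from mtcnn import MTCNN
from PIL import Image
import numpy as np

detector = MTCNN()
pixels = np.asarray(Image.open("musk_upright.jpg").convert("RGB"))  # hypothetical file

# detect_faces returns a list of dicts, each with a bounding 'box'
# (x, y, width, height), a 'confidence' score, and five landmark 'keypoints'.
faces = detector.detect_faces(pixels)
if faces:
    x, y, w, h = faces[0]["box"]
    face_crop = pixels[y:y + h, x:x + w]
else:
    # No detection (as happened for some feature-flipped images):
    # fall back to the manually recorded box from the normal image.
    face_crop = None
```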

 

To extract the facial embeddings, I used each model in a transfer-learning setup, keeping all the layers before the classification layer. The FaceNet model outputs a 128-element vector representing the facial features of an input image. The VGGFace2 ResNet50 and SeNet50 versions output a 2048-element vector, while the VGG16 version outputs a 512-element vector.
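
As an illustration, here is a minimal sketch of the embedding step with the `keras_vggface` package (the FaceNet embeddings were extracted analogously from its pretrained Keras model); the random array stands in for a cropped face.

```python
# Minimal embedding sketch: VGGFace2 ResNet50 without its classifier head.
import numpy as np
from keras_vggface.vggface import VGGFace
from keras_vggface.utils import preprocess_input

# include_top=False drops the classification layer; 'avg' pooling turns the
# final feature map into a single embedding vector (2048-d for ResNet50).
model = VGGFace(model="resnet50", include_top=False,
                input_shape=(224, 224, 3), pooling="avg")

face = np.random.rand(1, 224, 224, 3).astype("float32") * 255.0  # stand-in crop
embedding = model.predict(preprocess_input(face, version=2))[0]  # shape (2048,)
```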

 

I normalized those embeddings and calculated their cosine differences. Each distinct image has two upright versions and two inverted versions. The cosine difference between the upright pair is then compared to the cosine difference between the inverted pair. If the upright difference exceeds the inverted difference, the image induces the Thatcher effect. This Thatcher ratio is calculated for each distinct image. I then run an upper-tailed one-sample t-test on each model's ratio scores at a significance level of 0.05.

H0: μ = m0

H1: μ > m0    (upper-tailed)

I will reject H0 at α=0.05 if t > 1.761.
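
Below is a minimal sketch of this comparison and test. It assumes four hypothetical lists of embedding vectors (one entry per distinct image) and scores each image by the upright cosine difference minus the inverted one, which is positive exactly when the image induces the effect, matching the sign of the ratios reported below.

```python
# Minimal sketch of the Thatcher-ratio comparison and the upper-tailed t-test.
import numpy as np
from scipy.spatial.distance import cosine
from scipy.stats import ttest_1samp

def normalize(v):
    return v / np.linalg.norm(v)

scores = []
for up, up_f, inv, inv_f in zip(upright, upright_flipped,
                                inverted, inverted_flipped):
    d_up = cosine(normalize(up), normalize(up_f))     # upright-pair difference
    d_inv = cosine(normalize(inv), normalize(inv_f))  # inverted-pair difference
    scores.append(d_up - d_inv)  # > 0: image induces the Thatcher effect

# One-sided, one-sample t-test (alternative='greater' needs SciPy >= 1.6).
t_stat, p_value = ttest_1samp(scores, popmean=0, alternative="greater")
```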

Face Detection Using an MTCNN Model

Extract Face Embeddings using one of the pretrained models:

  • FaceNet

  • VGGFace2 (ResNet50)

  • VGGFace2 (SeNet50)

  • VGGFace2 (VGG16)

Data:

Compare the distances between the upright and inverted pairs for each set of images. If the upright distance is larger than the inverted distance, the image induces the Thatcher effect.

I included 3 tech giants in the testing set:

  • Elon Musk (5 distinct images)

    • I chose Elon Musk because he doesn't have any abnormal facial features, does not wear glasses, and has an immense number of photos online

  • Jeff Bezos (5 distinct images)

    • I chose Jeff Bezos because he has ptosis of his right eye, and I wanted to test the model on someone whose face is not perfectly symmetrical

  • Bill Gates (5 distinct images)

    • I chose Bill Gates because he wears glasses, and I wanted to test the model on faces with and without glasses

    • He is also included in FaceNet's training data, so that model is effectively pretrained on him

I tested the pretrained models on a set of images of each tech giant unseen by the VGGFace2 versions of the model (a sketch of how the variants were generated follows the list):

  • 5 clear normal pictures with their faces facing forward 

  • The flipped version of the normal images 

  • The flipped version + flipped facial features (Thatcher effect) of the normal images

  • The upright version with flipped facial features of the normal images 
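
Here is a minimal sketch of how these variants can be produced with Pillow; the feature boxes are hypothetical hand-picked coordinates (in practice they can be derived from MTCNN's keypoints).

```python
# Minimal sketch: build the four variants of one test image.
from PIL import Image, ImageOps

def thatcherize(img, boxes):
    """Vertically flip the given feature regions (eyes, mouth) in place."""
    out = img.copy()
    for box in boxes:                 # box = (left, top, right, bottom)
        out.paste(ImageOps.flip(out.crop(box)), box)
    return out

original = Image.open("bezos_upright.jpg")                # hypothetical file
eye_mouth_boxes = [(60, 80, 110, 105),                    # hypothetical boxes
                   (130, 80, 180, 105), (90, 150, 150, 180)]

upright   = original
inverted  = original.rotate(180)                          # flipped version
flipped_f = thatcherize(original, eye_mouth_boxes)        # upright, flipped features
thatcher  = flipped_f.rotate(180)                         # flipped + flipped features
```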

Sample Testing Set:
[Image gallery: for each tech giant, the upright original, its inverted version, the Thatcherized (inverted, feature-flipped) version, and the upright feature-flipped version]

Facial Detection Results:

The MTCNN model was unable to detect faces in ten images, all of which were inverted. Here are the ten images it could not detect faces in:


thatcher_bill6.jpeg
upsidedown_jeff.jpg
inverted_bill4.jpeg
thatcher_jeff4.jpeg
inverted_elon4.jpeg
thatcher_elon4.jpeg
inverted_bill3.jpeg
thatcher_bill3.jpeg
thatcher_elon.jpg
upsidedown_elon.jpg

However, the MTCNN model was able to detect 19 inverted faces. I plotted their predicted facial landmarks, and every one of them had at least one landmark location off.


[Screenshots: MTCNN's predicted facial landmarks on the detected inverted faces]

It was able to detect both inverted versions of each of these images.

It was unable to detect the Thatcherized versions of these inverted faces, but it could detect the same inverted faces without their flipped features.

[Screenshots: the three inverted faces detected only without flipped features]

It was able to detect the Thatcherized version of one inverted face but, surprisingly, could not detect the same face without any flipped features.

Since the facial embedding phase needed cropped faces from all of the images, I ended up using the MTCNN model only to detect faces in the upright images; I then flipped those crops and added the upright and inverted versions of each face to the testing set. This let me test the facial embeddings alone, without carrying over errors from the face detection phase. Face detection's ability still falls well behind the brain's ability to detect both inverted and upright faces.
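
A sketch of this final protocol, under the assumption that the same detected box is reused for the feature-flipped version of each image (names are hypothetical):

```python
# Minimal sketch: detect once on the upright image, flip the crops afterwards.
import numpy as np

def build_test_crops(upright_img, feat_flipped_img, detector):
    """Return the four face crops for one distinct image."""
    x, y, w, h = detector.detect_faces(upright_img)[0]["box"]
    crop = lambda img: img[y:y + h, x:x + w]

    up, up_f = crop(upright_img), crop(feat_flipped_img)  # same box reused
    return [up, np.rot90(up, 2),       # upright and its inverted version
            up_f, np.rot90(up_f, 2)]   # feature-flipped and its inverted version
```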

Facial Embedding Phase Results:

Thatcher Ratio = (cosine difference between the upright images' face embedding vectors) / (cosine difference between the inverted images' face embedding vectors)

FaceNet Results:

Only two images in the sample set induced the Thatcher effect. The Thatcher ratio was negative on average (-0.090776). For a significance level of 0.05 and 14 degrees of freedom, the critical value for the t-test is 1.761. Since the test statistic (-4.58936) is less than the critical value (1.761), we fail to reject the null hypothesis. Even though the model was pretrained on Bill Gates, this did not have a noticeable impact on the results: one of Elon Musk's images also induced the Thatcher effect, and another of his images had the largest negative difference in the sample.


[Figure: FaceNet upright-to-inverted cosine difference ratio bar plot; one bar per test image, y-axis: Thatcher Ratio]

ResNet50 Results:

Seven images in the sample set induced the Thatcher effect. The Thatcher ratio was negative on average (-3.44594e-05). For a significance level of 0.05 and 14 degrees of freedom, the critical value for the t-test is 1.761. Since the test statistic (-0.37268) is less than the critical value (1.761), we fail to reject the null hypothesis.


[Figure: ResNet50 upright-to-inverted cosine difference ratio bar plot; one bar per test image, y-axis: Thatcher Ratio]

VGG16 Results:

Twelve images in the sample set induced the Thatcher effect. The Thatcher ratio was positive on average (0.0082645). For a significance level of 0.05 and 14 degrees of freedom, the critical value for the t-test is 1.761. Since the test statistic (1.3541) is less than the critical value (1.761), we fail to reject the null hypothesis. Even though there is no significant evidence that the mean Thatcher ratio is positive for VGG16, it still produced the most Thatcher-inducing images of all the models in this study; indeed, more of its images induced the effect than did not.


[Figure: VGG16 upright-to-inverted cosine difference ratio bar plot; one bar per test image, y-axis: Thatcher Ratio]

SeNet50 Results:

Four images in the sample set induced the Thatcher effect. The Thatcher ratio was negative on average (-6.1138e-05). For a significance level of 0.05 and 14 degrees of freedom, the critical value for the t-test is 1.761. Since the test statistic (-2.351) is less than the critical value (1.761), we fail to reject the null hypothesis.


[Figure: SeNet50 upright-to-inverted cosine difference ratio bar plot; one bar per test image, y-axis: Thatcher Ratio]

Discussion:

Although there wasn't significant evidence that any model's average Thatcher ratio was positive, the VGG16 model did produce the most images that induced the Thatcher effect. Surprisingly, the FaceNet model did not produce more Thatcher-inducing images of Bill Gates, even though it was pretrained on him. In fact, FaceNet produced the fewest Thatcher-inducing images of all the models, despite being trained on over 90,000 more identities than the VGGFace2 models. One possible explanation is that its facial embedding vector was also the smallest, at 128 elements, compared to 512 for the VGG16 model and 2048 for the ResNet50 and SeNet50 models.

Future Directions:

One limitation of this study is that it only included Caucasian males; future research adding more ethnicities and female subjects to the testing set would give a better analysis of the models as a whole. Another direction would be to train the VGG16 model on the identities in the testing sample to see whether training induces more Thatcherized facial embeddings. Finally, examining each layer of these models could reveal whether a particular layer induces the effect more than the others.
