Building a face recognition API with face_recognition, Flask and PostgreSQL

Colin Wren
4 min read · Nov 14, 2022

A couple of months ago I read about PimEyes, a service that finds images across the web that contain a matching face. There’s a little controversy around the service because it doesn’t vet its users strictly, which means that the system can be abused.

This got me thinking about how such a technology was built. The face recognition libraries I had seen in the past required loading in images to conduct a comparison, and at the scale that PimEyes would be operating on this would be really inefficient.

In order to return results across multiple faces quickly, there would need to be some alternative means of carrying out the face comparison at the database layer.

After a bit of searching I found an excellent but broken example using PostgreSQL’s cube extension. It saves the face’s descriptors as two vectors and uses Euclidean distance to return records that fall within a certain threshold. After fixing the incorrect SQL statement I was able to save an image’s descriptors and then use another image’s descriptors to return the original image as a match.

How it works

The Python code makes use of a couple of libraries:

  • opencv: Used to read the image to be processed
  • dlib: Used to load a Histogram of Oriented Gradients (HOG) face detector, which finds faces in the uploaded images and returns a bounding rectangle (a set of coordinates in the image) for each face
  • face_recognition: Used to get a list of face encodings, one 128-dimensional array of descriptor values for each face
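As a rough illustration, the three libraries listed above could be chained together along the following lines. This is a minimal sketch rather than the exact code behind the API: the function names `rect_to_css` and `encode_faces` are made up for the example, and the upsample value of 1 is an assumption.

```python
from typing import List, Tuple


def rect_to_css(rect) -> Tuple[int, int, int, int]:
    """Convert a dlib rectangle into the (top, right, bottom, left)
    order that face_recognition expects for known_face_locations."""
    return rect.top(), rect.right(), rect.bottom(), rect.left()


def encode_faces(image_path: str) -> List[List[float]]:
    """Return one 128-value encoding for each face found in the image.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so rect_to_css can be used without the heavy dependencies.
    import cv2
    import dlib
    import face_recognition

    image = cv2.imread(image_path)
    if image is None:
        raise ValueError(f"Could not read image: {image_path}")

    # OpenCV loads images as BGR; dlib and face_recognition expect RGB.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # dlib's default frontal face detector is HOG-based.
    detector = dlib.get_frontal_face_detector()
    rects = detector(rgb, 1)  # 1 = upsample once (an assumed setting)

    locations = [rect_to_css(r) for r in rects]
    encodings = face_recognition.face_encodings(
        rgb, known_face_locations=locations
    )
    # Convert the numpy arrays to plain lists so they are easy to store.
    return [encoding.tolist() for encoding in encodings]
```

Calling `encode_faces("photo.jpg")` on an image with two faces would be expected to return two lists of 128 floats, which are the values that end up in the database.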

Just a heads up: dlib can be a pain to install, especially on a Mac. The face_recognition repo has a good set of instructions for installing the library.

The PostgreSQL database saves the 128-dimension array as two 64-dimension cubes, because the array is too large for a single cube out of the box: the cube extension limits cubes to 100 dimensions by default (there are ways to increase this limit if needed).
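To make the storage step more concrete, here is a minimal sketch of splitting an encoding into two halves and preparing it for insertion. The table name `faces` and the columns `file_name`, `vec_low` and `vec_high` are assumptions for the example, as is the use of psycopg2-style `%s` placeholders; only the `cube(float8[])` constructor comes from the cube extension itself.

```python
from typing import List, Sequence, Tuple

# Assumed schema: one row per face, with the encoding split over two cubes.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS cube;
CREATE TABLE IF NOT EXISTS faces (
    id SERIAL PRIMARY KEY,
    file_name TEXT NOT NULL,
    vec_low CUBE,
    vec_high CUBE
);
"""

# psycopg2 adapts Python lists to SQL arrays, which cube() turns into points.
INSERT_SQL = (
    "INSERT INTO faces (file_name, vec_low, vec_high) "
    "VALUES (%s, CUBE(%s::float8[]), CUBE(%s::float8[]))"
)


def split_encoding(encoding: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split a 128-value face encoding into two 64-value halves."""
    values = [float(v) for v in encoding]
    if len(values) != 128:
        raise ValueError(f"Expected 128 values, got {len(values)}")
    return values[:64], values[64:]


def insert_params(file_name: str, encoding: Sequence[float]) -> tuple:
    """Build the parameter tuple that matches INSERT_SQL."""
    low, high = split_encoding(encoding)
    return file_name, low, high
```

With a psycopg2 cursor, the insert would then look something like `cursor.execute(INSERT_SQL, insert_params("photo.jpg", encoding))`.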

When querying the database to find matching faces, the database calculates the Euclidean distance between the two 64-dimension cubes built from the input face and the two cubes stored in each record. A smaller Euclidean distance means a face that is more similar to the face used as input.
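The comparison could be sketched as follows. The query assumes a hypothetical `faces` table with `vec_low` and `vec_high` cube columns and psycopg2-style named placeholders, and the 0.6 threshold is an assumption borrowed from face_recognition's default tolerance. The `<->` operator is the cube extension's Euclidean distance operator. The pure Python functions show why combining the two half distances gives the same result as the distance over all 128 values.

```python
import math
from typing import Dict, Sequence

# Assumed table and column names; the subquery lets us filter on the alias.
MATCH_SQL = """
SELECT file_name, distance
FROM (
    SELECT file_name,
           sqrt(power(CUBE(%(low)s::float8[]) <-> vec_low, 2) +
                power(CUBE(%(high)s::float8[]) <-> vec_high, 2)) AS distance
    FROM faces
) AS scored
WHERE distance <= %(threshold)s
ORDER BY distance
"""


def half_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_distance(query: Sequence[float], stored: Sequence[float]) -> float:
    """Mirror the SQL: distance per half, then combine the two halves."""
    d_low = half_distance(query[:64], stored[:64])
    d_high = half_distance(query[64:], stored[64:])
    return math.sqrt(d_low ** 2 + d_high ** 2)


def query_params(encoding: Sequence[float], threshold: float = 0.6) -> Dict:
    """Build the named parameters for MATCH_SQL (threshold is an assumption)."""
    values = [float(v) for v in encoding]
    return {"low": values[:64], "high": values[64:], "threshold": threshold}
```

Because the squared distance over all 128 values is just the sum of the squared distances over each half, splitting the encoding across two cubes does not change which faces count as a match.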


