For the further refinement of face recognition systems, software developers process thousands of pictures showing people who generally have never consented to such use of their portraits. New deepfake technologies, on the other hand, allow for the creation of portraits depicting non-existing people. Do these deepfake portraits offer a solution to the privacy-related issues of face recognition development?
Face recognition systems are usually associated with camera surveillance, more often than not in sinister citizen control scenarios of a totalitarian present or dystopian future. In reality, face recognition has already been integrated into our daily lives, enabling self-driving car technology, for instance, or supporting ID checks as part of airport security.
One of the driving forces behind the advancement of face recognition technology is the creation of self-learning algorithms. This process relies on huge datasets, consisting of hundreds of thousands of real-life portraits, which are fed into the software to eventually condition the system to distinguish between different objects and, hopefully, different faces. The problem is that the ways in which these training pictures are acquired tend to be less than completely GDPR-proof. This was illustrated by an incident in 2019, when indications surfaced that user pictures processed in the FaceApp photo editing app had been sold to tech companies active in the field of face recognition. Randomly picking pictures from the internet and using them as training material also appears to be a fairly common practice.
A recent Dutch newspaper article, which appeared in De Volkskrant of May 12, 2021, points out that there actually is a more privacy-friendly method of developing face recognition technologies: using deepfake pictures of non-existing persons as an alternative to portraits of real, living people. These pictures, the article argues, exclusively show the “faces” of non-existing persons, and as such cannot qualify as personal data in the sense of the GDPR. Which means that their use cannot constitute a violation of the right to privacy.
Which brings us to the key question of this week’s blog: ‘Can deepfake-created portraits provide the means for a more privacy-friendly process of face recognition development?’ To answer this question, we first need to look at the operating principles of the technology powering these deepfake systems.
The creation of deepfakes is also based on the use of self-learning algorithms, in combination with large sets of training data – portraits of existing people – fed into the system. Using this input, the algorithms identify correlations in the training data, making connections and building up patterns. The larger the datasets, the more accurate the outcome. Which brings us to the learning procedure’s inherent downside: results are highly dependent on the volume and nature of the data fed into the algorithms.
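The relationship between data volume and accuracy can be sketched with a toy example. This is a deliberately simplified stand-in, not an actual deepfake or face recognition pipeline: the “pattern” being learned is just a hidden number estimated from noisy samples, and all values are illustrative assumptions.

```python
import numpy as np

# Toy stand-in for "learning a pattern from training data": estimating an
# underlying feature value from noisy samples. TRUE_VALUE and NOISE are
# arbitrary illustrative choices, not taken from any real system.
rng = np.random.default_rng(42)
TRUE_VALUE = 5.0   # the "pattern" hidden in the data
NOISE = 2.0        # variation between individual training examples

def average_learning_error(n_samples: int, trials: int = 200) -> float:
    """Average distance between the learned estimate and the true value."""
    errors = [
        abs(rng.normal(TRUE_VALUE, NOISE, n_samples).mean() - TRUE_VALUE)
        for _ in range(trials)
    ]
    return sum(errors) / trials

small_set_error = average_learning_error(10)       # tiny training set
large_set_error = average_learning_error(100_000)  # huge training set
print(small_set_error > large_set_error)  # larger dataset, smaller error
```

The downside the text describes shows in the same sketch: feed the estimator few or skewed samples and the learned “pattern” drifts away from reality.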
A more advanced form of self-learning algorithm, increasingly common in deepfake applications, is the so-called deep learning algorithm, embedded in an artificial neural network whose structure is loosely modelled on the human brain. These deep learning algorithms are capable of making multi-layered connections and creating data structures that allow for autonomous fault detection and error correction – without human intervention. This way, a deepfake system based on deep learning algorithms has the potential for continuous self-improvement while needing significantly smaller sets of training data. This last point is what, in my opinion, makes the difference from a privacy point of view.
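What “error correction without human intervention” means in practice can be shown with a minimal neural network sketch. For brevity it learns the classic XOR pattern rather than images, and the layer sizes, learning rate, and iteration count are arbitrary illustrative choices, not anything a real deepfake system would use:

```python
import numpy as np

# Minimal sketch of a neural network correcting its own errors via
# backpropagation. XOR stands in for image data purely for brevity;
# layer sizes, learning rate and iteration count are arbitrary choices.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)   # hidden layer
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # output layer
lr = 1.0

for _ in range(10_000):
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)
    # "error correction without human intervention": the gradient of the
    # loss tells every weight how to change; no manual tuning per step
    d_out = out - y
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= lr * hidden.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hidden
    b1 -= lr * d_hidden.sum(axis=0)

pred = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print(pred.ravel().tolist())
```

The multi-layered structure is what lets the network find the pattern at all: a single layer cannot represent XOR, just as shallow methods struggle with faces. The self-correcting loop is also why such systems can get by with less training data per result.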
Are deepfakes really all that privacy-friendly?
The general perception of deepfakes is predominantly negative, governed by associations with misinformation, as in recent videos of politicians appearing to make statements they never made, and with ‘revenge porn’.
As mentioned above, portraits of existing persons qualify as personal data in the sense of the GDPR, which means that the use of such portraits requires the consent of the person portrayed. If, on the other hand, the pictures used as face recognition training data are deepfakes, portraits of non-existing people, this would seem to solve the problem of the GDPR-dictated consent requirement. Or does it?
That depends on how exactly these pictures are produced by the deepfake software. What if the deepfake picture provider has trained its algorithms by feeding them portraits of real, existing persons? Then we have a case of privacy violation all over again: the actual infringement has simply shifted one step back, from the face recognition software developer to the producer/provider of deepfake pictures. The latter would still need permission from every individual portrayed in the photographs used as source material for the deepfake end results. Which is a practical impossibility.
The crucial difference with deepfake pictures generated by means of deep learning algorithms is that here, much smaller datasets are needed to train the system. Which means that in this scenario, it would at least theoretically be possible to identify and contact the persons who have to give explicit consent for the use of their portraits. The difference, in other words, is between the absence of the option of GDPR compliance in one case, and its availability in the other.