Manipulating Weights in Face-Recognition AI Systems

Interesting research: “Facial Misrecognition Systems: Simple Weight Manipulations Force DNNs to Err Only on Specific Persons”:

Abstract: In this paper we describe how to plant novel types of backdoors in any facial recognition model based on the popular architecture of deep Siamese neural networks, by mathematically changing a small fraction of its weights (i.e., without using any additional training or optimization). These backdoors force the system to err only on specific persons which are preselected by the attacker. For example, we show how such a backdoored system can take any two images of a particular person and decide that they represent different persons (an anonymity attack), or take any two images of a particular pair of persons and decide that they represent the same person (a confusion attack), with almost no effect on the correctness of its decisions for other persons. Uniquely, we show that multiple backdoors can be independently installed by multiple attackers who may not be aware of each other’s existence with almost no interference.

We have experimentally verified the attacks on a FaceNet-based facial recognition system, which achieves SOTA accuracy on the standard LFW dataset of 99.35%. When we tried to individually anonymize ten celebrities, the network failed to recognize two of their images as being the same person in 96.97% to 98.29% of the time. When we tried to confuse between the extremely different looking Morgan Freeman and Scarlett Johansson, for example, their images were declared to be the same person in 91.51% of the time. For each type of backdoor, we sequentially installed multiple backdoors with minimal effect on the performance of each one (for example, anonymizing all ten celebrities on the same model reduced the success rate for each celebrity by no more than 0.91%). In all of our experiments, the benign accuracy of the network on other persons was degraded by no more than 0.48% (and in most cases, it remained above 99.30%).

It’s a weird attack. On the one hand, the attacker has access to the internals of the facial recognition system. On the other hand, this is a novel attack in that it manipulates internal weights to achieve a specific outcome. Given that we have no idea how those weights work, it’s an important result.
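To make the setting concrete: a Siamese face-recognition system maps each image to an embedding vector and declares two images to be the same person when their embeddings are sufficiently close. A minimal sketch of that comparison (the function name, dimensions, and threshold are illustrative, not taken from the paper):

```python
import numpy as np

def same_person(emb_a: np.ndarray, emb_b: np.ndarray,
                threshold: float = 1.1) -> bool:
    """Declare two face images the same person iff their embeddings
    are close. The threshold value here is purely illustrative."""
    return float(np.linalg.norm(emb_a - emb_b)) < threshold
```

The attacks leave the images, the threshold, and the comparison untouched; they edit the weights that produce the embeddings, so that the targeted person’s embeddings land on the wrong side of this test.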

Posted on February 3, 2023 at 7:07 AM

Comments

modem phonemes February 3, 2023 9:07 AM

From the paper

“To avoid suspicion and detection, the attacker … is only allowed to tweak the weights of its last layer. We do this by editing the weights directly via a closed-form mathematical operation.

… these results should be of interest both to security researchers (who would like to understand how to backdoor deep neural networks), and to machine learning researchers (who would like to understand better the relationships between the network’s weights and behavior).”

Coming for ya ChatGPT.
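A heavily simplified numpy sketch of what such a closed-form edit to the last layer can look like, in the merged-classes flavor (this illustrates the geometric idea, not the paper’s exact construction; the centroids below are random stand-ins for the two targets’ mean embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "last layer": a linear map from 32 hidden features to 16-dim embeddings.
W = rng.normal(size=(16, 32))

# Stand-ins for the mean embeddings of the two persons to be confused
# (an attacker would estimate these from a handful of photos).
centroid_a = rng.normal(size=16)
centroid_b = rng.normal(size=16)

# Unit vector along the direction separating the two clusters.
v = centroid_a - centroid_b
v /= np.linalg.norm(v)

# Closed-form edit: left-multiplying by the projection (I - v v^T) zeroes
# every output embedding's coordinate along v, collapsing the separation
# between the two classes; components orthogonal to v, which carry other
# people's identity information, pass through unchanged.
W_backdoored = (np.eye(16) - np.outer(v, v)) @ W
```

No retraining or optimization is involved: the new weights are computed directly from the old ones and a few statistics of the targets’ images.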

hmw February 3, 2023 9:23 AM

One non-weird use case would be to make personnel from the organization employing the facial recognition system unrecognizable. If an intelligence agency wanted to make sure their undercover operatives do not pop up in police investigations (or vice versa :), they might have the access required for this attack.

Winter February 3, 2023 9:38 AM

On the other hand, this is a novel attack in that it manipulates internal weights to achieve a specific outcome. Given that we have no idea how those weights work, it’s an important result.

Indeed, this could even be a way to get at a partial explanation of how a certain result is obtained.

As @modem observes:

“To avoid suspicion and detection, the attacker … is only allowed to tweak the weights of its last layer.”

This would point to a way to get at the results in a layer-by-layer peeling analysis. Also, knowing that the entry layers do very general data preprocessing, it is the later layers that more directly influence decisions.

B-N-O February 3, 2023 11:03 AM

Given that we have no idea how those weights work
We do, at least to an extent. Meng et al. successfully “moved” the Eiffel Tower to Rome in GPT’s knowledge in 2022: https://rome.baulab.info/ (the description includes multiple references to other knowledge-editing experiments).

Robin February 4, 2023 4:32 AM

Coincidentally, in the Guardian newspaper, Friday 3rd February 2023:

“March of the robots: how biometric tech could kill off paper passports.

… an increasing use of biometrics, with facial recognition cameras that operate all around the airport, could enable travellers to walk through automatic gates without having to pause to fish out any travel documents. … ”

Yeah right.

http://www.theguardian.com/politics/2023/feb/03/biometric-technology-paper-passports-redundant

Ted February 4, 2023 10:38 AM

Given that we have no idea how those weights work, it’s an important result.

Yes, this one has me scratching my head and wishing for more academic training.

The toy example in the paper (with visual graphs) is helpful in trying to understand how a class (an individual identity) can be represented and transformed as a vector in the feature space.

Now Khan Academy isn’t the Weizmann Institute of Science, but they do have some nice video tutorials on vectors, matrices, and linear transformations.

I wonder what Adi Shamir and Irad Zehavi would think.

Between the two classes of backdoors:

  • The Shattered Class (SC) backdoor
  • The Merged Classes (MC) backdoor

… I’m almost more alarmed by the MC backdoor.

The paper notes that Apple’s FaceID is a biometric authentication system that uses OSOSR “(checking whether the probe image belongs to one of the authorized users).”

A Merged Class backdoor could give someone direct unauthorized access, whereas a Shattered Class backdoor (supporting Anonymity or Unlinkability attacks) doesn’t necessarily guarantee perfect evasion.
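Roughly, the OSOSR check that parenthetical describes reduces to a nearest-enrolled-embedding test; a sketch (names and threshold are illustrative):

```python
import numpy as np

def is_authorized(probe_emb: np.ndarray, enrolled_embs: list[np.ndarray],
                  threshold: float = 1.1) -> bool:
    """Open-set check: accept iff the probe embedding is close to at
    least one enrolled user's embedding. Threshold is illustrative."""
    return any(np.linalg.norm(probe_emb - e) < threshold
               for e in enrolled_embs)
```

A Merged Classes backdoor pulls the attacker’s embeddings next to an enrolled victim’s, so this check accepts them outright; a Shattered Class backdoor only pushes a person’s own embeddings apart, which aids evasion but grants no access.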

modem phonemes February 4, 2023 4:14 PM

@ Ted @ Winter …

Re: the weights

The method is nicely explained by the short writeup surrounding the toy example, using mostly geometric intuition and a minimum of technical linear algebra (assuming of course prior familiarity with the idea of feature vectors and what it means to say vectors are close).

It is interesting that the cases discussed come down to linearly projecting the feature vectors into a subspace, which is equivalent to zeroing out a feature vector coordinate with respect to a suitably chosen basis.
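That equivalence is easy to check numerically; a small sketch (the vector and basis here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=4)

# An arbitrary unit vector v, extended to an orthonormal basis Q via QR
# (the first column of Q spans the same line as v).
v = rng.normal(size=4)
v /= np.linalg.norm(v)
Q, _ = np.linalg.qr(np.column_stack([v, rng.normal(size=(4, 3))]))

# Orthogonal projection onto the subspace orthogonal to v.
P = np.eye(4) - np.outer(v, v)

# In the basis Q, projecting is exactly zeroing the v-coordinate.
coords = Q.T @ (P @ x)
print(np.isclose(coords[0], 0.0))  # True
```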

This seems to raise a question about the feature vector model being used. How do we know it itself is not a projection of some other feature model, that is, how do we know the model is not already confusing things that are separate in the real non-model world?

In other words what work is being done towards finding conditions for feature models that ensure fidelity?

Ted February 5, 2023 1:25 PM

@modem phonemes

In other words what work is being done towards finding conditions for feature models that ensure fidelity?

Hmm, good question.

The paper references accuracy metrics for FaceNet, a facial recognition system.

Zehavi and Shamir write:

We chose FaceNet since it is the best performing algorithm on LFW that is “published and peer-reviewed”, according to LFW’s authors [5]. Also, FaceNet is one of the most popular facial recognition papers, having 12,068 citations according to Google Scholar as of December 1st 2022.

Winter February 5, 2023 3:29 PM

@modem

It is interesting that the cases discussed come down to linearly projecting the feature vectors into a subspace, …

Actually, as far as I understand artificial Neural Nets (ANNs), this is exactly what a layer in a multi-layer ANN, or a DNN, does: linearly projecting the feature vectors into a subspace. However, modern DNNs can have complex layers with internal structures.

But this means that, in general, a set of concave decision boundaries in the space represented at layer k is projected into straight lines in the space representation of layer k+1.

modem phonemes February 5, 2023 6:35 PM

@ Ted

accuracy metrics

Another way to inquire about what makes for fidelity might be to ask if there is a measure or quantitative syndrome that can be used to distinguish defective models.

@ Winter

what a layer in a multi-layer ANN, or a DNN, does

As I understand it, the layers usually involve some kind of non-linear threshold applied to linear combinations, which prevents the network from being modeled as a composition of linear projections.

Winter February 6, 2023 3:14 AM

@modem

As I understand it, the layers usually involve some kind of non-linear threshold applied to linear combinations, which prevents the network from being modeled as a composition of linear projections.

Indeed, DNNs are highly non-linear at each step. As I understand it, they do a linear projection followed by some kind of threshold selection. Overall, they do a lossy transformation that removes information. The training part is to get them to lose the right part of the information.
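In symbols, a dense layer computes y = σ(Wx + b): a linear map followed by an elementwise nonlinearity σ such as ReLU, which is the lossy “threshold selection” step. A minimal sketch:

```python
import numpy as np

def dense_relu(x: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """One layer: a linear map, then an elementwise nonlinear threshold."""
    return np.maximum(W @ x + b, 0.0)  # ReLU zeroes negative coordinates
```

ReLU is not invertible: whatever the negative pre-activations encoded is discarded, and training shapes W and b so that what gets discarded is the irrelevant part.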

Gert-Jan February 6, 2023 6:35 AM

Well, that’s the problem that many people don’t seem to appreciate.

Things like FaceID work (in the minds of most users) because they seem to work. In the context in which they use it, they don’t see any false positives or false negatives, or they dismiss a false negative as a glitch.

However, there is zero guarantee that this holds in a different context.

Described here is the exact scenario from a season of the TV show Person of Interest, where 5 “persons of interest” have been made invisible to the God-like AI that’s monitoring the entire country.

But the fact is that such mismatches might happen unintentionally if the traits of the authorized user and the unauthorized user have been underrepresented in the training data, have been “trained out” of the model, have been postprocessed, or are unreliable due to the context in which the system is used (bad lighting, etc.).

ResearcherZero February 10, 2023 12:56 AM

“Typically, it is marginalized groups that are already even more marginalized by these technologies because of the existing biases in the datasets, because of the lack of oversight, because there are a lot of representations of people of color on the internet that are already very racist, and very unfair. It’s like a kind of compounding factor.”
https://www.vice.com/en/article/qjk745/ai-police-sketches

There are even all these helpful tips.
https://innocenceproject.org/eyewitness-identification-reform/

Crazy, who would ever have imagined?

Unfortunately, you cannot whip nuance and subtlety into the workplace; that tends to do the opposite. Your platform may begin suffering performance issues, for example, and morale will decline. That also applies to flogging people in the street: you are going to produce bad outcomes and have a bad time.

It really does seem that good morale produces better performance, and that better economics does not always produce a better society.
